kbarbary / sep

Python and C library for source extraction and photometry
183 stars 57 forks source link

sep_version_string mismatch #111

Closed RReverser closed 3 years ago

RReverser commented 3 years ago

It looks like sep_version_string wasn't updated in a while (5 years, to be more precise), even though there has been a bunch of C library changes since.

Is it still meant to be kept up-to-date? Github shows version 1.1.1, which it takes from Python wrapper releases - maybe sep_version_string should just jump all the way to that number too to keep those in sync?

Alternatively, should it just get updated to 0.7.0 (perhaps after outstanding PRs are merged)?

Either approach would be fine, just wanted to bring this up to make sure that the version gets updated between changes - otherwise it makes hard to depend on it in 3rd-party apps / libs.

RReverser commented 3 years ago

FWIW for the published Rust bindings I've used the Github version - 1.1.1 - but it was because I didn't realize at the time that C part of the repo is versioned separately.

RReverser commented 3 years ago

@kbarbary Would also love your thoughts here too - at least the changes from #109 will require a version bump, would love to understand what it would bump to :)

kbarbary commented 3 years ago

I think the original intent was to keep the two versions in sync - it's an oversight that they're out of sync. However, this does make the C library version number less meaningful in the semantic versioning sense - a breaking change or bugfix to the Python API is not necessarily the same for the C library. I wasn't sure how much use the C API was going to get, so I didn't want to prematurely overcomplicate things maintaining two version numbers. We can definitely revisit that if needed, but for now it's probably easiest to update the versions to be the same in #109 or a separate PR. I think that version would be 1.2.0.

RReverser commented 3 years ago

Sounds reasonable to me. I guess the easiest is if you bump both versions to be in sync during the next Python release.

mgreter commented 3 years ago

IMO it is very reasonable to have different versions for C-API (ABI) and the actual software version(s), it's what we e.g. have done in LibSass since a few years. I would really like to see SEP being a true shared library, but I honestly also don't have enough time to get another side project on my plate :-/ But I hope you don't mind if I share some insights from my work on LibSass.

To be a true shared library (e.g. installed as /usr/lib/libsep.so) you probably want to slice the current repo in three pieces. The actual SEP library, a well defined C-API (with a separate ABI version) and the python bindings. If you version the ABI this way, you can just update the shared library on your system and all implementors/bindings will just pickup the latest version. But when you bump the major version of the ABI every now and then, all consumers need to be recompiled too.

On another side-note, if SEP ever goes the route of being a true shared library, I would suggest to only expose functions on the C-API and return pointers to anonymous structs, which consumers then may pass to other functions. The less you expose to the consumer, the easier it is to change implementation details without breaking binary contract (you may expose certain structs directly on the C-API headers, but IMO even python discouraged this in favor of functional APIs). You can easily add new functions without a major ABI bump needed, but you can't e.g. easily change existing function parameters (this would need a major ABI version bump).

Ultimately to solve the threading issue you probably want to e.g. have a C-API workflow like (pseudo code only):

struct SepEngine* sep = sep_create_engine();
sep_set_option_xy(sep, foobar);
struct SepImg* img = sep_create_img(...);
struct SepBkg* bkg = sep_create_bkg(...);
int status= sep_extract_bkg(sep, img, ..., bkg);
status = sep_bkg_subarray(sep, bkg, img);
struct SepCatalog* catalog = sep_create_catalog(...);
status = sep_extract(sep, img, bkg, catalog);
for (int i=0; i < sep_catalog_size(catalog); i++) {
  double x = sep_catalog_get_x(catalog, i);
  double y = sep_catalog_get_y(catalog, i);
  float flux = sep_catalog_get_flux(catalog, i);
}
sep_free_bkg(bkg);
sep_free_img(img);
sep_free_catalog(catalog);
sep_free_engine(sep);

You ideally want to have those static structs (the ones @RReverser changed to thread_local in the other PR, BTW. props for that) to be attached to struct SepEngine*. This way the consumer can bring up as many instances/engines as he likes, you just need to pass around that pointer for each and every function call that needs to access those shared structures. I know this is not the easiest task, I actually tried to do the changes before for SEP, but failed. The thread_local approach probably works for multi-threading, but be aware they have their own pitfalls. They have similar initialization order issues as other statics (as I found out the hard-way when implementing a thread-local custom memory allocator). IMO as long as you only use PODs you should be on the safe side. Also using thread_local will probably only compile with gcc ~4.7+ (you may get away with gcc 4.4 and c++0x or gnu++0x flag; but it certainly has implications for compilation really old systems).

Again, hope you don't mind that I hijacked this thread a bit, but felt encouraged by the progress @RReverser made, to at least document what my finding and thoughts were on my failed approach to progress in this direction, in the hope this project may profit a little bit off of it in the future. I might even give it another try, if my plate ever gets a bit emptier 😃

Thanks and keep up the good work!

RReverser commented 3 years ago

You ideally want to have those static structs (the ones @RReverser changed to thread_local in the other PR, BTW. props for that) to be attached to struct SepEngine*.

Sure, I agree that would be even better, but I've been trying to make sure I'm keeping backward-compatibility with existing API consumers. Also, making those changes requires even more care not to break anything in the process, so prefer to make progress by making one safe-ish step at a time :) I have some ideas for few more low-hanging fruits, but want that PR to be merged first.

Either way, as you said, threading issues are not entirely relevant to this particular thread.

RReverser commented 3 years ago

Also on:

Also using thread_local will probably only compile with gcc ~4.7+ (you may get away with gcc 4.4 and c++0x or gnu++0x flag; but it certainly has implications for compilation really old systems).

I did describe we need C11 compatible compiler. Note that it's a C library not C++, so c++0x flags as well as mentioned ordering issues from C++ reference don't apply - it's all PODs :)

kbarbary commented 3 years ago

I believe the version string mismatch is fixed now!