Closed kbarbary closed 5 years ago
:+1: to having C code for core numerical algorithms where we get an order of magnitude speedup and more readable code.
We already have Cython extensions, so Cython-wrapped C doesn't make photutils
harder to build and there's enough Astropy-devs that know C, so maintaining that code should not be an issue.
The main issues I see with sep
is that at the moment it's GPL licensed ... you'd have to change to BSD before we can use it in photutils
.
@kbarbary You mention you'd like to keep sep
as a separate repo, as a C library that can be used in other places and I agree that's the right way to do it.
But it's not clear to me what the best approach is for photutils
:
sep
as an external (optional?) dependency?sep
as a git submodule or via including a copy in cextern
?I think maybe bundling it in photutils/cextern
might be the best option because it's the most convenient for the end-user.
@kbarbary @astrofrog @eteq @bsipocz What do you think?
It looks like a lot of the C code came from SExtractor
, i.e. sep
is GPL because SExtractor
is GPL. We'd likely need to write new C code from scratch (or convince SExtractor
to go to BSD).
SExtractor
just changed from CeCILL
to GPL
recently ... I doubt they'd want to change license again.
But I think the functions @kbarbary would like to implement in C for photutils are very simple and not hard to write from scratch ... @kbarbary do you have interest to do this in photutils
(or another external BSD-licensed package)?
+1 to developing a small core C library that photutils could then provide a higher-level interface to. If we can develop the high-level interface and make sure it is well tested, then it should be easy to substitute out functions later with calls to this C library. This means that we should try and make sure we separate out the 'geometrical' code in photutils from the high-level API (which I've been advocating anyway).
Great to hear that there's some interest in this! There's really two separate issues regardling licenses and libraries that I've unfortunately compounded here:
For aperture photometry, it is simple enough that not a lot is gained by borrowing from SExtractor: we can write out own (small) library and license it however we want.
The more complicated issue is if we want to borrow the source extraction and background subtraction code from SExtractor. These are not simple to recreate (particularly source extraction), and I think they would be valuable to have in a library, which is why I started SEP. They're optimized and memory efficient. The license is a problem and @larrybradley is correct that SEP is GPL licensed because it is a direct adaptation of SExtractor. I didn't know that SExtractor recently relicensed (thanks for the info, @cdeil ) but I don't think we would have to ask SExtractor to relicense again. According to AUTHORS
in SExtractor, Emmanuel Bertin is the only author. We would only need to ask him if we can relicense the specific parts of the code that have been adapted. He may be amenable to at least a LGPL license, which I think would be OK for us?
For my part, I'd be interested in starting and maintaining the C library, hopefully with the advice of others who are C experts - I'm not.
I'm not sure LGPL
would be OK for an "upstream" package that photutils
calls into.
This diagram says it's not, but there are other resources on the web that says they are compatible.
A package with the same BSD license as Astropy and many other Astro packages would be best ... @kbarbary if you contact Emmanuel Bertin, why not ask for that first?
Starting a C library with SExtractor functionality like sep
is a big project and should be maintained for the next decade if we base the future astropy.photometry
on it ... maybe it would be better to implement an option to the SExtractor repo itself that creates a shared library instead of only producing object files and a binary?
Although we still wouldn't be able to call it from photutils
as long as it's GPL licensed. Or @kbarbary, did you mean starting a C library with just aperture photometry functions independenly of sep
?
For the aperture stuff, I was suggesting starting a package independent of SEP, where we're not constrained by GPL. I've muddled the issue by putting an aperture function in SEP, but pretend that I hadn't. It's not very long and could easily be rewritten from scratch in a new "aperture photometry-only" library, licensed as we see fit.
For the source extraction & background stuff, it would be nice to use a relicensed SEP if possible. I'll ask Emmanuel for a more permissive license first, definitely.
I agree that it would be best if SExtractor were written by implementing a library and building an executable on top of that. However, that is extremely unlikely to happen. SExtractor is just not written this way and massive structural changes were necessary in order to "extract" the functionality into something that stands on its own.
LGPL-related question: Isn't wcslib LGPL and used in astropy? Or is there another license I'm not aware of?
I'm not going to weigh in on details of the licenses mostly due to ignorance, but my thoughts on the actual code side:
cdef
anything in the key for-loops, and it will probably be easier for other users to contribute improvements/fixes.@kbarbary You're right ... wcslib
is LGPL (see here) and is bundled in Astropy ... so LGPL is OK.
SExtractor is just not written this way and massive structural changes were necessary in order to "extract" the functionality into something that stands on its own.
I only briefly browsed the SExtractor code and you actually did it, so you are probably right. Plus there's the extra issue that (I think) SExtractor has ATLAS and FFTW3 as required dependencies, which means it can be hard to install for some users. But I still think it would be worth to ask about the option to change SExtractor to become a library with Emmanuel Bertin.
@eteq Personally I doubt that Cython is harder to read / write / debug than C code ... I find plain C / C++ simpler and I think there's more astronomers that know C than Cython. So I would be :+1: on @kbarbary's suggestion to make a small library of C functions and wrap them with Cython, but obviously this is a matter of taste and I guess in the end for Astropy and photutils we accept both and whatever they guy that does the work is acceptable?
@eteq
@kbarbary @cdeil - I disagree that the typical user will find C/C++ easier than Cython - for devs it's another story, but if I just walk up and down the halls of my institution, more people use IDL/Python/IRAF/etc. than C, so they'll find Cython more palatable/easier to contribute to. But that probably also depends on the place. And anyway, I agree that whoever is willing to put in the work can decide Cython-wrapped C vs. straight-Cython - in practice it looks pretty similar.
@kbarbary's comment about "only from Python" brings up a related question, though: is there an "audience" for a C-library that does all this independent of the python wrapper? If it's really only going to be used by photutils, it's better for it to be an integral part of photutils, because that makes everything simpler by reducing dependencies, and then Cython vs. Cython-wrapped C is personal preference. But if there's really call for a general C library to do this, that might make sense. But remember that releases and versioning and such are a non-trivial task for this sort of project, so it's effort that would be better spent elsewhere if it doesn't have a payoff...
Note that @mwcraig is starting to use https://pypi.python.org/pypi/Bottleneck in https://github.com/astropy/ccdproc/pull/69/files#diff-7 to get C-like speed for cases like median that are currently implemented with bad performance in Numpy.
When it comes to speed there's so many options each with advantages and disadvantages (e.g. Bottleneck generates huge C files and I've had a few build problems on Mac in the past and it's not clear if it will be maintained 5 or 10 years from now).
(just to clarify, I realize a lot of the work is done by @kbarbary in SEP if we can get the licensing figured out... but it's still an extra wrapper layer, which adds unnecessary complication all other things being equal)
@kbarbary Coming back to the results you posted in the original comment ... why is photutils
slower than sep
by a factor of 5 for a large aperture like 20 pix? Is it slower because the loop over apertures in in Python or because e.g. median (or whatever is done with the pixels) is implemented slowly? Is sep
as fast as SExtractor
for aperture photometry?
@eteq I absolutely agree that it is more work to also maintain the C code as an independent library, and also more complicated if it is kept in a separate repo.
However, such a C library is something that I felt should exist. As photutils would be one application, this PR was partially asking if others feel the same. :)
@kbarbary - Ok, I see what you're saying. I guess I'm suggesting that if the user base for the C layer is a lot smaller than the user base for the photutils
layer, it might make sense to move the C layer into photutils
. But if you don't like that (which I can completely understand), then if we can get some major speedups/new feature in photutils
with a pythonic API wrapping SEP, then I agree it's worth doing, as long as the wrapping is singificantly less work than just implementing from scratch would be (and I don't know SEP enough to judge that)
One other thing: because SEP is on github, it should be pretty straightforward to include it as a submodule in photutils via the new mechanisms in astropy_helpers
. So that's a plus.
FWIW, in ccdproc
we don't actually include bottleneck in the code, though we provide detailed documentation on its use because it is 1000x faster than a masked median in numpy (they are aware of the issue). I really want to just use numpy.
There I thought it was imperative to provide a well documented work-around because potential users would be turned away by a median combine of 10 3kx2k images that took 15 minutes.
Tangential at best: the sextractor
formula in the homebrew/science tap does not use ATLAS, it links to the OSX accelerate framework.
We definitely should extend the photutils/benchmarks scripts over the coming weeks to know where we stand with performance and where the bottlenecks are.
E.g. currently create_prf does use masked numpy arrays as well and might very well be super-slow for the median mode because it calls np.ma.median
... I haven't seen any benchmarking for the PSF code.
@cdeil Regarding performance: I expect that if you kept going to bigger apertures, photutils
would keep getting closer to the performance of sep
, at least up to a point. For r=100
the difference is only 2.5x. In photutils
the looping over pixels within an aperture is already done in cython, so that part should be as fast as the sep
wrapper. The difference is that sep
does the flux and variance sum along the way, whereas photutils
first returns an array giving the aperture overlap for each pixel, then mutiplies by the pixel values and sums everything. sep
does no memory allocation, whereas several arrays need to be allocated in photutils
for each aperture. photutils
also obviously has more Python overhead, including the loop over apertures. I'm not exactly sure of the relevant contribution between all these factors, but I think the temporary arrays play a big role for smaller apertures.
sep
should be as fast as SExtractor, as it is more-or-less doing the exact same thing (at least at the C level). There's no easy way to test this though.
I sent an email to Emmanuel asking about relicensing SEP. We'll see what he says.
SEP can be relicensed to LGPL. Emmanuel sent a nice email and explains:
I can allow you to relicense the derived code to LGPL. Going further is difficult at this time, as the SExtractor code is currently registered (at least in France) and is actually owned by CNRS.
@kbarbary Nice to hear that SEP
will be license-compatible and available for us to use.
I guess we should talk soon how that fits in with the work @bsipocz is doing in photutils
in one of our weekly photutils hangouts.
@kbarbary I only now realised now after your comment at https://github.com/kbarbary/sep/issues/1#issuecomment-45764043 that image input dtype
is an important parameter in this game. If I'm not mistaken your benchmark and sep
use float32
as input, but the first thing photutils
does is convert to float64
? Since aperture photometry is really a very simple operation it might be memory-CPU-throughput limited and this might explain why you find that photutils
is 2.5 times slower for large apertures.
If this is the case I would say that's an unfair comparison and in the end Cython or C should be equally fast, no? So the main advantage of using C is more readable / maintainable code?
@cdeil Do you really think C is more readable and maintainable than [c/p]ython? I would say quite the opposite! :smiley:
Here are timings when compiling seppy with PIXTYPE = double and changing the data array dtype to np.float64
:
r = 3.0
photutils: 123.545885 ms
seppy: 4.940033 ms
r = 5.0
photutils: 131.011009 ms
seppy: 8.491039 ms
r = 10.0
photutils: 153.373003 ms
seppy: 18.358946 ms
r = 20.0
photutils: 200.325012 ms
seppy: 41.104078 ms
r = 100.0
photutils: 963.091850 ms
seppy: 396.421909 ms
@larrybradley I know you were asking @cdeil, but I'll take a crack: I didn't actually mean to argue that pure C and a Cython wrapper is more readable/maintainable than just Cython. Rather, the way the code has been implemented could be simpler and faster if it were done differently. This could be accomplished using either just Cython or a combination of pure C and a Cython wrapper.
I personally would prefer to use the C/Cython route for a few reasons including code reuse outside Python and personal preference. But I wouldn't necessarily argue that it is more maintainable or readable.
@kbarbary @bsipocz @astrofrog and I discussed this issue at today's photutils hangout.
The main conclusion (I think, please add / correct me if I'm wrong) is that:
C
and / or Cython
code for performance-critical parts are very welcome ... everything that is in sep
and seppy
right now, probably even the background modeling stuff would be in scope for photutils
, but @kbarbary doesn't plan to put this in in the near future.photutils
and keep an eye on performance while doing so. Probably the main loss of performance (slower, more memory) in the current photutils
implementation results from the creation of extra arrays, whereas sep
avoids copying data and loops over the pixels only once. It's not clear how this could / should be integrated into photutils ... pull request or issues with concrete suggestions for @bsipocz how to implement it welcome.@larrybradley I did not mean C
code is more maintainable / readable than Python
code, but what @kbarbary mentioned in his last comment.
:+1: to what @cdeil said. Issues would be extremely helpful to see what needs to be implemented.
Circular aperture photometry in my SEP python wrapper is now fully featured, including subpixel sampling, error, gain, masks, even background subtraction in an annulus. See https://github.com/kbarbary/sep-python . Specifically, look under "Usage / aperture photometry" in the README.
Please do try it out (and hopefully marvel at the speed). I even built a conda package for linux (mainly to test the binstar process).
Here's some updated timing comparisons using latest master photutils. subpix=0 is exact mode.
aperture test | sep | photutils | ratio |
---|---|---|---|
circ. r= 3 subpix=0 | 3.31 us/aper | 102.90 us/aper | 31.06 |
circ. r= 5 subpix=0 | 5.32 us/aper | 106.31 us/aper | 19.97 |
circ. r= 10 subpix=0 | 12.08 us/aper | 116.66 us/aper | 9.66 |
circ. r= 20 subpix=0 | 29.36 us/aper | 145.73 us/aper | 4.96 |
circ. r=100 subpix=0 | 353.04 us/aper | 605.63 us/aper | 1.72 |
circ. r= 3 subpix=5 | 2.34 us/aper | 94.94 us/aper | 40.66 |
circ. r= 5 subpix=5 | 4.05 us/aper | 98.05 us/aper | 24.19 |
circ. r= 10 subpix=5 | 9.64 us/aper | 106.26 us/aper | 11.03 |
circ. r= 20 subpix=5 | 24.61 us/aper | 127.87 us/aper | 5.20 |
circ. r=100 subpix=5 | 329.08 us/aper | 518.93 us/aper | 1.58 |
@kbarbary - looks promising! All times are basically <ms for single apertures which is probably fast enough. Could you do tests with more apertures, for example 100 or 1000 or 10000?
I should have said that those results are for 1000 apertures. Here's the timings for a single aperture (looped 1000 times for consistent timing results):
aperture test | sep | photutils | ratio |
---|---|---|---|
circ. r= 3 subpix=0 | 12.97 us/aper | 1197.69 us/aper | 92.33 |
circ. r= 5 subpix=0 | 14.89 us/aper | 1188.07 us/aper | 79.81 |
circ. r= 10 subpix=0 | 20.46 us/aper | 1201.84 us/aper | 58.74 |
circ. r= 20 subpix=0 | 35.98 us/aper | 1227.66 us/aper | 34.12 |
circ. r=100 subpix=0 | 348.30 us/aper | 1664.00 us/aper | 4.78 |
circ. r= 3 subpix=5 | 10.49 us/aper | 1171.89 us/aper | 111.69 |
circ. r= 5 subpix=5 | 11.88 us/aper | 1178.18 us/aper | 99.21 |
circ. r= 10 subpix=5 | 16.24 us/aper | 1185.84 us/aper | 73.01 |
circ. r= 20 subpix=5 | 30.09 us/aper | 1203.98 us/aper | 40.01 |
circ. r=100 subpix=5 | 320.04 us/aper | 1580.47 us/aper | 4.94 |
I do agree that for most people, aperture photometry is not a bottleneck. But in some cases it can be.
@kbarbary - ok, then since those values are for 1000 apertures I agree that there is still work to do. In any case, photutils should now be architectured such that we could consider using sep as a backend instead of the cython code if we like.
But first, could you do the benchmarks but using the functions in:
https://github.com/astropy/photutils/blob/master/photutils/aperture_funcs.py
? It might be that most of the overhead comes from the OO interface.
Just tried using do_circular_photometry
... results are pretty much identical to the full-blown OO interface, so I don't think its adding significant overhead.
@kbarbary - ok, interesting, so therefore we might benefit from switching to SEP internally. I'll take a look to see if there are any obvious bottlenecks in the current code first.
If someone wants to tackle this, it should be easy to first try by having SEP as an external dependency and calling the Python wrapper in SEP.
By now I'm +10 to use SEP internally in photutils to get rid of duplicated code and a split in the developer / user community.
The question is ... does anyone have time to work on integrating it in the coming months?
(sep
would always remain as a standalone C library ... the question is if it really needs it's own set of Python wrappers or if photutils
could be the Python interface to it.)
@cdeil Regarding your parenthetical question, photutils would have to contain something pretty similar to the existing Python wrapper in sep (the whole Python part consists of a single Cython file)... so why not just use it? You could even import all the sep stuff into the photutils namespace whereever it belongs there.
Sure, if using sep
via it's own Python wrapper is simpler, we can do that.
My main point is that we should work together and reduce the low-level and high-level code duplication that already exists between sep
and photutils
(at the level of ~ 1000 lines of code?).
Agreed.
The duplication is pretty much this file in SEP: https://github.com/kbarbary/sep/blob/master/src/overlap.h (575 lines)
Realistically speaking this is not likely to happen. I'm not against to keep the issue, but we could also just close it.
Agreed! :+1:
I'm pleased to see that photutils is getting more attention these days! This can only be good.
This is a suggestion that one could gain considerable benefits by implementing more functionality in pure C, and wrapping it using Cython, particularly in the aperture photometry functionality. Note that this is not a comment on API design. One could always build arbitrarily complex APIs on top of the low-level interface.
There would be three main benefits of writing more in C:
Performance As one example,
aperture_photometry
, as written, creates many temporary arrays that could be avoided by writing more of the function in C. As a demonstration, I have a nascent C library for detection and photometry here: https://github.com/kbarbary/sep . I wrote a simple wrapper using Cython that does circular aperture photometry: https://github.com/kbarbary/seppy . Here is a comparison between photutils and the wrapper, for 1000 apertures in a 2k x 2k image (various aperture sizes):(results from running the file: https://github.com/kbarbary/seppy/blob/master/bench.py)
While this is not exactly a one-to-one comparison (the photutils function has to parse a few more optional parameters) it should give an idea of the magnitude of possible improvement.
Simplicity Some problems are well-suited to array operations. For such problems, writing expressions array-wise makes everything easier to understand. Aperture photometry is not such a problem. The current implementation tries to force the problem into an array-shaped box and as a result, ends up being a lot more complex than need be. I personally find it confusing, and I wrote a lot of it! In contrast, both the C function ( https://github.com/kbarbary/seppy/blob/master/src/aper.c#L53 ) and the wrapper ( https://github.com/kbarbary/seppy/blob/master/sep.pyx#L23 ) for sep are quite straightforward once you get past the fact that you have to read C.
Legacy If we create a pure C library for this purpose, we'll get the added benefit of having a nice C library. This could be used for C programs or wrapped in other languages (we may not be using Python forever...). To my knowledge, a widely used library like this does not yet exist.