astropy / photutils

Astropy package for source detection and photometry. Maintainer: @larrybradley
https://photutils.readthedocs.io
BSD 3-Clause "New" or "Revised" License
241 stars 134 forks source link

Use the SEP C library in photutils #67

Closed kbarbary closed 5 years ago

kbarbary commented 10 years ago

I'm pleased to see that photutils is getting more attention these days! This can only be good.

This is a suggestion that one could gain considerable benefits by implementing more functionality in pure C, and wrapping it using Cython, particularly in the aperture photometry functionality. Note that this is not a comment on API design. One could always build arbitrarily complex APIs on top of the low-level interface.

There would be three main benefits of writing more in C:

Performance As one example, aperture_photometry, as written, creates many temporary arrays that could be avoided by writing more of the function in C. As a demonstration, I have a nascent C library for detection and photometry here: https://github.com/kbarbary/sep . I wrote a simple wrapper using Cython that does circular aperture photometry: https://github.com/kbarbary/seppy . Here is a comparison between photutils and the wrapper, for 1000 apertures in a 2k x 2k image (various aperture sizes):

$ python bench.py
r = 3.0
  photutils: 130.003929 ms
  sep wrapper: 4.734993 ms
r = 5.0
  photutils: 138.437033 ms
  sep wrapper: 8.326054 ms
r = 10.0
  photutils: 158.399105 ms
  sep wrapper: 18.796921 ms
r = 20.0
  photutils: 206.316948 ms
  sep wrapper: 41.263103 ms

(results from running the file: https://github.com/kbarbary/seppy/blob/master/bench.py)

While this is not exactly a one-to-one comparison (the photutils function has to parse a few more optional parameters) it should give an idea of the magnitude of possible improvement.

Simplicity Some problems are well-suited to array operations. For such problems, writing expressions array-wise makes everything easier to understand. Aperture photometry is not such a problem. The current implementation tries to force the problem into an array-shaped box and as a result, ends up being a lot more complex than need be. I personally find it confusing, and I wrote a lot of it! In contrast, both the C function ( https://github.com/kbarbary/seppy/blob/master/src/aper.c#L53 ) and the wrapper ( https://github.com/kbarbary/seppy/blob/master/sep.pyx#L23 ) for sep are quite straightforward once you get past the fact that you have to read C.

Legacy If we create a pure C library for this purpose, we'll get the added benefit of having a nice C library. This could be used for C programs or wrapped in other languages (we may not be using Python forever...). To my knowledge, a widely used library like this does not yet exist.

cdeil commented 10 years ago

:+1: to having C code for core numerical algorithms where we get an order of magnitude speedup and more readable code.

We already have Cython extensions, so Cython-wrapped C doesn't make photutils harder to build and there's enough Astropy-devs that know C, so maintaining that code should not be an issue.

The main issues I see with sep is that at the moment it's GPL licensed ... you'd have to change to BSD before we can use it in photutils.

@kbarbary You mention you'd like to keep sep as a separate repo, as a C library that can be used in other places and I agree that's the right way to do it. But it's not clear to me what the best approach is for photutils:

I think maybe bundling it in photutils/cextern might be the best option because it's the most convenient for the end-user.

@kbarbary @astrofrog @eteq @bsipocz What do you think?

larrybradley commented 10 years ago

It looks like a lot of the C code came from SExtractor, i.e. sep is GPL because SExtractor is GPL. We'd likely need to write new C code from scratch (or convince SExtractor to go to BSD).

cdeil commented 10 years ago

SExtractor just changed from CeCILL to GPL recently ... I doubt they'd want to change license again. But I think the functions @kbarbary would like to implement in C for photutils are very simple and not hard to write from scratch ... @kbarbary do you have interest to do this in photutils (or another external BSD-licensed package)?

astrofrog commented 10 years ago

+1 to developing a small core C library that photutils could then provide a higher-level interface to. If we can develop the high-level interface and make sure it is well tested, then it should be easy to substitute out functions later with calls to this C library. This means that we should try and make sure we separate out the 'geometrical' code in photutils from the high-level API (which I've been advocating anyway).

kbarbary commented 10 years ago

Great to hear that there's some interest in this! There's really two separate issues regardling licenses and libraries that I've unfortunately compounded here:

For aperture photometry, it is simple enough that not a lot is gained by borrowing from SExtractor: we can write out own (small) library and license it however we want.

The more complicated issue is if we want to borrow the source extraction and background subtraction code from SExtractor. These are not simple to recreate (particularly source extraction), and I think they would be valuable to have in a library, which is why I started SEP. They're optimized and memory efficient. The license is a problem and @larrybradley is correct that SEP is GPL licensed because it is a direct adaptation of SExtractor. I didn't know that SExtractor recently relicensed (thanks for the info, @cdeil ) but I don't think we would have to ask SExtractor to relicense again. According to AUTHORS in SExtractor, Emmanuel Bertin is the only author. We would only need to ask him if we can relicense the specific parts of the code that have been adapted. He may be amenable to at least a LGPL license, which I think would be OK for us?

For my part, I'd be interested in starting and maintaining the C library, hopefully with the advice of others who are C experts - I'm not.

cdeil commented 10 years ago

I'm not sure LGPL would be OK for an "upstream" package that photutils calls into. This diagram says it's not, but there are other resources on the web that says they are compatible.

A package with the same BSD license as Astropy and many other Astro packages would be best ... @kbarbary if you contact Emmanuel Bertin, why not ask for that first?

Starting a C library with SExtractor functionality like sep is a big project and should be maintained for the next decade if we base the future astropy.photometry on it ... maybe it would be better to implement an option to the SExtractor repo itself that creates a shared library instead of only producing object files and a binary?

Although we still wouldn't be able to call it from photutils as long as it's GPL licensed. Or @kbarbary, did you mean starting a C library with just aperture photometry functions independenly of sep?

kbarbary commented 10 years ago

For the aperture stuff, I was suggesting starting a package independent of SEP, where we're not constrained by GPL. I've muddled the issue by putting an aperture function in SEP, but pretend that I hadn't. It's not very long and could easily be rewritten from scratch in a new "aperture photometry-only" library, licensed as we see fit.

For the source extraction & background stuff, it would be nice to use a relicensed SEP if possible. I'll ask Emmanuel for a more permissive license first, definitely.

I agree that it would be best if SExtractor were written by implementing a library and building an executable on top of that. However, that is extremely unlikely to happen. SExtractor is just not written this way and massive structural changes were necessary in order to "extract" the functionality into something that stands on its own.

LGPL-related question: Isn't wcslib LGPL and used in astropy? Or is there another license I'm not aware of?

eteq commented 10 years ago

I'm not going to weigh in on details of the licenses mostly due to ignorance, but my thoughts on the actual code side:

cdeil commented 10 years ago

@kbarbary You're right ... wcslib is LGPL (see here) and is bundled in Astropy ... so LGPL is OK.

SExtractor is just not written this way and massive structural changes were necessary in order to "extract" the functionality into something that stands on its own.

I only briefly browsed the SExtractor code and you actually did it, so you are probably right. Plus there's the extra issue that (I think) SExtractor has ATLAS and FFTW3 as required dependencies, which means it can be hard to install for some users. But I still think it would be worth to ask about the option to change SExtractor to become a library with Emmanuel Bertin.

cdeil commented 10 years ago

@eteq Personally I doubt that Cython is harder to read / write / debug than C code ... I find plain C / C++ simpler and I think there's more astronomers that know C than Cython. So I would be :+1: on @kbarbary's suggestion to make a small library of C functions and wrap them with Cython, but obviously this is a matter of taste and I guess in the end for Astropy and photutils we accept both and whatever they guy that does the work is acceptable?

kbarbary commented 10 years ago

@eteq

eteq commented 10 years ago

@kbarbary @cdeil - I disagree that the typical user will find C/C++ easier than Cython - for devs it's another story, but if I just walk up and down the halls of my institution, more people use IDL/Python/IRAF/etc. than C, so they'll find Cython more palatable/easier to contribute to. But that probably also depends on the place. And anyway, I agree that whoever is willing to put in the work can decide Cython-wrapped C vs. straight-Cython - in practice it looks pretty similar.

@kbarbary's comment about "only from Python" brings up a related question, though: is there an "audience" for a C-library that does all this independent of the python wrapper? If it's really only going to be used by photutils, it's better for it to be an integral part of photutils, because that makes everything simpler by reducing dependencies, and then Cython vs. Cython-wrapped C is personal preference. But if there's really call for a general C library to do this, that might make sense. But remember that releases and versioning and such are a non-trivial task for this sort of project, so it's effort that would be better spent elsewhere if it doesn't have a payoff...

cdeil commented 10 years ago

Note that @mwcraig is starting to use https://pypi.python.org/pypi/Bottleneck in https://github.com/astropy/ccdproc/pull/69/files#diff-7 to get C-like speed for cases like median that are currently implemented with bad performance in Numpy.

When it comes to speed there's so many options each with advantages and disadvantages (e.g. Bottleneck generates huge C files and I've had a few build problems on Mac in the past and it's not clear if it will be maintained 5 or 10 years from now).

eteq commented 10 years ago

(just to clarify, I realize a lot of the work is done by @kbarbary in SEP if we can get the licensing figured out... but it's still an extra wrapper layer, which adds unnecessary complication all other things being equal)

cdeil commented 10 years ago

@kbarbary Coming back to the results you posted in the original comment ... why is photutils slower than sep by a factor of 5 for a large aperture like 20 pix? Is it slower because the loop over apertures in in Python or because e.g. median (or whatever is done with the pixels) is implemented slowly? Is sep as fast as SExtractor for aperture photometry?

kbarbary commented 10 years ago

@eteq I absolutely agree that it is more work to also maintain the C code as an independent library, and also more complicated if it is kept in a separate repo.

However, such a C library is something that I felt should exist. As photutils would be one application, this PR was partially asking if others feel the same. :)

eteq commented 10 years ago

@kbarbary - Ok, I see what you're saying. I guess I'm suggesting that if the user base for the C layer is a lot smaller than the user base for the photutils layer, it might make sense to move the C layer into photutils. But if you don't like that (which I can completely understand), then if we can get some major speedups/new feature in photutils with a pythonic API wrapping SEP, then I agree it's worth doing, as long as the wrapping is singificantly less work than just implementing from scratch would be (and I don't know SEP enough to judge that)

One other thing: because SEP is on github, it should be pretty straightforward to include it as a submodule in photutils via the new mechanisms in astropy_helpers. So that's a plus.

mwcraig commented 10 years ago

FWIW, in ccdproc we don't actually include bottleneck in the code, though we provide detailed documentation on its use because it is 1000x faster than a masked median in numpy (they are aware of the issue). I really want to just use numpy.

There I thought it was imperative to provide a well documented work-around because potential users would be turned away by a median combine of 10 3kx2k images that took 15 minutes.

Tangential at best: the sextractor formula in the homebrew/science tap does not use ATLAS, it links to the OSX accelerate framework.

cdeil commented 10 years ago

We definitely should extend the photutils/benchmarks scripts over the coming weeks to know where we stand with performance and where the bottlenecks are. E.g. currently create_prf does use masked numpy arrays as well and might very well be super-slow for the median mode because it calls np.ma.median ... I haven't seen any benchmarking for the PSF code.

kbarbary commented 10 years ago

@cdeil Regarding performance: I expect that if you kept going to bigger apertures, photutils would keep getting closer to the performance of sep, at least up to a point. For r=100 the difference is only 2.5x. In photutils the looping over pixels within an aperture is already done in cython, so that part should be as fast as the sep wrapper. The difference is that sep does the flux and variance sum along the way, whereas photutils first returns an array giving the aperture overlap for each pixel, then mutiplies by the pixel values and sums everything. sep does no memory allocation, whereas several arrays need to be allocated in photutils for each aperture. photutils also obviously has more Python overhead, including the loop over apertures. I'm not exactly sure of the relevant contribution between all these factors, but I think the temporary arrays play a big role for smaller apertures.

sep should be as fast as SExtractor, as it is more-or-less doing the exact same thing (at least at the C level). There's no easy way to test this though.

kbarbary commented 10 years ago

I sent an email to Emmanuel asking about relicensing SEP. We'll see what he says.

kbarbary commented 10 years ago

SEP can be relicensed to LGPL. Emmanuel sent a nice email and explains:

I can allow you to relicense the derived code to LGPL. Going further is difficult at this time, as the SExtractor code is currently registered (at least in France) and is actually owned by CNRS.

cdeil commented 10 years ago

@kbarbary Nice to hear that SEP will be license-compatible and available for us to use. I guess we should talk soon how that fits in with the work @bsipocz is doing in photutils in one of our weekly photutils hangouts.

cdeil commented 10 years ago

@kbarbary I only now realised now after your comment at https://github.com/kbarbary/sep/issues/1#issuecomment-45764043 that image input dtype is an important parameter in this game. If I'm not mistaken your benchmark and sep use float32 as input, but the first thing photutils does is convert to float64? Since aperture photometry is really a very simple operation it might be memory-CPU-throughput limited and this might explain why you find that photutils is 2.5 times slower for large apertures. If this is the case I would say that's an unfair comparison and in the end Cython or C should be equally fast, no? So the main advantage of using C is more readable / maintainable code?

larrybradley commented 10 years ago

@cdeil Do you really think C is more readable and maintainable than [c/p]ython? I would say quite the opposite! :smiley:

kbarbary commented 10 years ago

Here are timings when compiling seppy with PIXTYPE = double and changing the data array dtype to np.float64:

r = 3.0
  photutils: 123.545885 ms
  seppy: 4.940033 ms
r = 5.0
  photutils: 131.011009 ms
  seppy: 8.491039 ms
r = 10.0
  photutils: 153.373003 ms
  seppy: 18.358946 ms
r = 20.0
  photutils: 200.325012 ms
  seppy: 41.104078 ms
r = 100.0
  photutils: 963.091850 ms
  seppy: 396.421909 ms
kbarbary commented 10 years ago

@larrybradley I know you were asking @cdeil, but I'll take a crack: I didn't actually mean to argue that pure C and a Cython wrapper is more readable/maintainable than just Cython. Rather, the way the code has been implemented could be simpler and faster if it were done differently. This could be accomplished using either just Cython or a combination of pure C and a Cython wrapper.

I personally would prefer to use the C/Cython route for a few reasons including code reuse outside Python and personal preference. But I wouldn't necessarily argue that it is more maintainable or readable.

cdeil commented 10 years ago

@kbarbary @bsipocz @astrofrog and I discussed this issue at today's photutils hangout.

The main conclusion (I think, please add / correct me if I'm wrong) is that:

cdeil commented 10 years ago

@larrybradley I did not mean C code is more maintainable / readable than Python code, but what @kbarbary mentioned in his last comment.

bsipocz commented 10 years ago

:+1: to what @cdeil said. Issues would be extremely helpful to see what needs to be implemented.

kbarbary commented 10 years ago

Circular aperture photometry in my SEP python wrapper is now fully featured, including subpixel sampling, error, gain, masks, even background subtraction in an annulus. See https://github.com/kbarbary/sep-python . Specifically, look under "Usage / aperture photometry" in the README.

Please do try it out (and hopefully marvel at the speed). I even built a conda package for linux (mainly to test the binstar process).

kbarbary commented 9 years ago

Here's some updated timing comparisons using latest master photutils. subpix=0 is exact mode.

aperture test sep photutils ratio
circ. r= 3 subpix=0 3.31 us/aper 102.90 us/aper 31.06
circ. r= 5 subpix=0 5.32 us/aper 106.31 us/aper 19.97
circ. r= 10 subpix=0 12.08 us/aper 116.66 us/aper 9.66
circ. r= 20 subpix=0 29.36 us/aper 145.73 us/aper 4.96
circ. r=100 subpix=0 353.04 us/aper 605.63 us/aper 1.72
circ. r= 3 subpix=5 2.34 us/aper 94.94 us/aper 40.66
circ. r= 5 subpix=5 4.05 us/aper 98.05 us/aper 24.19
circ. r= 10 subpix=5 9.64 us/aper 106.26 us/aper 11.03
circ. r= 20 subpix=5 24.61 us/aper 127.87 us/aper 5.20
circ. r=100 subpix=5 329.08 us/aper 518.93 us/aper 1.58
astrofrog commented 9 years ago

@kbarbary - looks promising! All times are basically <ms for single apertures which is probably fast enough. Could you do tests with more apertures, for example 100 or 1000 or 10000?

kbarbary commented 9 years ago

I should have said that those results are for 1000 apertures. Here's the timings for a single aperture (looped 1000 times for consistent timing results):

aperture test sep photutils ratio
circ. r= 3 subpix=0 12.97 us/aper 1197.69 us/aper 92.33
circ. r= 5 subpix=0 14.89 us/aper 1188.07 us/aper 79.81
circ. r= 10 subpix=0 20.46 us/aper 1201.84 us/aper 58.74
circ. r= 20 subpix=0 35.98 us/aper 1227.66 us/aper 34.12
circ. r=100 subpix=0 348.30 us/aper 1664.00 us/aper 4.78
circ. r= 3 subpix=5 10.49 us/aper 1171.89 us/aper 111.69
circ. r= 5 subpix=5 11.88 us/aper 1178.18 us/aper 99.21
circ. r= 10 subpix=5 16.24 us/aper 1185.84 us/aper 73.01
circ. r= 20 subpix=5 30.09 us/aper 1203.98 us/aper 40.01
circ. r=100 subpix=5 320.04 us/aper 1580.47 us/aper 4.94
kbarbary commented 9 years ago

I do agree that for most people, aperture photometry is not a bottleneck. But in some cases it can be.

astrofrog commented 9 years ago

@kbarbary - ok, then since those values are for 1000 apertures I agree that there is still work to do. In any case, photutils should now be architectured such that we could consider using sep as a backend instead of the cython code if we like.

But first, could you do the benchmarks but using the functions in:

https://github.com/astropy/photutils/blob/master/photutils/aperture_funcs.py

? It might be that most of the overhead comes from the OO interface.

kbarbary commented 9 years ago

Just tried using do_circular_photometry... results are pretty much identical to the full-blown OO interface, so I don't think its adding significant overhead.

astrofrog commented 9 years ago

@kbarbary - ok, interesting, so therefore we might benefit from switching to SEP internally. I'll take a look to see if there are any obvious bottlenecks in the current code first.

astrofrog commented 9 years ago

If someone wants to tackle this, it should be easy to first try by having SEP as an external dependency and calling the Python wrapper in SEP.

cdeil commented 9 years ago

By now I'm +10 to use SEP internally in photutils to get rid of duplicated code and a split in the developer / user community.

The question is ... does anyone have time to work on integrating it in the coming months?

(sep would always remain as a standalone C library ... the question is if it really needs it's own set of Python wrappers or if photutils could be the Python interface to it.)

kbarbary commented 9 years ago

@cdeil Regarding your parenthetical question, photutils would have to contain something pretty similar to the existing Python wrapper in sep (the whole Python part consists of a single Cython file)... so why not just use it? You could even import all the sep stuff into the photutils namespace whereever it belongs there.

cdeil commented 9 years ago

Sure, if using sep via it's own Python wrapper is simpler, we can do that. My main point is that we should work together and reduce the low-level and high-level code duplication that already exists between sep and photutils (at the level of ~ 1000 lines of code?).

kbarbary commented 9 years ago

Agreed.

The duplication is pretty much this file in SEP: https://github.com/kbarbary/sep/blob/master/src/overlap.h (575 lines)

bsipocz commented 5 years ago

Realistically speaking this is not likely to happen. I'm not against to keep the issue, but we could also just close it.

kbarbary commented 5 years ago

Agreed! :+1: