quatrope / astroalign

A tool to align astronomical images based on asterism matching
MIT License
142 stars 44 forks source link

add: bottleneck support #75

Closed KyleLeneau closed 2 years ago

KyleLeneau commented 2 years ago

Recently I have been working on a project that uses astroalign, ccdproc and astropy. In the former two I had discovered that bottleneck was an easy drop in replacement for some numpy functions and had impressive results. I had been wondering if astroalign could benefit from the same and turns out it easily can, so that's what I started here.

What do you think?

In the process I found it a little hard to work with the repo and know what dependencies if needed and since I am using pipenv and pyenv I thought it would be helpful to add support for pip dependencies and virtual environments.

This PR is a WIP but I wanted to get it raised ASAP for feedback and see if there was an appetite for this and what I might be missing in order to get it merged for others.

martinberoiz commented 2 years ago

Hi Kyle,

Thanks for your interest. I did not know about bottleneck. I looked it up, seems interesting. The big question is how much speed up would I get and if it's worth the effort.

Aside from that I thought maybe it could be faster if you did something like this inside the main module astroalign:

median = bn.nanmedian if HAS_BOTTLENECK else _np.nanmedian
mean = bn.nanmean if HAS_BOTTLENECK else _np.nanmean
... #etc          

and then just use median and mean wherever. I'm worried about the cost of the function call in your PR.

Also I don't really use poetry (I assume that's what the Pipfile is from?). I just use plain old setup.py, that's where I set the dependencies.

In any case, I want to do tests with bottleneck first and try it out. To be honest, I'm usually reluctant to introduce dependencies, unless they are actually needed. Maybe this could be an optional dependency.

KyleLeneau commented 2 years ago

Hey Martin,

Thanks for the feedback. The speed increase I am seeing in other projects like ccdproc and astroypy are in the range of 3x to 5x increase. So it can be a lot faster if the consumer of the library wants it.

I will change to what you suggested for methods to avoid the function. I had just copied what astropy and ccdproc had done initially.

The Pipfile is from pipenv which is compatible with setuptools but make specifying dependencies and using a virtualenv easier. The trouble I had with the repo is that I had to hunt down what libraries were needed to run the tests benchmarks, also your setup.py is missing a number of dev dependencies to run all the code. I will clean this up a bit more and add to the readme so you can see how it's totally optional but really helps new contributors.

Totally understand on wanting to do some testing of your own on performance. I would like to understand from you what the benchmarks you had run before, what was the system they were run on, that way we can see a before and after.

Also, just to be clear this does not introduce a required dependency for users of astroalign and it is 100% optional. If you have installed the bottleneck dependency then the HAS_BOTTLENECK will be true and bottleneck functions will be used otherwise the default numpy functions will be used.

KyleLeneau commented 2 years ago

@martinberoiz I have pushed some changes over top of my last commit (for clean history)

KyleLeneau commented 2 years ago

@martinberoiz any update on this?

martinberoiz commented 2 years ago

Hi, thanks for the reminder! Sorry I forgot all about this.

I checked a bit more and I use the mean once and median twice, but if it can be sped up without adding dependencies, sure why not?

But I'll ask you to please don't add extra files, just add your changes on astroalign.py file, I think that should work. To install you can try doing pip install . or pip install -e . on the project root folder.

Let me know if you need help.

KyleLeneau commented 2 years ago

@martinberoiz I have gone ahead and removed the extra files for this repo. I had added them because I switch python versions a lot and don't ever want to rely on my systems installed version =) I can make do without them though.

Let me know what else you'd like to see to get this merged. Thanks!

martinberoiz commented 2 years ago

Looks good to me, there's only some minor style issues that I can fix later. I'll merge and make a patch release so that is public. Thanks for your contribution!

KyleLeneau commented 2 years ago

@martinberoiz what's the ETA on getting a patch release out with this?

martinberoiz commented 2 years ago

Hi Kyle, the patch is public in version 2.4.1. It's up on pypi.