weaverba137 / pydl

Library of IDL astronomy routines converted to Python.
BSD 3-Clause "New" or "Revised" License

Astropy Affiliated Package Review #33

Open · astrofrog opened this issue 6 years ago

astrofrog commented 6 years ago

This package has been re-reviewed by the Astropy coordination committee in relation to the Astropy affiliated package ecosystem.

We have adopted a review process for affiliated packages that includes assigning quantitative ‘scores’ (red/orange/green) for different categories of review. You can read more about this process here. (This document, currently in Google Docs, will be moved to the documentation in the near future.) For each of the categories below we have listed the score and have included some comments when the score is not green.

Functionality/Scope: General package
No further comments
Integration with Astropy ecosystem: Good
No further comments
Documentation: Needs work
The documentation at the moment consists only of API docs; it really should include, at a minimum, a narrative introduction and some examples of usage.
Testing: Partial
Test coverage is currently 62%, which could be improved.
Development status: Good
No further comments
Python 3 compatibility: Good
No further comments

Summary/Decision: There are some areas in which this package did not score well with the new review process (in this case the documentation). As per the review guidelines, these areas should be fixed – if review areas stay in the red for too long, we may consider removing the package from the affiliated package list. However, in this case, simply adding some narrative docs would bring the score for the docs up to at least orange, so it should be an easy fix.

If you agree with the above review, please feel free to close this issue. If you have any follow-up questions or disagree with any of the comments above, leave a comment and we can discuss it here. At any point in future you can request a re-review of the package if you believe any of the scores should be updated - contact the coordination committee, and we’ll do a new review. Note that we are in the process of redesigning the http://affiliated.astropy.org page to show these scores (but not the comments). Finally, please keep the title of this issue as-is (“Astropy Affiliated Package Review”) to make it easy to search for affiliated package reviews in future.

weaverba137 commented 6 years ago

@astrofrog et al., thank you for the review. I agree that the documentation needs improvement. That is not hard; it just requires some time. I also agree that test coverage needs improvement. On that point, however, I want to reach out for additional advice. I know I've mentioned this in other forums, but I'm reaching a point where additional coverage would require potentially large, remote test data sets, or where increasing coverage might require "atomizing" existing algorithms into functions so small that performance might degrade. Do you have any resources for people who might be in this situation? I suspect I'm not the only one.

weaverba137 commented 6 years ago

Pinging @astrofrog.

weaverba137 commented 6 years ago

Pinging @astrofrog.

adrn commented 6 years ago

Hey @weaverba137 -- I suspect @astrofrog has silenced his GitHub notifications. I'll post this issue in the Astropy Slack channel.

weaverba137 commented 6 years ago

@adrn, on what thread/channel did you post this issue?

astrofrog commented 6 years ago

Sorry for the delay, too much travel. Back soon and will reply properly to this.

astrofrog commented 6 years ago

@weaverba137 - I don't know of existing resources for the data problems you mention, but I think @aphearin had a similar issue with halotools, so maybe he can comment on how he solved the big data issue? How big are we talking here?

Regarding atomizing the current code, I don't understand this - are there certain parts of the code that would never get executed currently? (and if so, why are they needed?)

weaverba137 commented 6 years ago

@astrofrog, by "big" I'm talking about an amount of data that would almost certainly be impractical to import into a Travis test.

By "atomizing", for example, suppose a particular function is hundreds or thousands of lines. In order for a test to cover the function, the function has to execute. One could imagine a path to increasing test coverage would be to split up that function into many smaller functions, and testing each smaller function, where possible. So we've increased the coverage at the cost of breaking a possibly efficient function into many, many separate function calls. What I'm wondering is, at what point does algorithmic performance trump test coverage?

aphearin commented 6 years ago

@weaverba137 - I had to wrestle with this issue in Halotools. This is a python package that processes cosmological simulations and dark matter halo catalogs, and generates Monte Carlo realizations of synthetic galaxy populations for purposes of cosmological analysis. So, it's very much a data-intensive package with stringent performance demands.

There were two strategies I used to handle unit-testing for cases like you describe:

  1. I wrote some code that generates a fake simulation on-the-fly within the unit test. That let me use Travis to test the correctness of functions requiring a full-tilt halo catalog, without having to write anything at all to disk.
  2. I also wrote tests that actually do ingest real halo catalogs, but there I just use the normal try...except pattern within the unit test, so that the test passes in environments where the data are not available and runs in environments where the data are available. Such tests do not run on Travis, but I still run them locally before accepting relevant PRs. (A rough sketch of both patterns follows this list.)
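
A rough sketch of both patterns (all names, paths, and numbers are invented, not actual Halotools code; I've put a `pytest.skip` inside the `except` block rather than letting the test pass silently, which amounts to the same idea):

```python
import os

import numpy as np
import pytest

# Hypothetical location of a large, local-only catalog.
REAL_CATALOG = os.environ.get("HALO_CATALOG_PATH", "/data/halos/catalog.npy")


@pytest.fixture
def fake_catalog():
    """Pattern 1: fabricate a small, deterministic catalog in memory."""
    rng = np.random.default_rng(seed=42)
    return {
        "mass": rng.lognormal(mean=12.0, sigma=0.5, size=1000),
        "position": rng.uniform(0.0, 250.0, size=(1000, 3)),
    }


def test_mass_function_on_fake_data(fake_catalog):
    # Runs everywhere, including Travis, so it counts toward coverage.
    assert np.all(fake_catalog["mass"] > 0)


def test_mass_function_on_real_data():
    # Pattern 2: only runs where the big catalog actually exists.
    try:
        catalog = np.load(REAL_CATALOG)
    except OSError:
        pytest.skip("real halo catalog not available in this environment")
    assert catalog.size > 0
```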

As for breaking functions apart into atomic pieces, I did quite a lot of this as well (particularly using Cython kernels), and I am not aware of having suffered any performance problems because of it. In all the cases I encountered in Halotools, the overhead of even a dozen small function calls was totally swamped by the actual algebraic operations performed on the ~1-100 GB dataset. So I also have some functions that execute many thousands of lines, but in my case I did not find any performance problems associated with breaking those functions up into many smaller, individually unit-tested functions. However, my use case permitted this design, and it's easy enough to imagine scenarios where this is more difficult than it was for me. Let me know if you'd like a sounding board on a specific example that is giving you trouble; it's possible I encountered a similar obstacle.

weaverba137 commented 6 years ago

@aphearin, thank you, I will take a look at your code. I can certainly understand not running tests if the data are unavailable, but then what actually defines the "official" coverage of the package? My understanding is that the coverage of the package is the number reported by Coveralls.io, and the number reported by Coveralls.io is the result of a Travis test. If certain functions don't run in a Travis test for whatever reason, then the "official" coverage is lower than it could be if all the data were available.

In other words, if one cannot rely on Travis tests to produce an accurate coverage number, how does one inform the Astropy coordination committee of what the coverage actually is?

astrofrog commented 6 years ago

For the purposes of the review, the coverage is not necessarily what is reported by Coveralls, but rather what I could get by running the tests locally, provided there are instructions for running more extensive tests than can be done on Travis. So if a higher coverage can be reached locally, it would make sense to add a section to the docs explaining how to run the tests locally with all the data.
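
For example (the option and marker names here are purely illustrative, not an existing pydl convention), a small `conftest.py` switch could gate the data-hungry tests so that the documented local command is just one extra flag:

```python
# Hypothetical conftest.py fragment -- option/marker names are invented.
import pytest


def pytest_addoption(parser):
    parser.addoption(
        "--big-data",
        action="store_true",
        default=False,
        help="run tests that need the large local data sets",
    )


def pytest_configure(config):
    config.addinivalue_line(
        "markers", "big_data: test needs large local data sets"
    )


def pytest_collection_modifyitems(config, items):
    # Without --big-data, skip every test marked @pytest.mark.big_data.
    if config.getoption("--big-data"):
        return
    skip_big = pytest.mark.skip(reason="needs --big-data and the local data files")
    for item in items:
        if "big_data" in item.keywords:
            item.add_marker(skip_big)
```

Tests marked with `@pytest.mark.big_data` would then run only under something like `pytest --big-data --cov=pydl`, and that command (plus where to obtain the data) is the kind of thing the docs section could describe.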

aphearin commented 6 years ago

In my case, I use the local-machine-only tests for science verification only. The vast majority of my big data tests come from unit-tests executed against the on-the-fly halo catalogs I described. The local-only tests do very little to improve my coveralls coverage, which is high because of the fake halo catalogs. The fake halo catalogs took a little time to set up properly, but they're a core component of my testing.

weaverba137 commented 5 years ago

@astrofrog et al., I finally have some time to work on this; I haven't forgotten. I've already made substantial progress on the documentation.

astrofrog commented 5 years ago

@weaverba137 - great! Just let us know at any point if you want a re-review :)