Open teald opened 6 months ago
Hi there! Thank you for submitting your package for pyOpenSci review. Below are the basic checks that your package needs to pass to begin our review. If some of these are missing, we will ask you to work on them before the review process begins.
Please check our Python packaging guide for more information on the elements below.
import package
.
The package installation does not install the dependenciesREADME.md
file with clear explanation of what the package does, instructions on how to install it, and a link to development instructions.CONTRIBUTING.md
file that details how to install and contribute to the package.CODE_OF_CONDUCT.md
file.NOTE: We prefer that you have development instructions in your documentation too.
YAML
header of the issue (located at the top of the issue template).Nothing to say from the EiC initial perspective, it's a strong submission!
Hey there @teald,
@hamogu will take over this submission as the editor 🫶 You are in good hands, happy reviewing y'all!
:wave: Hi @aaryapatil and @mwcraig! Thank you for volunteering to review for pyOpenSci!
Before beginning your review, please fill out our pre-review survey. This helps us improve all aspects of our review and better understand our community. No personal data will be shared from this survey - it will only be used in an aggregated format by our Executive Director to improve our processes and programs.
The following resources will help you complete your review:
Reviewers: @aaryapatil @mwcraig Due date: end of June
@teald : @aaryapatil @mwcraig have agreed to review your package! Thanks! Links to the review guide and the template to get them started are in the post above; and of course they might open issues (or PRs!) in your repro if they have suggestions. In the meantime, feel free to reach out to me with any questions you might have.
@aaryapatil @mwcraig : I remember you both told me that you would need a little longer than the "three weeks" that we usually allocate to the first round of review and with the holiday coming up in the US, I understand that you are probably busy with other things this week. However, if you can make a realistic estimate when you can find time to review, it would be great if you could post here, so that the package authors know where we are in the process. Thank you so much for agreeing to review this package!
@hamogu Thank you for the ping. I am working on the review now but won't have much time this week. Realistically, I should have my review done by next week. Hope that is okay!
UPDATE: The rest of the review will be completed by July 16.
@hamogu -- thanks for the reminder; I should be able to get it done by the end of the day on Monday, July 15.
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
The package includes all the following forms of documentation:
add_header_to_table
are well described elsewhere in the documentation. The only mention of add_header_to_table
was in the MEF documentation and doesn't include any explanation. I guess the main thing I'm confused about is whether add_header_to_table
adds an empty header that the user then needs to fill or whether it adds already-existing data.add_header_to_table
and it could be addressed by saying a little more in the documentation about it.pyproject.toml
file or elsewhere. 👈
pyproject.toml
that points to the project home on github and to the documentation would be great. That would add a "Project links" section in pypi. See the astropy pyproject.toml
for how to add the URLsReadme file requirements The package meets the readme requirements below:
The README should include, from top to bottom:
NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the a badge for pyOpenSci peer-review will be provided upon acceptance.)
CITATION.md
, but consider adding a link to it in the README.md
Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole. Package structure should follow general community best-practices. In general please consider whether:
tests/test_object_construction.py::test_append_array_to_root_no_name ERROR
tests/test_object_construction.py::test_append_array_to_root_with_name_sci ERROR
tests/test_object_construction.py::test_append_array_to_root_with_arbitrary_name ERROR
tox
started doing its thing, but I would have liked to have restricted the tests to just python 3.11. tox
tried to do several versions.pylint
. Running pylint astrodata
on the source generates a number of errors, but these are almost all complaints about code complexity rather than format. More explicit instructions for running the lint test would be helpful, but the formatting is fine.Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.
The package contains a paper.md
matching JOSS's requirements with:
Estimated hours spent reviewing:
@teald this is ready for a look
Thanks for submitting this! 🎉 I am really excited about the potential of this package to fill some holes in the ecosystem. There are some details below in the "folded" sections that should be straightforward to address. Here are the high-level comments (background: I'm one of the maintainers for astropy.nddata
and the sole maintainer for ccdproc
):
astropy.nddata
? Or to ccdproc
?astrodata
or try to integrate usage of astrodata
into it? ccdproc
has never had a good way of handling MEF files, which is faintly ridiculous (I'm the maintainer of ccdproc
so I'm looking in the mirror rather throwing stones here).astrodata
provides a way to abstract images and metadata from the underlying way they are stored, which is something that none of the current tools that I'm aware of provide. It may very well not make sense to upstream any of this.astrodata
that goes beyond just adding properties and tags? In otherwords, once I have done those things what does astrodata
do for me? I'm not suggesting a full reduction pipeline here (DRAGONS does that) but something that shows a step or two of processing files using would be helpful.Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide
The package includes all the following forms of documentation:
pyproject.toml
file or elsewhere.
pyproject.toml
. Refer to the packaging guide here.Readme file requirements The package meets the readme requirements below:
The README should include, from top to bottom:
NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the a badge for pyOpenSci peer-review will be provided upon acceptance.)
astrodata
beyond using astropy.io.fits
alone. The current text only briefly mentions the extendability of astrodata
.CITATION.md
is present but hasn't been referenced in the README.md
. The README.md
is the first point of entry into the project GitHub, so adding a citation there is important.Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole. Package structure should follow general community best-practices. In general please consider whether:
poetry run tox -rp
(as suggested in the quickstart) works well, but skips the coverage, probably because a specific version of python wasn't installed. Traceback is available below. It would be useful to add more information for testing a specific version of python. Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.
The package contains a paper.md
matching JOSS's requirements with:
Estimated hours spent reviewing: 10
Thank you for submitting this package @teald. I have completed my first pass, so you can go through it now. The package is already in great shape! If there are any questions, let me know. I will also be in touch in case something else comes up. I am looking forward to further using astrodata
.
Pull requests opened as part of review:
Thank you both for your reviews! This feedback has been incredibly useful. I will continue working on these points and will respond to more general comments later this week or next week.
Update August 6: things have been taking a bit longer than anticipated. I'm hoping to have this wrapped up & a response ~this upcoming Monday (Aug. 12)
Update August 15: there have been some setbacks, I will just respond when this is finished. Hopefully soon 🤞 without further issues. Thanks for your patience!
Update September 6: We are waiting on feedback from a collaborator/contributor on the reviewer response, and should have it ready next week :)
Hi all, I think I've addressed everything covered by the review. Thank you both for your PRs, issues, and notes here. They've helped immensely so far!
I'll itemize (for ease of reference) specific response points and how I've fixed them, then summarize other changes to the code.
These are in order of reading the responses in the checklist above, with the ~sections in (parentheses).
add_header_to_table
has been deprecated, as its functionality is no longer used. It will be removed in version 3.0.0.pyproject.toml
nox
).pre-commit
alone.lint.yml
workflow just runs pre-commit
to keep linting settings in pyproject.toml
/.pre-commit-config.yaml
nox -s initialize_pre_commit
, automatically called by nox -s devshell
and nox -s devconda
to make it easier to set up the developer environment with pre-commit
. This is still under testing.astrodata
in the README
All issues raised as part of the initial review have been addressed:
All PRs have been merged into the main code:
Thank you for your contributions!
tox
for testing to nox
for testing and general automation. This was already planned for after this review to better support conda
testing, but it made more sense to get this done now.1. What is your conception of how this should interact with the rest of the ecosystem (thinking mostly about nddata and ccdproc here)?
astrodata
uses astropy.nddata
for most of its core arithmetic functionality, including a number of the same, or slightly modified, mixins. There are likely some points of consolidation (see below) between the two that may happen. As for ccdproc, I think with CCDData
already inheriting NDDataArray
(which is similar to NDAstroData
), there are opportunities for integration there. I would need to spend some more time thinking about that and looking at source, though.
Since astrodata
provides support for MEF, as you pointed out in another question, as well as meta-data abstraction that is closely tied to the representation of the data compared to other tools, this streamlines the process of associating data and metadata from a user perspective. Users do not have to worry about managing the overhead that arises from handling lists of files and metadata tables. The descriptors provide a standardized way to create generic processing tools that hide boilerplate from the user, and tags allow for organization based on metadata properties or other traits of the data.
2. Does it make sense to upstream any of this (like the arithmetic handling or allowing for any WCS, not just an astropy.wcs) to astropy.nddata? Or to ccdproc?
There are plans to upstream some of the work done in the astrodata.wcs module to gwcs, specifically regarding the conversion between gwcs objects and their representation as FITS keywords, and then use gwcs throughout astrodata. But, that work hasn't been started yet and it's not clear when the resources will be available. I think that's probably the better avenue for more generic WCS support than astropy.nddata
.
As mentioned above, I think there are some good opportunities for taking some of the handling done by astrodata
and integrating it into, e.g., CCDData
, where astrodata.AstroData
objects naturally fit into the existing NDDataArray
dependencies. I'm sure there are nuances there that would need to be sorted, but I could see CCDData
/ccdproc using some of the features of astrodata
to enhance their current functionality.
We could also consider how astrodata
might assist/impact the goals of the closed APE 11. However, I think that'd require larger-scale coordination after this review has concluded (as part of APE PR #100).
3. Does it make sense for ccdproc to depend on astrodata or try to integrate usage of astrodata into it? ccdproc has never had a good way of handling MEF files, which is faintly ridiculous (I'm the maintainer of ccdproc so I'm looking in the mirror rather throwing stones here).
It depends on the goals of ccdproc moving forward. For MEFs it's feasible to break images up from a list into individual images ccdproc can then process. astrodata
may, at the end of the day, be excessive for the work ccdproc does in its current scope.
4. My take is that astrodata provides a way to abstract images and metadata from the underlying way they are stored, which is something that none of the current tools that I'm aware of provide. It may very well not make sense to upstream any of this.
I think the biggest consideration is whether resources required to make these changes are worth the benefits themselves. There are natural places where astrodata
's abstraction would help generalize/make other Astropy packages more flexible.
Right now, though, the work required to share data between, e.g., astrodata
, astropy.nddata
, and ccdproc is straightforward but requires some management between them that could be reduced to utility methods. I wrote up a quick, simple example to see how working with both astrodata
and ccdproc
went:
This is a pretty minimal example, of course, but I think cases where the two interact can be managed primarily through accessing underlying data and metadata directly, rather than creating outright support for astrodata
within the other packages.
5. Would it be possible to provide a small example of how to develop a processing tool with astrodata that goes beyond just adding properties and tags? In otherwords, once I have done those things what does astrodata do for me? I'm not suggesting a full reduction pipeline here (DRAGONS does that) but something that shows a step or two of processing files using would be helpful.
This is still being worked on. I'd like this example to be relatively complete in overview of how astrodata
works, and I've been taking time coming up with something suitable. I think it's unavoidable to have it be somewhat complex, since astrodata
and the properties and tags drive much of what users and developers will commonly interact with.
Development is being tracked in this issue. For now, the User Manual and Programmer's manuals have a bit more in-depth explanation, though I agree a more concrete, featureful example would be ideal.
Thank you for your review and feedback! Let me know if you have any further questions or comments.
I see a very detailed reply from the package author thanks to @teald for explaining so detailed what work you have done to address the review!
@aaryapatil and @mwcraig: Please have a look and if you agree that that addresses all your concerns, please tick the box "The author has responded to my review and made changes to my satisfaction. I recommend approving this package." by editing the message of your review (and I appreciate a small note here, because GH will notify me if you post a new comment, but not if you edit an existing comment); if there is more to be done, please reply here so that we all know.
@aaryapatil and @mwcraig : I know we all can get busy at times, but it would be really great if you could check if the changes and replies by @teald address your concerns. If you think that the package should be accepted now, please edit your response above to tick the box "The author has responded to my review and made changes to my satisfaction. I recommend approving this package." by editing the message of your review (and I appreciate a small note here, because GH will notify me if you post a new comment, but not if you edit an existing comment); if there is more to be done, please reply here so that we all know.
If for some reason, you can't get to check that in the next two weeks or so, please let me know, so that we can proceed with this review in some other way! Thanks for your help and apologies to @teald that I'm letting you wait this long!
@hamogu -- thanks for the reminder. I should be able to take a look no later than monday.
Hi @mwcraig and @aaryapatil It would be really great if you find time to check if your review has been answered. I understand that everyone is busy, but you've already spend som much time on the initial review, this should be much faster. Please reach out to me directly if you can't do that at all for any reason. @teald wrote a detailed response two months ago and I don't want to let them hanging. Please let me know if there is anything I can do to move this along.
I have checked that my review has been answered and have edited my response above. Thank you for the detailed response, @teald, and my sincere apologies for the delay, @hamogu. This unfortunately slipped through my attention.
Submitting Author: D.J. Teal (@teald) All current maintainers: @teald, @chris-simpson, @jehturner Package Name: astrodata One-Line Description of Package: Common interface for astronomical data products. Repository Link: https://github.com/GeminiDRSoftware/astrodata Version submitted: 2.9.2 Editor: @hamogu Reviewer 1: @aaryapatil Reviewer 2: @mwcraig Archive: TBD JOSS DOI: TBD Version accepted: TBD Date accepted (month/day/year): TBD
Code of Conduct & Commitment to Maintain Package
Description
astrodata
is a package meant to facilitate developing common interfaces for astronomical data formats. Often, specific instruments and models will have different ways of storing their data, including metadata.astrodata
offers a single interface that uses metadata to resolve these disparate file formats, while enabling common operations and values to share the same interface. The abstraction is meant to be conceptually simple and meaningful to scientists.Previously,
astrodata
was a core module within the DRAGONS package, and has been used for the various instruments that exist at the Gemini Observatory. It has proved useful in consolidating differences in metadata and data formatting between instruments that produce FITS files, which is a common pattern in astronomical data handling. Alongside automating interface selection based on these differences, it also comes with helpful operators and methods out-of-the-box, by extendingastropy
'sNDData
class.Scope
Please indicate which category or categories. Check out our package scope page to learn more about our scope. (If you are unsure of which category you fit, we suggest you make a pre-submission inquiry):
Domain Specific
Community Partnerships
If your package is associated with an existing community please check below:
For all submissions, explain how the and why the package falls under the categories you indicated above. In your explanation, please address the following points (briefly, 1-2 sentences for each):
Astronomers & astronomical software developers
There are no specific packages we are aware of. There is some overlap with the
gwcs
package and ourwcs
module, but we are planning to collaborate with that package for future development and to reconsolidate those overlaps. Astrodata offers an interface based onastropy.nddata.NDData
, but allowing more than one instance to be mapped to the same file (e.g., to multiple sets of FITS extensions).@tag
the editor you contacted:N/A
Technical checks
For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:
Publication Options
JOSS Checks
- [ ] The package has an **obvious research application** according to JOSS's definition in their [submission requirements][JossSubmissionRequirements]. Be aware that completing the pyOpenSci review process **does not** guarantee acceptance to JOSS. Be sure to read their submission requirements (linked above) if you are interested in submitting to JOSS. - [ ] The package is not a "minor utility" as defined by JOSS's [submission requirements][JossSubmissionRequirements]: "Minor ‘utility’ packages, including ‘thin’ API clients, are not acceptable." pyOpenSci welcomes these packages under "Data Retrieval", but JOSS has slightly different criteria. - [ ] The package contains a `paper.md` matching [JOSS's requirements][JossPaperRequirements] with a high-level description in the package root or in `inst/`. - [ ] The package is deposited in a long-term repository with the DOI: *Note: JOSS accepts our review as theirs. You will NOT need to go through another full review. JOSS will only review your paper.md file. Be sure to link to this pyOpenSci issue when a JOSS issue is opened for your package. Also be sure to tell the JOSS editor that this is a pyOpenSci reviewed package once you reach this step.*Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?
This option will allow reviewers to open smaller issues that can then be linked to PR's rather than submitting a more dense text based review. It will also allow you to demonstrate addressing the issue via PR links.
Confirm each of the following by checking the box.
Please fill out our survey
P.S. Have feedback/comments about our review process? Leave a comment here
Editor and Review Templates
The editor template can be found here.
The review template can be found here.