openjournals / joss-reviews

Reviews for the Journal of Open Source Software
Creative Commons Zero v1.0 Universal
720 stars 38 forks source link

[REVIEW]: Spelunker: A quick-look Python pipeline for JWST NIRISS FGS Guidestar Data #6202

Closed editorialbot closed 5 months ago

editorialbot commented 9 months ago

Submitting author: !--author-handle-->@GalagaBits<!--end-author-handle-- (Derod Deal) Repository: https://github.com/GalagaBits/JWST-FGS-Spelunker Branch with paper.md (empty if default branch): Version: v.1.1.17 Editor: !--editor-->@ivastar<!--end-editor-- Reviewers: @evamariaa, @taylorbell57 Archive: 10.5281/zenodo.11115733

Status

status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/1547289b35b16a5d2955b20ed3945a2e"><img src="https://joss.theoj.org/papers/1547289b35b16a5d2955b20ed3945a2e/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/1547289b35b16a5d2955b20ed3945a2e/status.svg)](https://joss.theoj.org/papers/1547289b35b16a5d2955b20ed3945a2e)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@evamariaa & @taylorbell57, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review. First of all you need to run this command in a separate comment to create the checklist:

@editorialbot generate my checklist

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @ivastar know.

Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest

Checklists

📝 Checklist for @evamariaa

📝 Checklist for @taylorbell57

editorialbot commented 9 months ago

Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf
editorialbot commented 9 months ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1088/1538-3873/acd1b5 is OK
- 10.1117/12.926578 is OK
- 10.1117/12.2311973 is OK

MISSING DOIs

- None

INVALID DOIs

- None
editorialbot commented 9 months ago
Software report:

github.com/AlDanial/cloc v 1.88  T=0.07 s (531.3 files/s, 152258.3 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                          12            682            523           1174
reStructuredText                14            983           2233            440
Jupyter Notebook                 4              0           4127            378
Markdown                         4             71              0            173
TeX                              1              3              0             42
YAML                             2              6             24             28
Bourne Shell                     1              0              0              3
-------------------------------------------------------------------------------
SUM:                            38           1745           6907           2238
-------------------------------------------------------------------------------

gitinspector failed to run statistical information for the repository
editorialbot commented 9 months ago

Wordcount for paper.md is 538

editorialbot commented 9 months ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

evamariaa commented 9 months ago

Review checklist for @evamariaa

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

taylorbell57 commented 9 months ago

Review checklist for @taylorbell57

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

ivastar commented 9 months ago

@evamariaa & @taylorbell57 welcome to the review thread! I see you have already generated the checklists. We are looking for review in 3-4 week. Please reach out if you have any questions.

ivastar commented 9 months ago

@evamariaa and @taylorbell57, reminder about the review. It would be fantastic if you can get the initial reviews submitted by ~February 5th, 2024 so we can move the submission along.

taylorbell57 commented 8 months ago

@ivastar, I just wanted to make it clear that I have an insigificant "potential conflict of interest" with the second-author Néstor Espinoza. The list of papers where Néstor and I are coauthors are: 1, 2, 3, 4, 5, 6. For all of papers 1-4 and 6, I was a minor coauthor and didn't actually collaborate with Néstor in any direct way. For paper 5, I contributed one of the analyses to the paper but still did not actually collaborate with Néstor in any direct way.

I do not believe that this will in any way interefere with my ability to impartially review the code or paper, I just wanted to make this clear in advance in compliance with the Conflict of Interest Policy.

taylorbell57 commented 8 months ago

@GalagaBits and @nespinoza, can you clarify what version number we are supposed to be reviewing? On GitHub the lastest release/tag that I see is v.1.0.2, but pip installs v1.1.0 from PyPI (which also has v1.0.3 and v1.0.4 which are missing from GitHub).

taylorbell57 commented 8 months ago

Also, note that the images are missing from the PyPI posting at https://pypi.org/project/spelunker/#description

evamariaa commented 8 months ago

@ivastar, I just wanted to make it clear that I have an insigificant "potential conflict of interest" with the second-author Néstor Espinoza. The list of papers where Néstor and I are coauthors are: 1, 2, 3, 4, 5, 6. For all of papers 1-4 and 6, I was a minor coauthor and didn't actually collaborate with Nestor in any direct way. For paper 5, I contributed one of the analyses to the paper but still did not actually collaborate with Néstor in any direct way.

I do not believe that this will in any way interefere with my ability to impartially review the code or paper, I just wanted to make this clear in advance in compliance with the Conflict of Interest Policy.

Similarly to Taylor I quoted above, I also have a potential but very minor conflict of interest and I alerted the editor @ivastar about this in person who confirmed that it is okay to be a reviewer.

evamariaa commented 8 months ago

Minor comment about the documentation @GalagaBits: It'd be useful to have the link to the documentation on the right 'About' section in the GitHub repository. It took me a minute or two to find the link all the way at the bottom of the page.

taylorbell57 commented 8 months ago

To address the review point about describing the "State of the field", I think it would be best to add a sentence or two describing and citing other relevant packages, and in particular astroquery which is one of the closest tools to yours and is one of your key dependencies. This added text will help explain to the average reader what already exists and what new features you're enabling/facilitating.

ivastar commented 8 months ago

@ivastar, I just wanted to make it clear that I have an insigificant "potential conflict of interest" with the second-author Néstor Espinoza. The list of papers where Néstor and I are coauthors are: 1, 2, 3, 4, 5, 6. For all of papers 1-4 and 6, I was a minor coauthor and didn't actually collaborate with Nestor in any direct way. For paper 5, I contributed one of the analyses to the paper but still did not actually collaborate with Néstor in any direct way. I do not believe that this will in any way interefere with my ability to impartially review the code or paper, I just wanted to make this clear in advance in compliance with the Conflict of Interest Policy.

Similarly to Taylor I quoted above, I also have a potential but very minor conflict of interest and I alerted the editor @ivastar about this in person who confirmed that it is okay to be a reviewer.

Thanks for the disclosuse @taylorbell57 and @evamariaa! I think it is close to impossible to find two people who have not been on a paper together in this field. I am ok with waiving the CoI requirements in this case. But if the authors have concerns, please do reach out to me in private.

taylorbell57 commented 8 months ago

Alright, I've now finished reading through the paper, docs, and code and have opened several issues (linked above) in addition to the few small comments I made above which will hopefully make them more user friendly, robust, and easy to maintain. That wraps up my first round referee report.

Overall, great work on this @GalagaBits and @nespinoza - very excited to experiment with this in my own research!

evamariaa commented 8 months ago

I think I can also finish up my first review here, the majority of things on my list are covered in the issues Taylor's opened as well and if not I linked the issues I opened above. I haven't had the chance yet to reproduce all example notebooks and go through every bit of documentation in great detail but I see that Taylor has done an amazing job in doing that so I'm happy for this to move forward! It's a great package @GalagaBits and @nespinoza - well done!

ivastar commented 8 months ago

Thank you @evamariaa and @taylorbell57 for the speedy review! @GalagaBits and @nespinoza the ball is back in your court. Please review the comments and respond to them in the respective issues. Once completed, please let us know here that we should proceed.

GalagaBits commented 8 months ago

My apologies for responding late. Thank you for the feedback. I am currently reviewing them and will let you know I when I am finished.

ivastar commented 7 months ago

@GalagaBits pinging this thread. It has been over a month. Do you have a timeline for addressing the comments?

GalagaBits commented 7 months ago

@ivastar I have been addressing the comments as issues on https://github.com/GalagaBits/JWST-FGS-Spelunker/issues. I plan to complete the rest of the comments by the end of the month.

GalagaBits commented 6 months ago

@ivastar I'm working on the last 4 issues on the paper. I am graduating this semester and had to work on my thesis and my coursework, so its taking me awhile to complete the comments. Is it okay if you can set a deadline for me within the next week to complete the comments?

GalagaBits commented 6 months ago

@ivastar, @evamariaa, @taylorbell57 Okay, I have finished replying to all issues and comments!

GalagaBits commented 6 months ago

@taylorbell57 The current version is v1.1.11. This should now be reflected in GitHub and PyPI

taylorbell57 commented 6 months ago

@editorialbot generate pdf

editorialbot commented 6 months ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

taylorbell57 commented 6 months ago

@GalagaBits, how easy would it be to add a progress bar to the gauss2d_fit function? I'm testing the fgs-spelunker-and-tsos.ipynb notebook out right now, and I can't tell if the function call has crashed or it is just taking a long time since there is no progress bar

GalagaBits commented 6 months ago

@taylorbell57, I'll have to research on how to implement a progress bar then test it, so that will take a while since I am currently in finals season.gauss2d_fit is likely taking a while due to the large amount of frames.

taylorbell57 commented 6 months ago

So it seemed my first attempt had indeed crashed (took >45 minutes), but in trying to understand what happened I found a way to get a progress bar; I'll submit a pull request to enable that useful feature. The main issue though is that you're currently using ray to parallelize each individual fit rather than running multiple fits in parallel. Each individual fit takes far less than a second, so this method of parallelization offers little-to-no gains (in fact, using a tool like htop I can see that regardless of what I set ncpus to, the function only really uses ~1 thread since it finishes too fast to use any more threads). Now that I can tell that the function is actually working (with my temporarily hacked-in progress bar), neither of these issues are something that should hold back the publication of the paper. I'll just open issues and/or pull requests for them on your GitHub.

I'll continue reviewing/testing the other components of your package, paper, and documentation - looking good so far!

GalagaBits commented 6 months ago

Thank you for that! Also, I looked at Activity Monitor when ever Spelunker runs Gaussian fits, the usage does not seem very high. Using more cores with ray somewhat speed up the process a bit from my experience, but I feel that I can implement ray better. Splitting the data into chunks depending on the available cores/threads and running each chunk with ray might be more efficient and much faster. Do you have any suggestions?

And feel free to tell me about any questions or comments!

GalagaBits commented 6 months ago

@taylorbell57 I have updated the paper to correct some errors

nespinoza commented 6 months ago

@editorialbot generate pdf

editorialbot commented 6 months ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

taylorbell57 commented 6 months ago

@GalagaBits soon I'll open an issue on the Spelunker repo to discuss this in detail (since it's not really relevant to this review anymore) and provide tips on how you could (likely) greatly speed this function up, but for reference the loop at line spelunker.py#L1260 took me 8m4s with the full 28 threads available on my CPU (a desktop Intel i7-14700K) and 7m51s with only 1 thread, while the loop at spelunker.py#L1272 took 18m46s with 28 threads and 19m04s with only 1 thread

GalagaBits commented 6 months ago

Okay, that sounds good!

taylorbell57 commented 6 months ago

Alright, things are looking much improved. Outside of the newly opened GitHub issues linked above this message, the only things that I think still need to be addressed are:

After that's all taken care of, then I think you should be good to go!

GalagaBits commented 6 months ago

@taylorbell57 Okay! I have taken care of the remaining tasks for this paper.

nespinoza commented 6 months ago

@editorialbot generate pdf

editorialbot commented 6 months ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

taylorbell57 commented 6 months ago

The paper looks good to me now, and I just pointed out two minor issues with the pixel_centroid_mnemonics.ipynb notebook in Issue https://github.com/GalagaBits/JWST-FGS-Spelunker/issues/41

Once those very minor comments are addressed, I'm happy to recommend the paper for publication @ivastar

GalagaBits commented 6 months ago

@taylorbell57 Okay, I have fixed those remaining issues!

GalagaBits commented 6 months ago

Do I have to wait for @ivastar or @evamariaa now? What are the next steps?

ivastar commented 6 months ago

Thank you all for the great discussion here, a lot of improvements to all parts of this submission!

We still need @evamariaa to sign off on the submission and then we can proceed to the publication process. @evamariaa can you please review the changes and let us know if you have any follow up comments?

evamariaa commented 5 months ago

I've gone through the issues now and the remaining things on my check list, as well as re-installed the new version and re-ran notebooks and test. I'm very happy with the improvements! I would note though that it'd be nice if you could update your example notebooks in the notebooks folder such that they work with your most recent version as this is what everyone will use. For example, when I had version 1.1.16 installed, then the function spk2.flux_spatial_timelapse_animation gave me an error and then I realised you were using a different version which you do state in the text at the top of the notebook so totally my bad. So I don't think this is necessary for this review, but it'll safe a lot of issues/time if you keep those notebooks up-to-date with your current version.

Either way, I'm very happy to recommend the paper for publication. Congrats on this great work!

ivastar commented 5 months ago

@GalagaBits do you think you can update the notebook as per the suggestion made by @evamariaa? I think this falls under the "Example usage" part of our review. It is important for the examples to reflect the published API.

ivastar commented 5 months ago

@editorialbot generate pdf

ivastar commented 5 months ago

@editorialbot check references

editorialbot commented 5 months ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1088/1538-3873/acd1b5 is OK
- 10.1117/12.926578 is OK
- 10.1117/12.2311973 is OK
- 10.1086/160554 is OK
- 10.1007/BF00648343 is OK

MISSING DOIs

- None

INVALID DOIs

- None
editorialbot commented 5 months ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left: