shihyuntang / igrins_rv

Welcome to IGRINS RV, a open source pipeline for extracting radial velocity (RV) for the Immersion GRating INfrared Spectrometer (IGRINS) instrument. Please contact asa.stahl@rice.edu or sytang@lowell.edu for more detail.
https://igrins-rv.readthedocs.io
MIT License
6 stars 1 forks source link

Miscellaneous notes from JOSS review #2

Closed gully closed 3 years ago

gully commented 3 years ago

Hi there :wave: I'll be reviewing this package for JOSS, the key ideas of the review can be found at https://github.com/openjournals/joss-reviews/issues/3095. The JOSS guidelines recommend keeping the details in Issues within this target repository, so I'll update this particular issue with realtime comments, and if the need arises will make standalone Issues as the review progresses.

Preliminaries

Repository size

The repository weighs in at 422 MB, with 209 MB in Engine, 50 MB in Example, and a 154 MB .git. Within the Engine directory, the syn_template is 177 MB. The relatively large size stems in part from the inclusion of the example data and synthetic templates. Have you considered the tradeoffs of storing the data and models separately and including a script to fetch them programmatically?

Inclusion of raw TelFit code

The TelFit source code is included in Telluric-Fitter-master, with this note in the Wiki:

Note that the original Telfit package is available on https://github.com/kgullikson88/Telluric-Fitter, but the files that come with IGRINS RV are modified, such that one line in ./src/TelluricFitter.py is different:

Is that one line a bug in TelFit? If so, have you considered filing a patch?

Citation of dependencies

The reference for the dependencies like astropy, numpy, and others may have to be included in the paper, see the JOSS guidelines:

A list of key references, including to other software addressing related needs.

gully commented 3 years ago

Brittle TelFit Installation

Installation of the telfit code hinges on having the right fortran and gcc versions installed, and in some cases requires direct modification of the source code to get TelFit to compile, in turn making it difficult to install igrins_rv. What upstream changes to TelFit could streamline the installation of igrins_rv?

gully commented 3 years ago

Running the code

Quiet mode

Running the code prints out a lot of TelFit info to stdout. It may be useful to enable a verbose mode or equivalently a quiet mode that replaces this text with a progress bar to help the user gauge how much time remains. The text can be redirected to a console log if users want to inspect the instantaneous output.

shihyuntang commented 3 years ago

Running the code

Quiet mode

Running the code prints out a lot of TelFit info to stdout. It may be useful to enable a verbose mode or equivalently a quiet mode that replaces this text with a progress bar to help the user gauge how much time remains. The text can be redirected to a console log if users want to inspect the instantaneous output.

Hi @gully , Yes! nice suggestion, and we was thinking of turn off those printouts, but... unable to find a way to do so. Any suggestion on how to turn off the printout or redirects to a log file will be appreciated!!

Thank you!

gully commented 3 years ago

Overview and Workflow

The Overview and Workflow documentation describes the order and conditions under which the various steps can be run. The conditional, iterative, and/or in-series nature of each step sequence takes some mental mapping when a newcomer is reading for the first time. A diagram summarizing the key inputs and outputs of each step could make it easier to understand the workflow at a glance.

gully commented 3 years ago

FAQ

I recommend completing the FAQ since I have a few of the same questions.

gully commented 3 years ago

Documentation

Most of the documentation resides in the wiki describing the core commandline interface (CLI) to run igrins_rv in three or four steps. The documentation in the wiki pages is adequate to reproduce the example. Most functions and classes in Engine lack docstrings. These Engine functions are not intended to be directly called by the user.

gully commented 3 years ago

Testing

The main vehicle for testing appears to be the functional testing provided through running the scripts on example input data. Running the code on this example data successfully produces RV values and uncertainties with scatter comparable to the 25 m/s claimed performance. The individual functions within Engine do not have unit testing.

gully commented 3 years ago

Community

The JOSS review guidelines seek some steps to supporting a community :

There should be clear guidelines for third-parties wishing to:

  • Contribute to the software
  • Report issues or problems with the software
  • Seek support

The README file currently discourages public contributions:

This package is NOT yet ready for public use

I recommend the inclusion of the JOSS community guidelines to the README, replacing the public use warning. The code appears mature enough that outside users could reasonably benefit from and contribute back to the application. Contributors should be encouraged at the present stage, since they essentially act as beta testers. Some further steps could be taken to lower the barrier to entry for new contributors (see the next topic).

Roadmap, Open Issues, Feature Ideas

A project roadmap would go a long way to facilitate community involvement. It allows users to see where the project is going, its known issues, and whether features-of-interest are on the horizon. As part of this review, I recommend a project roadmap, either as a wiki page or---as can be common practice---a standalone GitHub Issue. GitHub now allows selected issues to be "pinned" so the prominence of a roadmap can be denoted in this way, if preferred.
I also recommend the chronicling of known Issues and feature requests in standalone GitHub Issues. GitHub Issues have a community building effect: the presence of a few issues and workaround discussions invites users to submit their own bug reports, and gives the maintainers a sense for how the software is being used in practice. To catalyze thinking on these new issues and roadmap, here is a brainstorm of some "evergreen" topics that could have a standalone Issue, and if applicable could be discussed in a Roadmap.

  1. Runtime Performance: What is the current runtime performance (in minutes per spectrum or whatever preferred unit), and what plans are there to decrease the runtime? Are the runtime bottlenecks known, if so what are they, and if not is there a plan to investigate them?
  2. Precision Performance: The code claims a precision of 25 m/s, what sets that limit and is there any plan to increase precision further? The precision limits may be the primary concern for some PRV practitioners and having a forum for interested parties to contribute questions or ideas could be synergistic, even if you believe you have already hit the systematic noise floor of the data themselves.
  3. Modulatity/API-ification: Currently the code can only be run as a command-line interface (CLI). What are the prospects of making the code more modular so that third-party packages can make programmatic calls outside of the shell?
shihyuntang commented 3 years ago

Hi @gully , Here are our reply to your comments above:

Repository size --> With removing the Telfit pkg and some unused data under syn_template, the repository is now weighs at 358 MB. We would like to keep the example data and synthetic templates in one place.

Inclusion of raw TelFit code --> TelFit source code removed from igrins rv repo.

Citation of dependencies --> A new paragraph has been added under the Statement of need:

IGRINS RV makes use of the astropy (Astropy Collaboration et al., 2018, 2013) on handing sky coordinates and barycentric velocity correction, scipy (Virtanen et al., 2020) and numpy (Harris et al., 2020) on mathmatical calculation, nlopt (Box, 1965; Johnson, 2008) on the optimization process, pandas (McKinney, 2010; team, 2020)on data management, and matplotlib (Hunter, 2007) on plotting. We also borrowed a part of code from BMC (Duarte & Watanabe, 2021) for peak detection. IGRINS RV requires that the igrins plp v2.2.0 (Lee et al., 2017) and Telfit (Gullikson et al., 2014) packages be pre-installed. Detailed documentation and tutorials can be found on the GitHub wiki page.

Brittle TelFit Installation --> Make Telfit able to be installed via pip should solve this issue. After that, we will only need to add Telfit in the .yml file for the installation.

Quiet mode --> done

Overview and Workflow --> Thank you for this suggestion. A Workflow Diagram for IGRINS RV has been added under Overview and Workflow wiki page.

FAQ --> done.

Documentation --> Docstrings for functions under Engine are all been added.

Testing --> Currently, the main method of testing the code is by running the test data provided. We will consider to add function units test function in the future if needed.

Community --> Ways to Contribute to the software, Report issues or problems with the software, and Seek support have been added in the README.md as

IGRINS RV is now under intense development. If you have any question, idea or wish to report any bug, please let us know by either open an issue or contact us (asa.stahl@rice.edu or sytang@lowell.edu). More on how to contribute can be found in the FAQ page. On FAQ you can now find How do I report bugs? and How do I help make IGRINS RV better?.

As for the

This package is NOT yet ready for public use Thank you for bringing this up. Yes, we should remove this by now because the code is pretty much stable.

*Roadmap, Open Issues, Feature Ideas --> I really appreciate this suggestion. For IGRINS RV being my first big project on Github, I have never taught of these before. An issue page https://github.com/shihyuntang/igrins_rv/#7 has been opened for ideas of future improvement.

Precision Performance --> We have added this in the FAQ to explain the limitation on the RV precision:

IGRINS RV uses telluric (atmospheric) absorption lines as a common-path wavelength calibrator, so its precision is limited by the internal RV stability of Earth's atmosphere. This has been estimated to be ~10-20 m/s, so it is unlikely IGRINS RV will achieve better than the current ~25m/s precision. If you are obtaining worse precisions than this for your science targets and want to try to push toward better precisions, improving the accuracy of your stellar templates or increasing the signal quality of your data could help (note the precision achievable by IGRINS RV also depends on the vsin(i) of the target in question).

Thank you again for all these great suggestion. Now the master branch is updated with the latest version of the code.

shihyuntang commented 3 years ago

Dear @gully ,

Is there anything else that you wish us to improve to meet the requirement for the unchecked box in the JOSS REVIEW issue page?

Thank you.