openjournals / joss-reviews

Reviews for the Journal of Open Source Software
Creative Commons Zero v1.0 Universal
721 stars 38 forks source link

[REVIEW]: gemlog: Data Conversion for the Open-Source Gem Infrasound Logger #5256

Closed editorialbot closed 1 year ago

editorialbot commented 1 year ago

Submitting author: !--author-handle-->@ajakef<!--end-author-handle-- (Jacob Anderson) Repository: https://github.com/ajakef/gemlog Branch with paper.md (empty if default branch): paper Version: 1.6.5.0 Editor: !--editor-->@bmcfee<!--end-editor-- Reviewers: @thelenwes, @hemmelig Archive: 10.5281/zenodo.8015143

Status

status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/56051d4674913c170373d37f05558dc3"><img src="https://joss.theoj.org/papers/56051d4674913c170373d37f05558dc3/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/56051d4674913c170373d37f05558dc3/status.svg)](https://joss.theoj.org/papers/56051d4674913c170373d37f05558dc3)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@thelenwes & @hemmelig, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review. First of all you need to run this command in a separate comment to create the checklist:

@editorialbot generate my checklist

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @bmcfee know.

Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest

Checklists

📝 Checklist for @thelenwes

📝 Checklist for @hemmelig

editorialbot commented 1 year ago

Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf
editorialbot commented 1 year ago
Software report:

github.com/AlDanial/cloc v 1.88  T=0.20 s (244.7 files/s, 32443.7 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                          19            613           1218           2824
Markdown                         8            176              0            541
reStructuredText                12            152             40            229
YAML                             4             29             30            146
TeX                              1              7              0            140
Cython                           1             19             37             96
DOS Batch                        1              8              1             26
make                             1              4              7              9
CSS                              1              1              4              6
-------------------------------------------------------------------------------
SUM:                            48           1009           1337           4017
-------------------------------------------------------------------------------

gitinspector failed to run statistical information for the repository
editorialbot commented 1 year ago

Wordcount for paper.md is 1023

editorialbot commented 1 year ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

editorialbot commented 1 year ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- None

MISSING DOIs

- 10.1785/0220170067 may be a valid DOI for title: The Gem infrasound logger and custom-built instrumentation
- 10.1029/2021gl093013 may be a valid DOI for title: The first detection of an earthquake from a balloon using its acoustic signature
- 10.1785/0220180038 may be a valid DOI for title: Explosion-generated infrasound recorded on ground and airborne microbarometers at regional distances
- 10.2172/1829264 may be a valid DOI for title: Evaluation of Low Cost Infrasound Sensor Packages.
- 10.1093/gji/ggy069 may be a valid DOI for title: Acoustic event location and background noise characterization on a free flying infrasound sensor network in the stratosphere
- 10.1029/2021ea002036 may be a valid DOI for title: Evidence for Short Temporal Atmospheric Variations Observed by Infrasonic Signals: 1. The Troposphere
- 10.1088/1749-4699/8/1/014003 may be a valid DOI for title: ObsPy: A bridge for seismology into the scientific Python ecosystem
- 10.1175/jtech-d-19-0175.1 may be a valid DOI for title: Multihour stratospheric flights with the heliotrope solar hot-air balloon
- 10.2172/1863279 may be a valid DOI for title: Data Report: TurboWave I and II Data Release.

INVALID DOIs

- None
bmcfee commented 1 year ago

@ajakef while the reviewers get to work, please add the missing DOI's noted above to the paper's bibliography.

ajakef commented 1 year ago

Just added all the DOIs for the references that have DOIs. One technical report and all the conference abstracts appear not to have DOIs.

hemmelig commented 1 year ago

Review checklist for @hemmelig

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

hemmelig commented 1 year ago

Overview

Note: I might edit this/add further comments where I drill down into some of the points raised below.

Hi Jake (@ajakef)—I’ll start by saying the whole Gem project looks great! This sort of joint open-source hardware/software project—brought together with mostly off-the-shelf kit—is really cool to see. I’ve mostly worked with commercial seismic systems (Guralp/Nanometrics analog instruments + dataloggers), but have recently been working on a system in this vein for remote data collection and telemetry. We hope to create a similar collection of hardware/software repositories and your project will definitely be an exemplar reference point for us.

I’ve gone through the JOSS checklist and, bar where indicated below, everything looks in order to me. The manuscript set out the motivations for—and capabilities of—the gemlog package in a clear manner. On a purely stylish note, would it be possible to highlight the package name and command line utility names with inline code formatting in the manuscript? The reference list seems thorough, though my knowledge of the infrasound field is somewhat limited. The Data Report (line 115 of the original proof, Dannemann Dugick and Bowman) should probably read “TurboWave I and II…” as opposed to “TurboWave i and II”. You might indicate somewhere (probably in the documentation, rather than setting it in stone in the JOSS paper) a rough definition of ‘low-cost’, based on the prices of the components used.

General comments Who is TLSatterwhite? They have made some contributions to the code in the repository. Besides that, the contribution and authorship looks good to me.

You might consider releasing the code and assigning it its own DOI using something like Zenodo.

Data are available within the repo for testing purposes.

Per the GitHub Community Standards tab (under Insights), the repository is missing a few items:

The repository readme does include (good + clear) information about contribution guidelines, so you might consider putting these into the corresponding file as well just for the sake of completeness. I don't think you need all of the above, it was just a

You might consider incorporating the documentation (e.g. for installation, for the demos, etc) into the existing readthedocs. You could then add some of the information from the paper regarding the problem the software is designed for/target audience to the landing page of the readthedocs site.

How long does it take to convert, say, a 1 month campaign, per station? Just a ballpark figure would be useful (minutes, hours, days?). I get a rough estimate on the order of 10 minutes per station.

Installation The installation process was smooth and clearly documented. I was able to install gemlog both directly from the Python Package Index (using pip install gemlog) and also downloading the source code and installing from there (pip install . from the source directory). The authors uses a GitHub Action to build and release the package to the Python Package Index.

Tests Again, the authors use a GitHub Actions to automatically test and lint any changes made and pushed to either a branch of the main repo, or a fork. However, in order to run these tests locally on my own machine, I had to install pytest (and also had to clone the source repo). You might consider adding some information to the documentation detailing what someone might need to do to test any changes made locally, without having to push changes to a remote repo.

Tests all ran locally without any issues, though there were a number of warnings that seemed to be about the implementation of the tests themselves.

Demos In general, you might consider converting these markdown files to a series of Jupyter notebooks, which can be hosted and run in the browser using a (free) service like mybinder. This would also make it straightforward to include these notebooks in the readthedocs site. Besides that, they cover the basic applications of the package well. I have some minor, less code-related, comments that you might consider things to consider in the future.

Demo 1 How might a user determine whether a file is bad?

Is there a reason why you can’t specify a station_info.txt file as an input to the gemconvert tool that will populate the SEED headers at this stage? Rather than

summarize_gps indexes from 0 rather than 1—presumably an i rather than i + 1 in the print statement.

Demo 2 Great concept, love to see it. Something we’ve been meaning to do for some time with a number of the QC scripts we have developed over the years for seismic network data validation/huddle tests/pre-archive QC! The explainers at the end of the report are excellent, but I think should ultimately have a dedicated section in the online docs?

Point the reader to the data convert tutorial as a reminder.

061 BATTERY WARNING – this doesn’t seem to be correct? Claiming 2.81 V is within 0.5 V of limit (1.7 V).

Is there a way you suggest validating whether data agree in timing? Visually assessing the waveform similarity is fine, but validating down to 0.01 s is a little more difficult.

Comments on the code Below I’ve listed some general comments that I consider important enough to warrant addressing prior to approving this submission. There is a subsequent section of comments that are more just things to consider as tasks to be tackled in the future, time permitting.

I think it would be worth providing a set of aliases for the command line tools that are all prefixed by gem. This would make things more consistent and avoid any potential namespace clashes (though unlikely). Further to this, you might consider adding something like gemlog as an entry point that lists all of the entry point options?

See previous comment about providing the demos as Jupyter notebooks.

The default library package pathlib is preferable over os.path. I’ve not tested the package on Windows, but I did note a number of instances of hardcoded path delimiters (/) that might fail if someone tries to use the package on Windows. No requirement to support Windows, but could be useful to be aware of this.

Python thoughts—entirely optional The comments below go beyond what I think would be required to accept this in JOSS:

Apply some form of autoformatting, e.g. black (can be installed with pip install black). I see there is linting in the GH Actions, but passing everything through black often helps me organise my source code. This can be configured to, for example, allow 100 character lines (per the README) by adding a pyproject.toml file to the repo, containing:

[tool.black]
line-length = 100

The import section in many of the files is a bit disorganised, often with a number of unnecessary imports. I don’t think there’s a definitive PEP standard for how to format this part of a Python source file, but often something like the following is used:

import standard_lib1
from standard_lib2 import foo

import third_party_lib1
import third_party_lib2

import local_package

Many(/all) of the top-level functions are documented, but many of the underlying/internal functions are not. There seems also to be an inconsistent level of documentation across the package (presumably just as habits changed through time/different individual contributions).

Many bare exceptions – might make things easier in short term, but might create latent issues that are hard to pick up on/track down.

Mixed use of getopt and argparse—I think the latter is slightly nicer to work with/idiomatic.

fstrings make for cleaner print statements, and also implicitly handle conversion of variables to their string representation.

Some function names don’t follow PEP conventions (snake_case)—a quick example is in gem_cat.py with the AppendFile function.

I decided to run through some of the above suggestions for one of the command line tools (gem_cat, as this seemed to only have a few uses). I validated the outputs remain the same using the refactored version using the example data provided in demo 3. I can either attach it as a comment below, or open an Issue—no expectation to merge it (since it’s not been tested against any edge cases), more included to highlight some of the above suggestions.

hemmelig commented 1 year ago

Just to reiterate—super cool project! If you've any additional examples of messy/weird/broken data that you think might be helpful for me to play around with manipulating with the package, let me know.

kthyng commented 1 year ago

Hi just poking my head in here! Looks like @hemmelig has essentially finished their review – looks like a few items left to work out between reviewer and author. And @thelenwes still needs to start their review.

@ajakef Have you been able to address any comments from @hemmelig's review yet?

thelenwes commented 1 year ago

I'm aware. Just trying to get it prioritized.

Wes

Weston Thelen Research Seismologist Cascade Volcano Observatory @.*** (360) 993-8977 orcid: 0000-0003-2534-5577


From: Kristen Thyng @.> Sent: Tuesday, April 18, 2023 6:50 AM To: openjournals/joss-reviews @.> Cc: Thelen, Weston A @.>; Mention @.> Subject: [EXTERNAL] Re: [openjournals/joss-reviews] [REVIEW]: gemlog: Data Conversion for the Open-Source Gem Infrasound Logger (Issue #5256)

This email has been received from outside of DOI - Use caution before clicking on links, opening attachments, or responding.

Hi just poking my head in here! Looks like @hemmelighttps://github.com/hemmelig has essentially finished their review – looks like a few items left to work out between reviewer and author. And @thelenweshttps://github.com/thelenwes still needs to start their review.

@ajakefhttps://github.com/ajakef Have you been able to address any comments from @hemmelighttps://github.com/hemmelig's review yet?

— Reply to this email directly, view it on GitHubhttps://github.com/openjournals/joss-reviews/issues/5256#issuecomment-1513195634, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AAIQZSJDTCH6LKDG3UNRYQLXB2LZ7ANCNFSM6AAAAAAV3BWGBE. You are receiving this because you were mentioned.Message ID: @.***>

ajakef commented 1 year ago

Thanks for the great detailed review, @hemmelig! I'll work on addressing your comments. To answer one easy question, contributor TLSatterwhite is Tamara Beschorner, the third author on this paper.

@kthyng, should I wait for both reviews to finish before pushing changes? Or make changes as soon as I can?

Also, thanks for agreeing to review this, @thelenwes; I know it's a time commitment.

ajakef commented 1 year ago

@hemmelig, thanks for the offer to share your gem_cat work. Could you please submit it as an issue?

kthyng commented 1 year ago

@ajakef I would recommend working on the reviewer comments as soon as you can.

thelenwes commented 1 year ago

Review checklist for @thelenwes

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

thelenwes commented 1 year ago

Overview The gemlog package is something that I've been working with for a few years now so I'm happy to see it progress to this state. I'm coming at this review from a user's perspective since I don't have the programming chops of the other reviewer. Overall the package works as expected and with the help of the checklist and I think the submission is mostly ready to publish. I've just included a few nit picky things below that might be useful below. Nice work!

The "project folder" in installation is ill-defined (especially if they don't read the "Pre-processing Workflow" first)

This warning is thrown with gemconvert:

/opt/miniconda3/envs/gem/lib/python3.10/site-packages/gemlog/core.py:612: FutureWarning: Calling int on a single element Series is deprecated and will raise a TypeError in the future. Use int(ser.iloc[0]) instead config = {key:int(line[key]) for key in list(line.keys())[1:]}

Elevation API: https://developers.google.com/maps/documentation/elevation/overview https://open-elevation.com/

Glad to see that the development on the huddle test has come along. This is a really useful function. When I ran it against an old huddle test, things worked as expected, however the report contained 25 pages of errors (the same two) repeated each minute:

SN ['158'] recorded temperatures greater than 1 on either side of the temperature median 21.8 on 2022-06-08 02:36:00. Total temperature range = 2.4 (alpha)

SN ['105', '158'] recorded temperatures greater than 1 on either side of the temperature median 18.8 on 2022-06-07 07:53:00. Total temperature range = 4.8 (alpha)

This is a little strange because all 6 instruments in the huddle test were within 1.5 degrees the whole time (according to the graph). In any case, it might be worth splitting the error outputs to a separate file from the plots and tables. There is information at the end of the report that is otherwise buried when there is an output with a bunch of errors in it.

There is no obvious statement of need in the documentation, only in the paper.

bmcfee commented 1 year ago

Thanks again @hemmelig and @thelenwes for your reviews!

@ajakef have you had a chance to take a look at the issues raised here? Please let us know when things have been addressed so we can keep this moving.

It appears that both @hemmelig and @thelenwes have left the "statement of need" box unchecked. To clarify, this is only really required in the paper, though it is nice to have in the documentation as well. So if that's not a blocker for you, please check the box to complete the review form.

ajakef commented 1 year ago

First, thank you both @hemmelig and @thelenwes for your helpful reviews. I understand that reviewing someone else's software is a significant effort and I appreciate the time you put into it. Thanks also @bmcfee for overseeing this manuscript.

I've updated the paper branch with edits reflecting the reviewer comments. https://github.com/ajakef/gemlog/blob/paper/paper/reviewer_response_hemmelig.md https://github.com/ajakef/gemlog/blob/paper/paper/reviewer_response_thelenwes.md

Regarding the checklists: I see that @hemmelig's checklist has the documentation "statement of need" box and the "references" box unchecked. I have made the change to the references noted in the paper, and I also added a "statement of need" to the main README.md. @bmcfee, does this look ok, and are there additional things I need to do?

editorialbot commented 1 year ago

I'm sorry human, I don't understand that. You can see what commands I support by typing:

@editorialbot commands

editorialbot commented 1 year ago

I'm sorry human, I don't understand that. You can see what commands I support by typing:

@editorialbot commands

bmcfee commented 1 year ago

Post-Review Checklist for Editor and Authors

Additional Author Tasks After Review is Complete

Editor Tasks Prior to Acceptance

bmcfee commented 1 year ago

@editorialbot generate pdf

bmcfee commented 1 year ago

@editorialbot check references

editorialbot commented 1 year ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1785/0220170067 is OK
- 10.1029/2021gl093013 is OK
- 10.1785/0220180038 is OK
- 10.2172/1829264 is OK
- 10.1007/s00445-022-01607-y is OK
- 10.1093/gji/ggy069 is OK
- 10.17615/nseg-wy10 is OK
- 10.1029/2021ea002036 is OK
- 10.1088/1749-4699/8/1/014003 is OK
- 10.1175/jtech-d-19-0175.1 is OK
- 10.2172/1863279 is OK

MISSING DOIs

- None

INVALID DOIs

- None
editorialbot commented 1 year ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

bmcfee commented 1 year ago

Thanks @ajakef ! (And sorry for the comment spam with the bot.)

Please check the above comment https://github.com/openjournals/joss-reviews/issues/5256#issuecomment-1576823589 for a final checklist of (author) tasks to complete, and I'll get to work on the editor tasks.

bmcfee commented 1 year ago

I read through the paper, and it all looks good. The only comment I have is that on line 68, 10^-4 should be properly typeset using math notation.

ajakef commented 1 year ago

Thanks @bmcfee. I fixed the 10^-4 issue.

Question: parenthetical citations of both of my own papers include my first name. No other citations do that. Is this a normal part of JOSS's format? I'd prefer that it not do that because it looks like I'm highlighting my own work and not others'.

I went through the author checklist. Am I supposed to be able to check those boxes? It mostly doesn't let me click them, so here it is here. Done: Double check authors and affiliations (including ORCIDs)

Done (1.6.5.0): Make a release of the software with the latest changes from the review and post the version number here. This is the version that will be used in the JOSS paper.

Done (10.5281/zenodo.8015143): Archive the release on Zenodo/figshare/etc and post the DOI here.

Done: Make sure that the title and author list (including ORCIDs) in the archive match those in the JOSS paper.

Done: Make sure that the license listed for the archive is the same as the software license.

bmcfee commented 1 year ago

Thanks @ajakef - sorry for the confusion on the checklist, but I can check these off for you as they're completed.

I fixed the 10^-4 issue.

:+1:

Question: parenthetical citations of both of my own papers include my first name. No other citations do that. Is this a normal part of JOSS's format? I'd prefer that it not do that because it looks like I'm highlighting my own work and not others'.

This is strange. I wonder if it's getting confused because your bibtex has both Anderson, Jacob F and Anderson, J.F.: https://github.com/ajakef/gemlog/blob/cdbd83c415aaba2bb7e45a96e258b619c017fcd3/paper/paper.bib#L3

https://github.com/ajakef/gemlog/blob/cdbd83c415aaba2bb7e45a96e258b619c017fcd3/paper/paper.bib#L92

Probably best to standardize them first in the bibtex source and see if that resolves things.

bmcfee commented 1 year ago

@editorialbot set 1.6.5.0 as version

editorialbot commented 1 year ago

Done! version is now 1.6.5.0

editorialbot commented 1 year ago

I'm sorry human, I don't understand that. You can see what commands I support by typing:

@editorialbot commands

bmcfee commented 1 year ago

@editorialbot set 10.5281/zenodo.8015143 as archive

editorialbot commented 1 year ago

Done! archive is now 10.5281/zenodo.8015143

bmcfee commented 1 year ago

@ajakef :wave: had a chance to look into the bib entries yet? I think this is the only thing holding us up at this point. No rush, but let me know when it's ready for a final pass.

ajakef commented 1 year ago

Thanks @bmcfee, your citation fix worked. I also made minor formatting changes and added a couple references that didn't exist at the time of submission; hope that's ok. From my point of view it's ready to proceed.

bmcfee commented 1 year ago

@editorialbot check references

editorialbot commented 1 year ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1785/0220170067 is OK
- 10.1029/2021gl093013 is OK
- 10.1785/0220180038 is OK
- 10.2172/1829264 is OK
- 10.1007/s00445-022-01607-y is OK
- 10.1093/gji/ggy069 is OK
- 10.17615/nseg-wy10 is OK
- 10.1029/2021ea002036 is OK
- 10.1088/1749-4699/8/1/014003 is OK
- 10.1175/jtech-d-19-0175.1 is OK
- 10.2172/1863279 is OK

MISSING DOIs

- 10.22541/essoar.167397475.53275976/v1 may be a valid DOI for title: Whitewater Sound Dependence on Discharge and Wave Configuration at an Adjustable Wave Feature

INVALID DOIs

- None
bmcfee commented 1 year ago

Thanks @ajakef - please update the cite with the missing DOI noted above, and I'll move on to the final pass.

ajakef commented 1 year ago

@bmcfee Just checking that I should use that DOI, even though it's just ESSOAR (i.e., a preliminary, non-peer-reviewed manuscript)? The article being cited is currently in review and there's no DOI from the journal yet.

bmcfee commented 1 year ago

@ajakef Yes - I think in that case, it's arguably more important to cite the DOI of the version that exists at the time of this submission, just to prevent confusion later on. (I expect the contents of the cited work won't change substantially upon review later on, but that's somewhat beside the point.)

ajakef commented 1 year ago

@bmcfee Done!

bmcfee commented 1 year ago

@editorialbot generate pdf

editorialbot commented 1 year ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

bmcfee commented 1 year ago

@editorialbot check references

editorialbot commented 1 year ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1785/0220170067 is OK
- 10.1029/2021gl093013 is OK
- 10.1785/0220180038 is OK
- 10.2172/1829264 is OK
- 10.1007/s00445-022-01607-y is OK
- 10.1093/gji/ggy069 is OK
- 10.17615/nseg-wy10 is OK
- 10.1029/2021ea002036 is OK
- 10.22541/essoar.167397475.53275976/v1 is OK
- 10.1088/1749-4699/8/1/014003 is OK
- 10.1175/jtech-d-19-0175.1 is OK
- 10.2172/1863279 is OK

MISSING DOIs

- None

INVALID DOIs

- None
bmcfee commented 1 year ago

Thanks @ajakef , looks great to me!

bmcfee commented 1 year ago

@editorialbot recommend-accept

editorialbot commented 1 year ago
Attempting dry run of processing paper acceptance...