openjournals / jose-reviews

Reviews for the Journal of Open Source Education (JOSE)
http://jose.theoj.org
Creative Commons Zero v1.0 Universal
33 stars 4 forks source link

[REVIEW]: An open learning resource on Reproducible Data Science with Open-Source Python Tools and Real-World Data #156

Closed whedon closed 1 year ago

whedon commented 2 years ago

Submitting author: @valdanchev (Valentin Danchev) Repository: https://github.com/valdanchev/reproducible-data-science-python Version: v2.1.2 Editor: @ShanEllis Reviewer: @TomDonoghue, @lechten Archive: 10.5281/zenodo.6895578

:warning: JOSE reduced service mode :warning:

Due to the challenges of the COVID-19 pandemic, JOSE is currently operating in a "reduced service mode".

Status

status

Status badge code:

HTML: <a href="https://jose.theoj.org/papers/3e1de7c74161a5b2c4ce74e536ef6898"><img src="https://jose.theoj.org/papers/3e1de7c74161a5b2c4ce74e536ef6898/status.svg"></a>
Markdown: [![status](https://jose.theoj.org/papers/3e1de7c74161a5b2c4ce74e536ef6898/status.svg)](https://jose.theoj.org/papers/3e1de7c74161a5b2c4ce74e536ef6898)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@TomDonoghue & @lechten, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/jose-reviews/invitations

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @ShanEllis know.

Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest

Review checklist for @TomDonoghue

Conflict of interest

Code of Conduct

General checks

Documentation

Pedagogy / Instructional design (Work-in-progress: reviewers, please comment!)

JOSE paper

Review checklist for @lechten

Conflict of interest

Code of Conduct

General checks

Documentation

Pedagogy / Instructional design (Work-in-progress: reviewers, please comment!)

JOSE paper

whedon commented 2 years ago

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @TomDonoghue it looks like you're currently assigned to review this paper :tada:.

:warning: JOSS reduced service mode :warning:

Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.

:star: Important :star:

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/jose-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

  1. Set yourself as 'Not watching' https://github.com/openjournals/jose-reviews:

watching

  1. You may also like to change your default settings for this watching repositories in your GitHub profile here: https://github.com/settings/notifications

notifications

For a list of things I can do to help you, just type:

@whedon commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@whedon generate pdf
whedon commented 2 years ago

Wordcount for paper.md is 1603

whedon commented 2 years ago
Software report (experimental):

github.com/AlDanial/cloc v 1.88  T=1.30 s (88.3 files/s, 108947.1 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
HTML                            22           1735            128          37365
SVG                              6              0             13           9323
JavaScript                      16           2342           2391           8757
Jupyter Notebook                32              0          71448           2037
Python                          13           1433            546           1963
CSS                             14            237             96           1321
Markdown                         7             82              0            251
TeX                              2             23              0            238
YAML                             2             16             16             52
JSON                             1              0              0              1
-------------------------------------------------------------------------------
SUM:                           115           5868          74638          61308
-------------------------------------------------------------------------------

Statistical information for the repository '75c25ff7f7e841b795a1a850' was
gathered on 2022/01/12.
The following historical commit information, by author, was found:

Author                     Commits    Insertions      Deletions    % of changes
valdanchev                       5         17466             34          100.00

Below are the number of rows from each author that have survived and are still
intact in the current revision:

Author                     Rows      Stability          Age       % in comments
Valentin Danchev          17432          100.0          0.0               16.85
whedon commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

whedon commented 2 years ago
Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1162/99608f92.cb0fa8d2 is OK
- 10.17226/25104 is OK
- 10.1145/3408877.3432586 is OK
- 10.1371/journal.pcbi.1008770 is OK
- 10.1080/09332480.2019.1579578 is OK
- 10.21105/joss.03021 is OK
- 10.18148/srm/2020.v14i2.7746 is OK
- 10.1371/journal.pcbi.1007007 is OK

MISSING DOIs

- 10.1038/d41586-018-07196-1 may be a valid DOI for title: Why Jupyter is data scientists’ computational notebook of choice
- 10.25080/majora-92bf1922-00a may be a valid DOI for title: Data Structures for Statistical Computing in Python
- 10.25080/majora-92bf1922-011 may be a valid DOI for title: statsmodels: Econometric and statistical modeling with python

INVALID DOIs

- None
TomDonoghue commented 2 years ago

I've worked through my review, including exploring through the content, and checking I can run some examples through Binder etc. As a note I did skim the notebooks, and read sections, but I did do a full read of the content. I've checked off most things above, with the remainer being left open for small potential updates detailed here.

I have some practical suggestions, that I have described in detail on the resource repository, including:

Other notes:

In terms of the literal content - I don't have as much to say. Based on skimming it all, I think all the content (ideas and concepts) look good to me. It's a bit tricky for me to know exactly how an unfamiliar student would feel about the progression through things. My sense is perhaps that there is a lot introduced quite quickly, and one might need a bit of a sense of basic coding to really be able to jump in - but under the idea of it as a "crash course", I think it broadly works, and is a useful resource to have available.

My one more practical content note is that I think the end_to_end_data_science_project, which in particular feels like jumping in the deep end, could be more explicit that it skims through topics as an introduction (it effectively overviews the course), and that these topics are dealt with in more detail in future notebooks, and that it might make sense for students "jump ahead" to practice these topics there first (at first I even thought maybe this notebook was supposed to come at the end?).

Overall, I think this looks like an interesting resource, with it's strengths being that it:

Conclusion: I think if the relatively minor notes I've mentioned can be fixed up a bit, my overall impression is that this is a useful resource, responsive to JOSE's aims and requirements, and after the review edits I'd be happy to sign off on it as a reviewer.

whedon commented 2 years ago

:wave: @TomDonoghue, please update us on how your review is going (this is an automated reminder).

valdanchev commented 2 years ago

Thank you so much for these very helpful comments @TomDonoghue. Will go through all of your notes and issues and will write back when I have a revised version that addresses them.

lechten commented 2 years ago

I agree with all points raised by @TomDonoghue above. In summary, this is a valuable and ambitious course/resource.

I’d like to emphasize that I also believe the first notebook to need more descriptions: How might this be presented to students? What is expected of them? What should they be able to do?

For other comments, I opened issues in the source repository.

valdanchev commented 2 years ago

Thanks so much, @lechten for your very helpful comments and suggestions for improvements. The pointer to nbqa is extremely useful. My teaching-heavy term will be over in a few weeks, and I am looking forward to then revising the resource based on your and @TomDonoghue comments.

ShanEllis commented 2 years ago

Thanks to both reviewers!

@valdanchev Just checking in on your current expected timeline? (I totally understand the chaos that is the end of teaching-heavy terms...so not rushing just wanted to get a possible timeline established.)

valdanchev commented 2 years ago

Totally understand @ShanEllis, I'll plan to finalise all the revisions by the end of April if that sounds good.

valdanchev commented 2 years ago

@ShanEllis thank you for organising the review process, @TomDonoghue and @lechten thank you for your very helpful and constructive feedback and suggestions. The review process was very valuable for improving the resource, repository, and paper. Sorry for my delayed revisions.

I have now made the revisions and addressed all the feedback and comments above as well as the issues #5, #6, #7, #8, #9, #10. Specifically,

I've shortened the title of the resource to "Reproducible Data Science with Python" and would like to also shorten the title of the paper to "Reproducible Data Science with Python: An open learning resource".

Thank you again for the very constructive and transparent review process and feedback.

lechten commented 2 years ago

@whedon generate pdf

whedon commented 2 years ago

PDF failed to compile for issue #156 with the following error:

 pandoc: paper.bib: openBinaryFile: does not exist (No such file or directory)
Looks like we failed to compile the PDF
lechten commented 2 years ago

@whedon commands

whedon commented 2 years ago

Here are some things you can ask me to do:

# List Whedon's capabilities
@whedon commands

# List of editor GitHub usernames
@whedon list editors

# List of reviewers together with programming language preferences and domain expertise
@whedon list reviewers

EDITORIAL TASKS

# Compile the paper
@whedon generate pdf

# Compile the paper from alternative branch
@whedon generate pdf from branch custom-branch-name

# Ask Whedon to check the references for missing DOIs
@whedon check references

# Ask Whedon to check repository statistics for the submitted software
@whedon check repository
lechten commented 2 years ago

@whedon generate pdf

whedon commented 2 years ago

PDF failed to compile for issue #156 with the following error:

 /app/vendor/ruby-2.6.6/lib/ruby/2.6.0/psych.rb:456:in `parse': (da632d62dda40da88c8646e4/paper/paper.md): mapping values are not allowed in this context at line 2 column 45 (Psych::SyntaxError)
    from /app/vendor/ruby-2.6.6/lib/ruby/2.6.0/psych.rb:456:in `parse_stream'
    from /app/vendor/ruby-2.6.6/lib/ruby/2.6.0/psych.rb:390:in `parse'
    from /app/vendor/ruby-2.6.6/lib/ruby/2.6.0/psych.rb:277:in `load'
    from /app/vendor/ruby-2.6.6/lib/ruby/2.6.0/psych.rb:578:in `block in load_file'
    from /app/vendor/ruby-2.6.6/lib/ruby/2.6.0/psych.rb:577:in `open'
    from /app/vendor/ruby-2.6.6/lib/ruby/2.6.0/psych.rb:577:in `load_file'
    from /app/vendor/bundle/ruby/2.6.0/bundler/gems/whedon-c5c16aedb3d6/lib/whedon.rb:127:in `load_yaml'
    from /app/vendor/bundle/ruby/2.6.0/bundler/gems/whedon-c5c16aedb3d6/lib/whedon.rb:87:in `initialize'
    from /app/vendor/bundle/ruby/2.6.0/bundler/gems/whedon-c5c16aedb3d6/lib/whedon/processor.rb:38:in `new'
    from /app/vendor/bundle/ruby/2.6.0/bundler/gems/whedon-c5c16aedb3d6/lib/whedon/processor.rb:38:in `set_paper'
    from /app/vendor/bundle/ruby/2.6.0/bundler/gems/whedon-c5c16aedb3d6/bin/whedon:66:in `prepare'
    from /app/vendor/bundle/ruby/2.6.0/gems/thor-0.20.3/lib/thor/command.rb:27:in `run'
    from /app/vendor/bundle/ruby/2.6.0/gems/thor-0.20.3/lib/thor/invocation.rb:126:in `invoke_command'
    from /app/vendor/bundle/ruby/2.6.0/gems/thor-0.20.3/lib/thor.rb:387:in `dispatch'
    from /app/vendor/bundle/ruby/2.6.0/gems/thor-0.20.3/lib/thor/base.rb:466:in `start'
    from /app/vendor/bundle/ruby/2.6.0/bundler/gems/whedon-c5c16aedb3d6/bin/whedon:131:in `<top (required)>'
    from /app/vendor/bundle/ruby/2.6.0/bin/whedon:23:in `load'
    from /app/vendor/bundle/ruby/2.6.0/bin/whedon:23:in `<main>'
valdanchev commented 2 years ago

@whedon generate pdf

lechten commented 2 years ago

@whedon generate pdf

lechten commented 2 years ago

@ShanEllis Whedon does not seem to work any more. Are you able to generate a new PDF?

TomDonoghue commented 2 years ago

I have checked through the updates to the resource and everything looks good to me! I have checked of my remaining review items, and as far as I'm concerned, this is good to go!

Quick note: as things get wrapped up, I think the version number that is listed in this issue (1.0.0) needs to be updated to reflect the version number listed in the most recent release (I think there is a whedon command to do this).

valdanchev commented 2 years ago

@whedon set v2.0.0 as version

whedon commented 2 years ago

I'm sorry @valdanchev, I'm afraid I can't do that. That's something only editors are allowed to do.

valdanchev commented 2 years ago

Many thanks, @TomDonoghue. On the version number, it seems that only editors can do the update. @ShanEllis will know best how to update the version number.

lechten commented 2 years ago

@whedon generate pdf

ShanEllis commented 2 years ago

Sorry y'all! Recently had a baby and am just super behind on things. Just wanted to reply here to say that I'm looking into what's going on re: PDF generation (after which I can update the version).

Will update/tag others as there's progress!

ShanEllis commented 2 years ago

@valdanchev I'm not 100% sure, but I believe the PDF won't compile due to YAML formatting issues at the top of paper.md. Specifically, for each bullet point entry, it looks to me like a space is missing before each dash (-).

Sharing a link to a random accepted paper whose YAML formatting looks right to me in case helpful: https://raw.githubusercontent.com/americocunhajr/EPIDEMIC/master/paper/paper.md

Mind making these edits and trying to recompile PDF? (If this isn't the issue I'll dig deeper.)

valdanchev commented 2 years ago

@whedon generate pdf

whedon commented 2 years ago

PDF failed to compile for issue #156 with the following error:

 /app/vendor/ruby-2.6.6/lib/ruby/2.6.0/psych.rb:456:in `parse': (4d4bd54800247dd4827f3cec/paper/paper.md): did not find expected '-' indicator while parsing a block collection at line 15 column 2 (Psych::SyntaxError)
    from /app/vendor/ruby-2.6.6/lib/ruby/2.6.0/psych.rb:456:in `parse_stream'
    from /app/vendor/ruby-2.6.6/lib/ruby/2.6.0/psych.rb:390:in `parse'
    from /app/vendor/ruby-2.6.6/lib/ruby/2.6.0/psych.rb:277:in `load'
    from /app/vendor/ruby-2.6.6/lib/ruby/2.6.0/psych.rb:578:in `block in load_file'
    from /app/vendor/ruby-2.6.6/lib/ruby/2.6.0/psych.rb:577:in `open'
    from /app/vendor/ruby-2.6.6/lib/ruby/2.6.0/psych.rb:577:in `load_file'
    from /app/vendor/bundle/ruby/2.6.0/bundler/gems/whedon-c5c16aedb3d6/lib/whedon.rb:127:in `load_yaml'
    from /app/vendor/bundle/ruby/2.6.0/bundler/gems/whedon-c5c16aedb3d6/lib/whedon.rb:87:in `initialize'
    from /app/vendor/bundle/ruby/2.6.0/bundler/gems/whedon-c5c16aedb3d6/lib/whedon/processor.rb:38:in `new'
    from /app/vendor/bundle/ruby/2.6.0/bundler/gems/whedon-c5c16aedb3d6/lib/whedon/processor.rb:38:in `set_paper'
    from /app/vendor/bundle/ruby/2.6.0/bundler/gems/whedon-c5c16aedb3d6/bin/whedon:66:in `prepare'
    from /app/vendor/bundle/ruby/2.6.0/gems/thor-0.20.3/lib/thor/command.rb:27:in `run'
    from /app/vendor/bundle/ruby/2.6.0/gems/thor-0.20.3/lib/thor/invocation.rb:126:in `invoke_command'
    from /app/vendor/bundle/ruby/2.6.0/gems/thor-0.20.3/lib/thor.rb:387:in `dispatch'
    from /app/vendor/bundle/ruby/2.6.0/gems/thor-0.20.3/lib/thor/base.rb:466:in `start'
    from /app/vendor/bundle/ruby/2.6.0/bundler/gems/whedon-c5c16aedb3d6/bin/whedon:131:in `<top (required)>'
    from /app/vendor/bundle/ruby/2.6.0/bin/whedon:23:in `load'
    from /app/vendor/bundle/ruby/2.6.0/bin/whedon:23:in `<main>'
valdanchev commented 2 years ago

@whedon generate pdf

ShanEllis commented 2 years ago

I see that it's still not generating....but also not throwing an error. Pinging people who know the system better than I to see what's going on/how to fix.

ShanEllis commented 2 years ago

@whedon generate pdf

labarba commented 2 years ago

@openjournals/dev – S.O.S. We're stumped with a paper compilation error here. Can you help?

arfon commented 2 years ago

This is probably because you have three paper.md files in your repo, and @whedon will just pick the first by default:

1. tmp/156/_build/html/_sources/JOSE_paper/paper.md
2. tmp/156/_build/html/_sources/paper/paper.md
3. tmp/156/paper/paper.md

The third one (at tmp/156/paper/paper.md) seems to compile fine. I would suggest either removing or renaming the first two (to not be called paper.md).

valdanchev commented 2 years ago

@whedon generate pdf

whedon commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

valdanchev commented 2 years ago

Thanks a million, @arfon. That was the problem indeed; removed the unnecessary files and the file compiles now.

valdanchev commented 2 years ago

@whedon generate pdf

whedon commented 2 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

lechten commented 2 years ago

Given the new PDF, I just closed my final issue at the source repository. I vote to accept this paper.

ShanEllis commented 2 years ago

@whedon set v2.0.0 as version

whedon commented 2 years ago

OK. v2.0.0 is the version.

ShanEllis commented 2 years ago

Just did a final read-through of the manuscript and a review of all the issues that have been resolved. Would like to thank @TomDonoghue and @lechten for their careful review of this manuscript and @valdanchev for being so responsive to their feedback.

(And, thanks for hanging in there with me as I've been a tad slower getting back to things here than I typically would be.)

ShanEllis commented 2 years ago

@whendon recommend-accept

valdanchev commented 2 years ago

Thank you @ShanEllis, @TomDonoghue, and @lechten for the very constructive and helpful review process!

ShanEllis commented 2 years ago

@whedon recommend-accept

whedon commented 2 years ago

No archive DOI set. Exiting...

labarba commented 2 years ago

@valdanchev — As this review is finalized, please make an archive of the target repository on Zenodo, and report the archive DOI here. We request that authors edit the metadata of the Zenodo deposit so title and author list match the JOSE paper.