[REVIEW]: FAIRmaterials: Ontology Tools with Data FAIRification in Development

editorialbot commented 1 month ago

Submitting author: @Alexhb02 Editor: @atrisovic Reviewers: @berquist, @emanueledelsozzo Archive: Pending

Status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/b38d892fdc407c82379ea4f164110674"><img src="https://joss.theoj.org/papers/b38d892fdc407c82379ea4f164110674/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/b38d892fdc407c82379ea4f164110674/status.svg)](https://joss.theoj.org/papers/b38d892fdc407c82379ea4f164110674)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@berquist & @emanueledelsozzo, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review. First of all you need to run this command in a separate comment to create the checklist:

@editorialbot generate my checklist

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @atrisovic know.

✨ Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest ✨

Checklists

📝 Checklist for @emanueledelsozzo

📝 Checklist for @berquist

editorialbot commented 1 month ago

Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf

editorialbot commented 1 month ago

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

✅ OK DOIs

- 10.1038/sdata.2016.18 is OK
- 10.1007/978-3-319-17966-7_21 is OK
- 10.32614/CRAN.package.DiagrammeR is OK
- 10.1145/2757001.2757003 is OK
- 10.32614/CRAN.package.rdflib is OK

🟡 SKIP DOIs

- No DOI given, and none found for title: Graphviz Python Package
- No DOI given, and none found for title: RDFLib: Python Library for working with RDF
- No DOI given, and none found for title: PyPI: The Python Package Index
- No DOI given, and none found for title: CRAN: The Comprehensive R Archive Network

❌ MISSING DOIs

- None

❌ INVALID DOIs

- None

editorialbot commented 1 month ago

Software report:

github.com/AlDanial/cloc v 1.90  T=7.97 s (444.2 files/s, 183560.1 lines/s)
--------------------------------------------------------------------------------
Language                      files          blank        comment           code
--------------------------------------------------------------------------------
Python                         3271         177729         266453         817403
C                                11           4228          17439         104274
CSV                              43              0              0          37450
C/C++ Header                     28           1277           2142           7605
SVG                              13              0            310           5286
CSS                              34            631            591           3623
Cython                           11            484            435           2223
HTML                              4            356              8           1711
JavaScript                        7            187            131           1440
Fortran 90                       53            116             86            892
R                                 4            169            298            797
Markdown                          9            150              0            404
Fortran 77                       21             26             50            382
reStructuredText                  4             91              1            294
TeX                               6             53             93            253
C++                               1             13             14            143
PowerShell                        1             49             90            108
Meson                             3             21              9            102
XML                               2              0              1             79
YAML                              1              0              0             48
Fish Shell                        1             13             14             42
INI                               3              5              0             34
Rmd                               3            487           3528             33
C Shell                           1             10              5             12
Bourne Again Shell                1              1              3             10
zsh                               1              1              6              7
Lua                               1              0              1              2
JSON                              1              0              0              1
--------------------------------------------------------------------------------
SUM:                           3539         186097         291708         984658
--------------------------------------------------------------------------------

Commit count by author:

    12  Jonathan Gordon
     1  Jonathan-E-Gordon

editorialbot commented 1 month ago

Paper file info:

📄 Wordcount for paper.md is 1391

✅ The paper includes a Statement of need section

editorialbot commented 1 month ago

License info:

✅ License found: BSD 3-Clause "New" or "Revised" License (Valid open source OSI approved license)

editorialbot commented 1 month ago

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

emanueledelsozzo commented 1 month ago

Review checklist for @emanueledelsozzo

Conflict of interest

[x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[x] Repository: Is the source code for this software available at the https://github.com/cwru-sdle/FAIRmaterials?
[x] License: Does the repository contain a plain-text LICENSE or COPYING file with the contents of an OSI approved software license?
[ ] Contribution and authorship: Has the submitting author (@Alexhb02) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
[ ] Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines?
[x] Data sharing: If the paper contains original data, data are accessible to the reviewers. If the paper contains no original data, please check this item.
[x] Reproducibility: If the paper contains original results, results are entirely reproducible by reviewers. If the paper contains no original results, please check this item.
[x] Human and animal research: If the paper contains original data research on humans subjects or animals, does it comply with JOSS's human participants research policy and/or animal research policy? If the paper contains no such data, please check this item.

Functionality

[ ] Installation: Does installation proceed as outlined in the documentation?
[ ] Functionality: Have the functional claims of the software been confirmed?
[x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

[ ] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[ ] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
[ ] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
[ ] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
[ ] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
[ ] Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

[ ] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
[ ] A statement of need: Does the paper have a section titled 'Statement of need' that clearly states what problems the software is designed to solve, who the target audience is, and its relation to other work?
[ ] State of the field: Do the authors describe how this software compares to other commonly-used packages?
[ ] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
[ ] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

berquist commented 1 month ago

Review checklist for @berquist

Conflict of interest

[x] I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

[x] I confirm that I read and will adhere to the JOSS code of conduct.

General checks

[x] Repository: Is the source code for this software available at the https://github.com/cwru-sdle/FAIRmaterials?
[x] License: Does the repository contain a plain-text LICENSE or COPYING file with the contents of an OSI approved software license?
[ ] Contribution and authorship: Has the submitting author (@Alexhb02) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
- According to Git history, only one author (Jonathan E. Gordon) has contributed to the software.
- See https://github.com/openjournals/joss-reviews/issues/7287#issuecomment-2395149727 for an explanation
- Waiting for editor guidance
[ ] Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines?
- Waiting for editor guidance
[x] Data sharing: If the paper contains original data, data are accessible to the reviewers. If the paper contains no original data, please check this item.
[x] Reproducibility: If the paper contains original results, results are entirely reproducible by reviewers. If the paper contains no original results, please check this item.
[x] Human and animal research: If the paper contains original data research on humans subjects or animals, does it comply with JOSS's human participants research policy and/or animal research policy? If the paper contains no such data, please check this item.

Functionality

[ ] Installation: Does installation proceed as outlined in the documentation?
- Python: Installed via python -m venv ./venv; source ./venv/bin/activate; python -m pip install -e .
- R: Still need to check
[ ] Functionality: Have the functional claims of the software been confirmed?
[x] Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

[ ] A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
[ ] Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
[ ] Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
- Not clear; lots of R code present in Python-side README
[ ] Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
- Possible README updates: https://github.com/cwru-sdle/FAIRmaterials/pull/3
[ ] Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
- In venv, install pytest with coverage checking via python -m pip install pytest-cov
- python -m pytest --cov=FAIRmaterials fails, see https://github.com/cwru-sdle/FAIRmaterials/issues/2
- No use of continuous integration on GitHub repository
[ ] Community guidelines: Are there clear guidelines for third parties wishing to 1. Contribute to the software 2. Report issues or problems with the software 3. Seek support

Software paper

[ ] Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
[ ] A statement of need: Does the paper have a section titled 'Statement of need' that clearly states what problems the software is designed to solve, who the target audience is, and its relation to other work?
[ ] State of the field: Do the authors describe how this software compares to other commonly-used packages?
[ ] Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
[ ] References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?
- Request for references: https://github.com/openjournals/joss-reviews/issues/7287#issuecomment-2395138042

berquist commented 1 month ago

Software report:

This is misleading because a Python virtualenv was added to version control; it will need to be rerun once https://github.com/cwru-sdle/FAIRmaterials/pull/1 is merged.
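To illustrate why a committed virtualenv inflates the count, here is a minimal sketch (not cloc itself, and not part of FAIRmaterials) that counts non-blank lines in Python files with and without an exclusion list. The directory names in `EXCLUDED_DIRS` and the helper names are assumptions for illustration only, and the measure is cruder than cloc's because comments are not separated out.

```python
import os
import tempfile

# Assumed directory names to skip; adjust to whatever the repository actually uses.
EXCLUDED_DIRS = frozenset({"venv", ".venv", ".git"})


def count_python_lines(root, excluded=frozenset()):
    """Count non-blank lines in .py files under root, skipping excluded directory names."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into excluded directories.
        dirnames[:] = [d for d in dirnames if d not in excluded]
        for name in filenames:
            if name.endswith(".py"):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="replace") as fh:
                    total += sum(1 for line in fh if line.strip())
    return total


def demo():
    """Build a toy tree with a small package and a large vendored venv, then count both ways."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "pkg"))
        os.makedirs(os.path.join(root, "venv", "lib"))
        with open(os.path.join(root, "pkg", "core.py"), "w", encoding="utf-8") as fh:
            fh.write("x = 1\n\ny = 2\n")
        with open(os.path.join(root, "venv", "lib", "vendored.py"), "w", encoding="utf-8") as fh:
            fh.write("a = 1\n" * 100)
        return count_python_lines(root), count_python_lines(root, EXCLUDED_DIRS)
```

In the toy tree, nearly all counted lines come from the vendored directory, which mirrors how the report above is dominated by third-party Python and C sources.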

Jonathan-E-Gordon commented 1 month ago

@berquist I completed the merge. Are there any other actions I need to take for now?

berquist commented 1 month ago

> @berquist I completed the merge. Are there any other actions I need to take for now?

Thank you for the poke. There's nothing with the code, so I can continue with my review. I do have some general questions.

For authorship, there are a number of authors on the paper, but according to Git, you are the only person who has contributed to the code. Could you elaborate on this?

In the absence of running editorialbot (maybe there is a command for this), here are new LOC counts:

$ cloc .
      54 text files.
      35 unique files.
      64 files ignored.

github.com/AlDanial/cloc v 2.02  T=0.07 s (522.2 files/s, 162150.9 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
HTML                             1            338              5           1607
SVG                              2              0            299           1483
R                                4            169            298            797
Python                           9            212            287            684
Markdown                         4            112              0            271
CSV                             10              0              0            145
TeX                              1              8              0             96
Rmd                              3            487           3528             33
Text                             1              4              0              5
-------------------------------------------------------------------------------
SUM:                            35           1330           4417           5121
-------------------------------------------------------------------------------

I also ran it on just the Python implementation, so no tests, and came up with

Python                           5            172            276            425

I don't necessarily view this as a measure of novelty, particularly when there's an obvious need for the software, but in reading the paper, it's missing references to prior work in the area. I've definitely seen tools that take ontologies defined in CSV or Excel workbooks and convert them to RDF. The paper states that the tooling is for materials and data science ontologies, but aside from this initial mention, there are no references to the existing and surprisingly rich history of ontologies in materials science. I've found at least https://link.springer.com/article/10.1557/s43579-024-00616-6 and https://link.springer.com/article/10.1557/s43580-024-00874-5 that claim usage of FAIRmaterials, so they need to be added.
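For readers unfamiliar with this class of tooling, the general CSV-to-RDF idea can be sketched with the standard library alone. This is an illustration under stated assumptions, not FAIRmaterials' API or template format: the column names (`term`, `label`, `parent`), the `BASE` namespace, and the function name are all hypothetical, and a real tool would more likely use an RDF library such as rdflib.

```python
import csv
import io

BASE = "https://example.org/onto#"  # hypothetical namespace, for illustration only
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
OWL_CLASS = "http://www.w3.org/2002/07/owl#Class"


def csv_to_ntriples(text):
    """Turn rows of (term, label, parent) into N-Triples lines declaring OWL classes."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = f"<{BASE}{row['term']}>"
        triples.append(f"{subject} <{RDF_TYPE}> <{OWL_CLASS}> .")
        # Escape backslashes and double quotes so the literal stays valid N-Triples.
        label = row["label"].replace("\\", "\\\\").replace('"', '\\"')
        triples.append(f'{subject} <{RDFS_LABEL}> "{label}" .')
        if row["parent"]:
            # An empty parent cell means the term is a top-level class.
            triples.append(f"{subject} <{RDFS_SUBCLASS_OF}> <{BASE}{row['parent']}> .")
    return triples


# Hypothetical two-term ontology sheet.
SAMPLE = "term,label,parent\nMaterial,Material,\nPolymer,Polymer,Material\n"

if __name__ == "__main__":
    print("\n".join(csv_to_ntriples(SAMPLE)))
```

Each row yields a class declaration and a label, plus a subclass triple when a parent is given, so the two-row sample produces five triples.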

There are some other criteria that may not be met under "substantial scholarly criteria", particularly age of software, though I get the impression it was simply not version controlled until recently. @atrisovic

Jonathan-E-Gordon commented 1 month ago

@berquist Great, thank you! I will look into adding the references.

Regarding the "substantial scholarly criteria" and the authorship, you are correct: we do not work in GitHub. Due to lab policy, we developed the software in a private Bitbucket repository, which has accumulated over 400 commits since 2021. In January 2024, we initiated a redesign of the software, resulting in 110 commits from 9 authors, representing approximately 7 person-months of work. I can send the whole commit history if that is helpful. This software will also be cited in a Scientific Data paper currently under review, titled “Materials Data Science Ontology (MDS-Onto): Unifying Domain Knowledge in Materials and Applied Data Science”.

danielskatz commented 3 weeks ago

@atrisovic - Can you update me on the status of this review and what the next steps and/or blocking items are?

berquist commented 2 weeks ago

Apologies for letting my review slip.

I don't think JOSS has a policy about making private code public, but since the review should be in the open, I wouldn't request to see git log > git_log.txt unless you were willing to post it publicly as an attachment in this comment thread. At this point I'm still looking for editor guidance.

I've created an issue on the repo about the failing Python tests and a PR for discussing how the READMEs should be handled, since there is no other project documentation or examples.

Although the READMEs do cover an example of using the code from both Python and R, it would be much better to have this fully-worked example present in the repository. I think it's present in the R package but not the Python package.

Jonathan-E-Gordon commented 2 weeks ago

@editorialbot commands

editorialbot commented 2 weeks ago

Hello @Jonathan-E-Gordon, here are the things you can ask me to do:


# List all available commands
@editorialbot commands

# Get a list of all editors's GitHub handles
@editorialbot list editors

# Adds a checklist for the reviewer using this command
@editorialbot generate my checklist

# Set a value for branch
@editorialbot set joss-paper as branch

# Run checks and provide information on the repository and the paper file
@editorialbot check repository

# Check the references of the paper for missing DOIs
@editorialbot check references

# Generates the pdf paper
@editorialbot generate pdf

# Generates a LaTeX preprint file
@editorialbot generate preprint

# Get a link to the complete list of reviewers
@editorialbot list reviewers
