openjournals / joss-reviews

Reviews for the Journal of Open Source Software

[REVIEW]: gobbli: A uniform interface to deep learning for text in Python #2395

Closed whedon closed 3 years ago

whedon commented 4 years ago

Submitting author: @jasonnance (Jason Nance)
Repository: https://github.com/RTIInternational/gobbli
Version: v0.2.0
Editor: @arfon
Reviewers: @w4ngatang, @ljvmiranda921, @sisco0
Archive: 10.5281/zenodo.3406400

:warning: JOSS reduced service mode :warning:

Due to the challenges of the COVID-19 pandemic, JOSS is currently operating in a "reduced service mode". You can read more about what that means in our blog post.

Status


Status badge code:

HTML: <a href="https://joss.theoj.org/papers/b4b2801df971ef6bdda31176fe7e6fbc"><img src="https://joss.theoj.org/papers/b4b2801df971ef6bdda31176fe7e6fbc/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/b4b2801df971ef6bdda31176fe7e6fbc/status.svg)](https://joss.theoj.org/papers/b4b2801df971ef6bdda31176fe7e6fbc)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@w4ngatang, @ljvmiranda921, & @sisco0, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @bmcfee know.

Please try to complete your review within the next six weeks.

Review checklist for @w4ngatang

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

Review checklist for @ljvmiranda921

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

Review checklist for @sisco0

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

whedon commented 4 years ago

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @w4ngatang, @thomwolf it looks like you're currently assigned to review this paper :tada:.

:star: Important :star:

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

  1. Set yourself as 'Not watching' at https://github.com/openjournals/joss-reviews
  2. You may also like to change your default settings for watching repositories in your GitHub profile here: https://github.com/settings/notifications

For a list of things I can do to help you, just type:

@whedon commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@whedon generate pdf

whedon commented 4 years ago

PDF failed to compile for issue #2395 with the following error:

Can't find any papers to compile :-(

W4ngatang commented 4 years ago

Hi @jasonnance,

Just to confirm, the paper is here? I see that it's currently in a PR. Is it ready for review, or should we wait?

Best, Alex

jasonnance commented 4 years ago

Hi @W4ngatang, yep, that's it. It's ready for review -- I was hoping to finish the review on that branch before merging into master. I can go ahead and merge if it makes things easier for reviewers, though.

labarba commented 3 years ago

👋 hi everybody!

It looks like this review was progressing, but activity stopped a few weeks ago. Can we have an update from everyone involved? @W4ngatang, @thomwolf : what is the status of your reviews? If you need more time, just let us know—times are tough!

kthyng commented 3 years ago

@whedon generate paper from branch joss-paper

whedon commented 3 years ago

I'm sorry human, I don't understand that. You can see what commands I support by typing:

@whedon commands

kthyng commented 3 years ago

@whedon generate pdf from branch joss-paper

whedon commented 3 years ago

Attempting PDF compilation from custom branch joss-paper. Reticulating splines etc...

whedon commented 3 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

kthyng commented 3 years ago

Hi @W4ngatang, @thomwolf! I have generated the pdf for this submission above, so you should have what you need to work on your review. Please reach out to @bmcfee or the handle joss-eics if you have any questions. (I am not pinging that handle right now because it would notify me, and I am already here, as well as others who are not currently on duty.)

kthyng commented 3 years ago

@W4ngatang, @thomwolf — how are reviews coming along? Do you have any questions?

thomwolf commented 3 years ago

This totally dropped out of my mind, sorry, I'll try to do it very soon!

arfon commented 3 years ago

@whedon assign @arfon as editor

arfon commented 3 years ago

:wave: folks. @bmcfee is unable to continue editing this paper at the moment so I'm going to pick it up from here.

@W4ngatang, @thomwolf - please update me when you can on your progress. It would be good to get this submission wrapped up in the next month if possible.

thomwolf commented 3 years ago

I will have to defer this to someone in my team unfortunately.

It just seems totally impossible for me to do it in the coming 2 months at least, with all the management stuff I need to handle. And it just keeps getting bigger and bigger, so there is no real solution in the very short term.

Otherwise, if you prefer to ask another reviewer that you would choose yourself, I'm happy to recommend names.

arfon commented 3 years ago

> Otherwise, if you prefer to ask another reviewer that you would choose yourself, I'm happy to recommend names.

Thanks for the update @thomwolf. I think it might be best to seek out a new reviewer here. Any recommendations you might have would be very welcome!

thomwolf commented 3 years ago

For instance I think @nreimers and @JoPfeiff have deep knowledge of both OSS and science.

arfon commented 3 years ago

:wave: @nreimers and @JoPfeiff - would you be willing to step in on behalf of @thomwolf here and review this submission for JOSS?

We carry out our checklist-driven reviews here in GitHub issues and follow these guidelines: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html

The submission we'd be asking you to review is gobbli: A uniform interface to deep learning for text in Python.

nreimers commented 3 years ago

Hi @arfon, I am currently on parental leave and sadly will not have time in the coming months to review the submission for JOSS.

Best Nils

arfon commented 3 years ago

:wave: @ionlights @GregaVrbancic @ankur-gupta – would you be willing to step in on behalf of @thomwolf here and review this submission for JOSS?

We carry out our checklist-driven reviews here in GitHub issues and follow these guidelines: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html

The submission we'd be asking you to review is gobbli: A uniform interface to deep learning for text in Python.

ankur-gupta commented 3 years ago

Yes, I can review. Thanks.

arfon commented 3 years ago

@whedon remove @thomwolf as reviewer

whedon commented 3 years ago

OK, @thomwolf is no longer a reviewer

arfon commented 3 years ago

@whedon add @ankur-gupta as reviewer

whedon commented 3 years ago

OK, @ankur-gupta is now a reviewer

arfon commented 3 years ago

@ankur-gupta - many thanks for agreeing to take over here!

Please read the "Reviewer instructions & questions" in the first comment above.

Both reviewers have checklists at the top of this thread (in that first comment) with the JOSS requirements. As you go over the submission, please check any items that you feel have been satisfied. There are also links to the JOSS reviewer guidelines.

The JOSS review is different from most other journals. Our goal is to work with the authors to help them meet our criteria instead of merely passing judgment on the submission. As such, the reviewers are encouraged to submit issues and pull requests on the software repository. When doing so, please mention https://github.com/openjournals/joss-reviews/issues/2395 so that a link is created to this thread (and I can keep an eye on what is happening). Please also feel free to comment and ask questions on this thread. In my experience, it is better to post comments/questions/suggestions as you come across them instead of waiting until you've reviewed the entire package.

We aim for the review process to be completed within about 4-6 weeks but please make a start well ahead of this as JOSS reviews are by their nature iterative and any early feedback you may be able to provide to the author will be very helpful in meeting this schedule.

ankur-gupta commented 3 years ago

@whedon I am sorry for not replying previously. I have been quite unwell. I apologize, but I don't think I'd be able to do this review for some time. I don't want to leave people waiting on me. I am really sorry. I can remove my name from the list of reviewers for the coming month or more. I would like a chance to review again once I am fully back to my regular health.

whedon commented 3 years ago

I'm sorry human, I don't understand that. You can see what commands I support by typing:

@whedon commands

ankur-gupta commented 3 years ago

@arfon I am sorry for not replying previously. I have been quite unwell. I apologize, but I don't think I'd be able to do this review for some time. I don't want to leave people waiting on me. I am really sorry. I can remove my name from the list of reviewers for the coming month or more. I would like a chance to review again once I am fully back to my regular health.

arfon commented 3 years ago

OK, thanks for letting me know @ankur-gupta - we'll go looking for a second reviewer to take over from you here. I hope you're feeling better soon!

arfon commented 3 years ago

@jasonnance - thanks for your ongoing patience here! I've just contacted a couple of potential new reviewers over email, and have also asked @W4ngatang to complete their review soon if they can.

arfon commented 3 years ago

@whedon re-invite @ljvmiranda921 as reviewer

whedon commented 3 years ago

OK, the reviewer has been re-invited.

@ljvmiranda921 please accept the invite by clicking this link: https://github.com/openjournals/joss-reviews/invitations

arfon commented 3 years ago

@ljvmiranda921 - many thanks for agreeing to take over here!

ljvmiranda921 commented 3 years ago

Sounds good, thanks for facilitating @arfon ! Will start my review by next week 👍

arfon commented 3 years ago

@whedon re-invite @sisco0 as reviewer

whedon commented 3 years ago

OK, the reviewer has been re-invited.

@sisco0 please accept the invite by clicking this link: https://github.com/openjournals/joss-reviews/invitations

arfon commented 3 years ago

@sisco0 - many thanks for agreeing to take over here!

arfon commented 3 years ago

Great news everyone, @sisco0 has also kindly agreed to review here so between @W4ngatang, @ljvmiranda921, & @sisco0 I'm hoping we can move this forward in the coming weeks.

sisco0 commented 3 years ago

This comment addresses each of the checklist points for https://github.com/openjournals/joss-reviews/issues/2395. Having reviewed the software, I can say that it looks like a promising tool that researchers can integrate into their current development pipelines. It could provide a fast way to compare different solutions while easing model configuration and data augmentation.

Functionality

Installation: Does installation proceed as outlined in the documentation?

The package installs with pip as usual. An environment was set up with Pipenv, which created a Pipfile in the root folder. It would be desirable for the development project to use Poetry to take care of the dependencies. The development project was installed as well, and the CI tests ran without any problems.

Functionality: Have the functional claims of the software been confirmed?

The source code for the augmentation algorithms listed in the paper is referenced below:

  1. Back translation: https://github.com/RTIInternational/gobbli/blob/2f4f6abb3a3a9635afb5ae625010b1ad60489bbc/gobbli/augment/marian/model.py#L14
  2. Word replacement: https://github.com/RTIInternational/gobbli/blob/2f4f6abb3a3a9635afb5ae625010b1ad60489bbc/gobbli/augment/word2vec.py
  3. Wordnet, which represents a Contextual Augmentation implementation: https://github.com/RTIInternational/gobbli/blob/2f4f6abb3a3a9635afb5ae625010b1ad60489bbc/gobbli/augment/wordnet.py#L44

The classification experiment at https://gobbli.readthedocs.io/en/latest/quickstart.html#classification-experiment ran successfully. As a drawback, no output is printed to the console, and the application stops in less than 10 seconds. That behaviour could leave users wondering whether the application ran successfully or not. This test confirmed how easily a solution can be built on top of this Python package.
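For readers unfamiliar with the word-replacement augmentation mentioned above, the following is a minimal, self-contained sketch of the general idea in plain Python. It is not gobbli's implementation or API: the `SYNONYMS` table and the `replace_words` function are hypothetical names made up for illustration, and a hand-written dictionary stands in for whatever source of replacement words a real library would use (the linked module names suggest word2vec embeddings and WordNet).

```python
import random

# Hypothetical synonym table, used only for this illustration.
# A real augmentation library would look replacements up elsewhere.
SYNONYMS = {
    "quick": ["fast", "speedy"],
    "happy": ["glad", "cheerful"],
}


def replace_words(text, synonyms, prob=0.5, seed=None):
    """Return a copy of `text` with some known words swapped for synonyms.

    Each whitespace-separated token that has an entry in `synonyms`
    is replaced with probability `prob`; other tokens are kept as-is.
    """
    rng = random.Random(seed)  # local generator so results are reproducible
    augmented = []
    for token in text.split():
        candidates = synonyms.get(token.lower())
        if candidates and rng.random() < prob:
            augmented.append(rng.choice(candidates))
        else:
            augmented.append(token)
    return " ".join(augmented)


if __name__ == "__main__":
    # Generate a few augmented variants of one training sentence.
    for seed in range(3):
        print(replace_words("the quick dog looks happy", SYNONYMS, seed=seed))
```

Each call produces a variant of the input sentence with the same number of tokens, which is the property that lets augmented texts reuse the original label in a classification dataset.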

Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

There are no performance claims for this project. It offers the option of running on a GPU, which can be done by using GPU-enabled Docker configurations.

Documentation

A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?

This is clearly stated in the PDF: “gobbli is designed to emphasize simplicity and interoperability rather than customization and performance in order to make deep learning more accessible to applied researchers.” and “gobbli is a Python library intended to bridge state-of-the-art research in natural language processing and application to real-world problems.”

Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.

The list of dependencies is clear for both end users and developers: the repository contains separate requirements.txt files, and these are referenced in the main README.md file.

Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).

Several examples are shown at https://gobbli.readthedocs.io/en/latest/quickstart.html#high-level-api-experiments. They are simplifications of real problems, but the source code for real problems would be much the same.

Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?

The API documentation is embedded in the code itself and is generated automatically. The index at https://gobbli.readthedocs.io/en/latest/api.html lists each component of the library. As an example, https://gobbli.readthedocs.io/en/latest/auto/gobbli.augment.html#module-gobbli.augment shows the high level of detail in this part of the project. Congratulations.

Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?

CI exists and is runnable. One drawback, both for maintaining the project and for onboarding new developers, is that running the CI requires a lot of memory. It would be worth investigating whether the CI could be made lighter.

Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support.

There is a CONTRIBUTING.md file, which currently holds the CLA link to be signed by contributors and gives instructions on how to contribute, covering code style and testing. The same document specifies that code reviews will be based on pull requests that address one point at a time. There is a template for bug reports at https://github.com/RTIInternational/gobbli/blob/2f4f6abb3a3a9635afb5ae625010b1ad60489bbc/.github/ISSUE_TEMPLATE/bug-report.md. I could not find a section on seeking support in the README.md, but each of the provided CLI tools has help commands, and GitHub Issues are reachable.

Software paper

Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?

These points have been provided in the third paragraph as cited before.

A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?

This is already stated in parts of the summary, such as “A practitioner may therefore be required to learn a new programming language, a deep learning library...” and “This approach allows… without spending time adapting…” The target audience is identified in the following sentence: “gobbli is designed to emphasize simplicity and interoperability rather than customization and performance in order to make deep learning more accessible to applied researchers.”

State of the field: Do the authors describe how this software compares to other commonly-used packages?

They clearly state that this package could be compared to transformers and fastai.

Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?

The following suggestions are made to correct the document:

I also propose the following change, to match the formality of this text:

jasonnance commented 3 years ago

@whedon generate pdf from branch joss-paper

whedon commented 3 years ago

Attempting PDF compilation from custom branch joss-paper. Reticulating splines etc...

whedon commented 3 years ago

:point_right::page_facing_up: Download article proof :page_facing_up: View article proof on GitHub :page_facing_up: :point_left:

jasonnance commented 3 years ago

Thanks, @sisco0! I've implemented your suggested changes and had the pdf regenerated.

sisco0 commented 3 years ago

Given the recent changes to the paper, my review is complete and I approve the submission. I wish the best for this professional and easily integrated project.

ljvmiranda921 commented 3 years ago

Hi everyone (@arfon, @jasonnance)! I'm done reviewing this repository. I've checked all the boxes in the reviewer guidelines above. I approve this submission with my review and suggestions below:

Review Summary

The gobbli library provides a high-level NLP interface that eases use and speeds up experimentation in the field. Aside from abstracting away other libraries such as spacy, fastai, and transformers, it also provides helpful utilities for data augmentation, model evaluation, and exploration.

The paper is well written, with a clear statement of need. It also differentiates gobbli from other NLP libraries that are commonly found today, like huggingface/transformers, explosion/spacy, etc. I see this as an all-in-one tool for NLP needs that may be useful in both research and production. In addition, I appreciate the inclusion of a streamlit UI for the most common NLP tasks; it simplifies the setup needed for researchers and practitioners alike.

The repository is well organized, with clear documentation and tests. It is released under the OSI-approved Apache-2.0 License, with contributing guidelines for the community. I tested all the functional claims from the README on my machine (Ubuntu 20.04) and they worked as expected.

Suggestions

These may not be necessary for publication (at the editor's discretion, of course), but they may be useful for the improvement and sustainability of the project in the future:

In my opinion, the ones above aren't prerequisites for acceptance.

arfon commented 3 years ago

:wave: @jasonnance – just checking in here. Are you still in the process of making any changes in response to the reviewer feedback?

jasonnance commented 3 years ago

Hey @arfon, my time is limited at the moment, so I was only planning on implementing any changes required for acceptance. I think I've covered all of those, but let me know if I missed something.

arfon commented 3 years ago

@jasonnance – At this point, could you make a new release of this software that includes the changes that have resulted from this review? Then, please make an archive of the software in Zenodo/figshare/other service and update this thread with the DOI of the archive. For the Zenodo/figshare archive, please make sure that:

I can then move forward with accepting the submission.