Closed whedon closed 4 years ago
Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @BayesforDays, @sara-02 it looks like you're currently assigned to review this paper :tada:.
:star: Important :star:
If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews πΏ
To fix this do the following two things:
For a list of things I can do to help you, just type:
@whedon commands
For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:
@whedon generate pdf
Reference check summary:
OK DOIs
- None
MISSING DOIs
- https://doi.org/10.18653/v1/d19-1670 may be missing for title: EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks
- https://doi.org/10.3390/info11020108 may be missing for title: fastai: A Layered API for Deep Learning
- https://doi.org/10.3115/v1/d14-1162 may be missing for title: GloVe: Global Vectors for Word Representation
INVALID DOIs
- None
π @BayesForDays and @sara-02 βΒ feel free to leave any comments here you have about the manuscript and if you have code-related comments please use the repository's issues (https://github.com/deniederhut/niacin) and link to them from here. The main bulk of your review involves the list above. πΊ
π If you can both give a rough ETA for when you can complete your reviews that would be very helpful.
wave @BayesForDays and @sara-02 β feel free to leave any comments here you have about the manuscript and if you have code-related comments please use the repository's issues (https://github.com/deniederhut/niacin) and link to them from here. The main bulk of your review involves the list above. smiley_cat
point_right If you can both give a rough ETA for when you can complete your reviews that would be very helpful.
I should have my first set of reviews by March 15th.
@sara-02 great!
@whedon remind @sara-02 in 2 weeks
Reminder set for @sara-02 in 2 weeks
@deniederhut Will it be possible to include some sort of feature/architecture comparison between niachin, EDA and noisemix
?
This is w.r.t to the following parameter required regarding the paper
State of the field: Do the authors describe how this software compares to other commonly-used packages?
In the current paper, we are only mentioning the following
here are a small number of libraries that contain text augmentation functions, for example noisemix and EDA (Wei & Zou, 2019), but these tend to be associated with conference papers and have little documentation, or only implement a small subset of common transforms.
I think we can expand on this WDYT?
Will it be possible to include some sort of feature/architecture comparison
Sure thing π
Okay, I have added some more detail about the transformations, and included a table with the two libraries mentioned. Let me know what you think!
Dear authors and reviewers
We wanted to notify you that in light of the current COVID-19 pandemic, JOSS has decided to suspend submission of new manuscripts and to handle existing manuscripts (such as this one) on a "best efforts basis". We understand that you may need to attend to more pressing issues than completing a review or updating a repository in response to a review. If this is the case, a quick note indicating that you need to put a "pause" on your involvement with a review would be appreciated but is not required.
Thanks in advance for your understanding.
Arfon Smith, Editor in Chief, on behalf of the JOSS editorial team.
Okay, I have added some more detail about the transformations, and included a table with the two libraries mentioned. Let me know what you think!
Hey, where can I view the updated paper? I was earlier looking at the link provided here https://github.com/openjournals/joss-reviews/issues/2136#issuecomment-594364581
@sara-02 - you can ask @whedon for a preview at any time like this π
@whedon generate pdf
@oliviaguest I think I am done with the first set of reviews, and @deniederhut has already PRs w.r.t changes requested. Once his work gets merged, I can do a final review then. The rest looks good to me, and the project seems to be working as expected.
@sara-02, fantastic and thank you. Stay safe during these troubling times. πΈ
Sorry everyone for the delays in my review. I can wait until PRs are merged and do the review then, or provide my thoughts now?
@BayesForDays same for you β stay safe! Take your time. πΌ
@deniederhut if you can, and no pressure, feel free to give the reviewers some idea of what you plan to do. π·
Hey! Sorry for holding things up -- things have been a bit hectic. I've merged the PRs with the requested changes, and am presently addressing some tickets that @sara-02 created. Thanks so much for your help Sarah!
@BayesForDays this is ready for review whenever you are. No rush π
Hi @deniederhut, I'm checking out the paper right now and I had a couple of questions/comments. One concern is about the wordnet functionality -- and in particular your motivations for using nltk
to access wordnet. Do you think it makes sense to specify a specific version of nltk?
On a conceptual level, I think it would be helpful for readers if you could elaborate a bit on the wordnet functions (e.g. add_hypernyms
), as well as include a citation about whether these specific types of negative samples are useful for modeling. I personally think the wordform transformation is the most useful one and requires the least explanation, partly because it's also the most analogous to the computer vision work that I'm aware of.
In the paper, it would be helpful to somehow graphically mark the changes that take place when transforming the "original" text into the "augmented" text as well, especially for the hyper/hyponym example.
Finally, I think it makes sense (unless the package is more general) to specify that the language this package assumes is English. If the package can be made to be more general (e.g. different language versions of wordnet) specifying this somehow or allowing that to be passed in as a parameter would be a nice "to-do."
I have a handful of questions about the API, nothing major I think.
text
) would be very helpful. For example, it wasn't clear to me why text
has to be imported rather than it being easy to import a function like add_hypernyms
directly from niacin
. The text
import could also be clarified in the documentation and the paper (which seems to assume that the reader has done something like from niacin.text import *
). Hey @deniederhut I recently came across some more references that you can add to your literature review and introduction if they not already spoken about. https://github.com/makcedward/nlpaug https://github.com/QData/TextAttack Also, an interesting blog giving an overview of the work you aim to achieve: https://amitness.com/2020/05/data-augmentation-for-nlp/
Thanks for all the feedback @BayesForDays ! I've been working on adding some new transformations to flesh out the capabilities of the library a bit and those PRs should get merged in soon. I'll turn my attention to the issues you raised next (good catch on the nltk version btw -- it's in requirements.txt but not the setup file π¬). To address this one briefly here:
I think it makes sense (unless the package is more general) to specify that the language this package assumes is English
Totally. Cross-lingual support is hard, because some of these transformations don't really make sense for other languages, or don't have similar resources available. Maybe the best approach here is to have something like from niacin.text import en
-- which will have the added benefit of not loading in resources during runtime for a language that's not being used.
And thanks for the references @sara-02 ! That nlpaug library has some interesting stuff in it. I'm not sure I'll be able add similar functionality here in a timely fashion π
Okay, I think I have addressed the comments here. Let me know what you think!
@sara-02 I noticed that you closed the issue you raised about support for other languages. Do you think it would make sense to leave it open in case someone else has a similar question?
@deniederhut This is understandable, I just shared those packages as they seemed good reference point for future work. Oh, I thought we need to close any pending issues from our side for your package publication(not sure about this, but it looks clean that way). I will be more than happy to reopen it once we publish :)
@oliviaguest I think I are good with the final reviews from my side. I cloned and ran the tests on the updated package and they passed. @BayesForDays do let us know if you have anything else to add.
@BayesForDays are the last few unticked items above now covered?
@sara-02 thanks βΒ this is great. π
Yes, I think everything looks good!
@whedon check references
@whedon generate pdf
Reference check summary:
OK DOIs
- None
MISSING DOIs
- https://doi.org/10.18653/v1/d19-1670 may be missing for title: EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks
- https://doi.org/10.3390/info11020108 may be missing for title: fastai: A Layered API for Deep Learning
- https://doi.org/10.3115/v1/d14-1162 may be missing for title: GloVe: Global Vectors for Word Representation
INVALID DOIs
- None
@deniederhut can you please create an archive for your code (on Zenodo, figshare, or other) and post the archive DOI here as well as check the PDF proof and add the missing DOIs, please? π
Sure thing! The DOIs should be in there now, and here's the zenodo DOI: http://doi.org/10.5281/zenodo.3875715.
@whedon check references
Reference check summary:
OK DOIs
- 10.18653/v1/D19-1670 is OK
- 10.3390/info11020108 is OK
- 10.3115/v1/D14-1162 is OK
MISSING DOIs
- None
INVALID DOIs
- None
@whedon set 10.5281/zenodo.3875715 as archive
OK. 10.5281/zenodo.3875715 is the archive.
@whedon generate pdf
@deniederhut let me know if the final proof looks good and we can seal the deal and publish it. π
Looks good to me, thanks!
@whedon accept
Attempting dry run of processing paper acceptance...
Reference check summary:
OK DOIs
- 10.18653/v1/D19-1670 is OK
- 10.3390/info11020108 is OK
- 10.3115/v1/D14-1162 is OK
MISSING DOIs
- None
INVALID DOIs
- None
:wave: @openjournals/joss-eics, this paper is ready to be accepted and published.
Check final proof :point_right: https://github.com/openjournals/joss-papers/pull/1468
If the paper PDF and Crossref deposit XML look good in https://github.com/openjournals/joss-papers/pull/1468, then you can now move forward with accepting the submission by compiling again with the flag deposit=true
e.g.
@whedon accept deposit=true
@openjournals/joss-eics please accept this! π
Submitting author: @deniederhut (Dillon Niederhut) Repository: https://github.com/deniederhut/niacin Version: 0.2.0 Editor: @oliviaguest Reviewer: @BayesforDays, @sara-02 Archive: 10.5281/zenodo.3875715
Status
Status badge code:
Reviewers and authors:
Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)
Reviewer instructions & questions
@BayesforDays & @sara-02, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:
The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @oliviaguest know.
β¨ Please try and complete your review in the next two weeks β¨
Review checklist for @BayesforDays
Conflict of interest
Code of Conduct
General checks
Functionality
Documentation
Software paper
Review checklist for @sara-02
Conflict of interest
Code of Conduct
General checks
Functionality
Documentation
Software paper