Closed whedon closed 4 years ago
Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @rlskoeser, @vc1492a it looks like you're currently assigned to review this paper :tada:.
:star: Important :star:
If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿
To fix this do the following two things:
For a list of things I can do to help you, just type:
@whedon commands
For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:
@whedon generate pdf
Attempting PDF compilation. Reticulating splines etc...
Overview: Of the JOSS reviews I have been a part of, this project has (to date) been the most well-documented and polished in terms of code readability and test coverage which is great to see. I think the author(s) did a good job on the software and in providing examples and documentation in order to get started easily. The software is easy to use and I have even shared the software with some of my colleagues who will be exploring its usability in some of our team's projects - I think it could have some utility! Very much appreciated the thorough documentation in the code and tests written to verify functionality as well as with how to contribute, and this will go a long way towards communicating the work but also inviting open-source contributions. I have included some commentary on checkboxes below.
Installation and Installation Instructions: The project documentation provide instructions for installation via PyPi or by cloning the Github repository. After examining both setup.py
, requirements.txt
, and the source code from the project repository and attempting an installation in my environment, the installation completed successfully however when using import graphtransliterator
the import failed due to marshmallow
being a undefined import. It is one of the project's dependencies and is noted in requirements.txt
, but it was not installed during installation as it was missing from a list of requirements in setup.py
. I have corrected this minor error and have provided the fix in this pull request. It may help to also have the project's dependencies noted somewhere in the documentation beyond requirements.txt
in the project repository, but I don't think it is a high priority. It may also be nice to list which specific versions of Python 3 have support and are tested for in readme.md
and/or in the documentation - in this case, Python versions 3.5 through 3.7. The version numbers are listed in the contribution guidelines, but that's the only location I see the Python versions mentioned outside of configuration files.
Functionality: I have verified the functionality presented through the examples presented in the documentation, and the software's functionality operates as the author describes. The ambiguity checking is a particularly nice feature which could perhaps be used as a verification check if automatically generated transliteration rules. I also liked how the author provided a means to expose the underlying graph of the transliteration rules to the user for examination - whether through visualization our outside analyses. I currently do not have my own use case in which to test the software outside of the examples presented in the documentation - the editor is free to determine whether a more in-depth look at the functionality would be beneficial as part of the review.
Automated Tests: I was able to run the automated tests easily, and the tests are well documented which allowed me to quickly understand their purpose and which functionality in the software they are designed to test. I did notice a bit of unused tests which I removed in the aforementioned pull request just to clean things up a little bit.
References: The author did a good job of pointing out related software with citation and noting the differences between GraphTransliterator and those software. I did however expect some citations for the following sentences:
It [transliteration] enables the standardized organization and search of resources, as in library
systems. It also permits the encoding of additional information, which enables disambiguation
and advanced linguistic analysis, including natural language processing tasks, that are often
not possible in the original script.
A reader which is new to the field may want to explore works in which transliteration is used in these areas - citations would provide this background and ground the paper/software under review with some real-world applications. Providing some references here would provide a more complete paper around the software and its use cases.
@whedon generate pdf
Attempting PDF compilation. Reticulating splines etc...
Thanks @vc1492a for your generous and helpful review!
I have made the following changes to the oode and documentation:
I have also added references and made changes to the first paragraph of the paper following your suggestion to better elaborate the use cases of transliteration.
I hope that resolves the issues. Thanks so much. (P.S. Would love to hear later about the potential usages!)
I have reviewed the software, documentation, and paper. The only checklist item I'm not able to sign off on is a statement of need included with the project documentation. As I understand it, a statement of need should be included in the readme or project documentation, but the only place I could find this was in the article. I suggest adapting the summary information from the beginning of the article.
Other things I noted:
...
included on the beginning of continuing lines. I tried to generate a file I could load using the from_yaml_file
method based on the annotated sections you provide in a tutorial. First I got an error because I included the short set of rules, which looks like it has a typo (PR opened). Now that I've corrected that, I'm getting a validation error that I'm not sure how to troubleshoot. (I can't tell if this is a discrepancy between the versions in the tutorial or something I've done wrong.)
marshmallow.exceptions.ValidationError: {'rules': ['Invalid token "," in tokens of rule TransliterationRule(production=\',\', prev_classes=None, prev_tokens=None, tokens=[\',\'], next_tokens=None, next_classes=None, cost=0.5849625007211562)']}
Perhaps you could include the complete yaml file as a link that could be downloaded for use with the tutorial, and then people could compare your working version with the annotated samples in the tutorial? That would also allow you to omit the full lists of tokens rules from the tutorial, which might make it easier to follow.
Forgot to say: this is a very well-documented and interesting tool! I haven't had the chance to work much with languages that require transliteration, but we're starting to work on a project involving Ethiopian Ge`ez. I haven't learned much about the language yet, but will keep this in mind if a use case comes up. It's important to put a statement of need on the readme so that people like me, who don't usually work with languages that require transliteration, will be able to understand immediately the use and power of your tool.
Thank you for these excellent corrections and comments, @rlskoeser! I will work on fixing them over the weekend and will let you know when I'm done.
Do let me know how it goes with Ethiopian Ge`ez. I am working on developing a standard for Urdu and some other South Asian languages that will generate the Perso-Arabic text, with or without the Arabic diacritics, and also output in the scholarly transliteration that we use in academic writing as well as in library of congress romanization format, which I need for an archive project.
I also want to figure out a good way to bundle YAML files to the module, too, so that people can contribute them. I might add a transliterators
module that reads from a directory of YAML files (or JSON) and can produce an accessible list and GraphTransliterator objects, maybe by setting __all__
programmatically in __init__.py
. That will require setting some standards for the metadata field. I may model it on the fields in setup.py. I am not sure if there is any standard for that sort of thing. If you have any thoughts about that, please let me know. Thanks!
Bundled transliterator configurations sounds valuable. Maybe you could define a subclass of the GraphTransliterator object that looks for a named config file in a specified path (or possibly multiple paths).
If/when you add that, you'll probably also want to update your contributing document to explain how to add a new transliterator configuration. You'll also want think about what kind of testing a bundled transliterator needs (probably at least as some sanity checking?).
I'm close to finishing these corrections. It will probably take at least another week.
I have added bundled transliterators requiring tests that cover all nodes and edges of graphs, as well as OnMatchRules, if applicable. I will have to update the essay, as well.
I will also be cleaning up the documentation and other points following the excellent recommendations of @rlskoeser.
This review process has been so helpful! Thanks, @vc1492a and @rlskoeser!
@seanpue: Thanks for your follow up and willingness to address all review feedback.
@vc1492a and @rlskoeser: Thanks for your thorough reviews. This feels like a model of how reviews should be done. Even though the work is clearly well-done and polished, there were minor issues raised to ensure the submission leads to a great outcome.
I feel optimistic this is heading toward acceptance and look forward to recommending same when @seanpue finishes making the minor revisions. Separately, I rarely jump in with any feedback before the reviewers do, but I felt pretty optimistic about this submission when I first saw it. It's nice to see this research area represented within JOSS.
Hi @seanpue how are your corrections coming along?
Hi @kthyng Almost done! Hopefully today.
@whedon generate pdf
Attempting PDF compilation. Reticulating splines etc...
Hi @vc1492a @rlskoeser @gkthiruvathukal
Ok, I think it's ready to go. I have added the statement of need, updated the article, and also added bundled transliterators, a tutorial on adding bundled transliterators, and a command line interface. I have used jupyter-sphinx so that the docs automatically generate the sample code, and the code can now be cut and pasted or downloaded.
:wave: Hey @seanpue...
Letting you know, @gkthiruvathukal
is currently OOO until Sunday, October 27th 2019. :heart:
Hi @vc1492a @rlskoeser @gkthiruvathukal Wanted to check to see if you've had a chance to review the updates. Thanks.
@seanpue It looks like we are nearing a conclusion here! It would be great if @vc1492a and @rlskoeser can chime in and take a final look. If I don't hear any new issues, say, by Monday, I think we are ready to move toward acceptance.
@seanpue must have missed the notification previously, apologies! I took a look at the repository and the updates look good to me. I feel that this work is ready for publication in JOSS.
Thanks again @vc1492a @rlskoeser Do you think you'll get a chance to review the changes? I believe I have addressed your concerns.
@seanpue See the above comment, I reviewed the changes 13 days ago and they look good to me.
Saw that but forgot the !. Thanks @vc1492a!
I apologize for the delay. Changes look good to me.
Hi @gkthiruvathukal I think this is all set!
@seanpue Thanks so much to you and our wonderful reviewers @rlskoeser and @vc1492a.
@openjournals/joss-eics I am ready to move on accepting this contribution!
@vc1492a it looks like you are happy with this submission, can you tick those last boxes?
@whedon check references
Attempting to check references...
OK DOIs
- None
MISSING DOIs
- https://doi.org/10.1007/s12046-018-0828-8 may be missing for title: Machine transliteration and transliterated text retrieval: a survey
- https://doi.org/10.18653/v1/w18-2409 may be missing for title: Report of NEWS 2018 Named Entity Transliteration Shared Task
INVALID DOIs
- None
@seanpue can you check those DOI's :point_up:
@seanpue this work is about to be processed for acceptance. Can you give the paper a final check yourself as well (in particular the author names and affiliations)?
@seanpue once you fixed/checked the above can you post an archived version of the software on Zenodo and report back the DOI of the archived version?
@whedon check references
Attempting to check references...
OK DOIs
- None
MISSING DOIs
- https://doi.org/10.1007/s12046-018-0828-8 may be missing for title: Machine transliteration and transliterated text retrieval: a survey
- https://doi.org/10.18653/v1/w18-2409 may be missing for title: Report of NEWS 2018 Named Entity Transliteration Shared Task
INVALID DOIs
- None
@whedon generate pdf
Attempting PDF compilation. Reticulating splines etc...
@whedon check references
Attempting to check references...
OK DOIs
- 10.1007/s12046-018-0828-8 is OK
- 10.18653/v1/w18-2409 is OK
MISSING DOIs
- None
INVALID DOIs
- None
@Kevin-Mattheus-Moerman The PDF is not showing my initials in the footer, but otherwise it looks good. I have uploaded to Zenodo and the DOI is: 10.5281/zenodo.3558365 Thanks!
@Kevin-Mattheus-Moerman sure thing, and done!
@whedon set 10.5281/zenodo.3558365 as archive
OK. 10.5281/zenodo.3558365 is the archive.
Submitting author: @seanpue (A. Sean Pue) Repository: https://github.com/seanpue/graphtransliterator Version: v1.0.4 Editor: @gkthiruvathukal Reviewer: @rlskoeser, @vc1492a Archive: 10.5281/zenodo.3558365
Status
Status badge code:
Reviewers and authors:
Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)
Reviewer instructions & questions
@rlskoeser & @vc1492a, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:
The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @gkthiruvathukal know.
✨ Please try and complete your review in the next two weeks ✨
Review checklist for @rlskoeser
Conflict of interest
Code of Conduct
General checks
Functionality
Documentation
Software paper
Review checklist for @vc1492a
Conflict of interest
Code of Conduct
General checks
Functionality
Documentation
Software paper