openjournals / joss-reviews

Reviews for the Journal of Open Source Software
Creative Commons Zero v1.0 Universal

[REVIEW]: Graph Transliterator: a graph-based transliteration tool #1717

Closed whedon closed 4 years ago

whedon commented 5 years ago

Submitting author: @seanpue (A. Sean Pue)
Repository: https://github.com/seanpue/graphtransliterator
Version: v1.0.4
Editor: @gkthiruvathukal
Reviewer: @rlskoeser, @vc1492a
Archive: 10.5281/zenodo.3558365

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/a9ae921a0402eadf367b888c96d6cfca"><img src="https://joss.theoj.org/papers/a9ae921a0402eadf367b888c96d6cfca/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/a9ae921a0402eadf367b888c96d6cfca/status.svg)](https://joss.theoj.org/papers/a9ae921a0402eadf367b888c96d6cfca)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@rlskoeser & @vc1492a, please carry out your review in this issue by updating the checklist below. If you cannot edit the checklist please:

  1. Make sure you're logged in to your GitHub account
  2. Be sure to accept the invite at this URL: https://github.com/openjournals/joss-reviews/invitations

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @gkthiruvathukal know.

Please try and complete your review in the next two weeks

Review checklist for @rlskoeser

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

Review checklist for @vc1492a

Conflict of interest

Code of Conduct

General checks

Functionality

Documentation

Software paper

whedon commented 5 years ago

Hello human, I'm @whedon, a robot that can help you with some common editorial tasks. @rlskoeser, @vc1492a it looks like you're currently assigned to review this paper :tada:.

:star: Important :star:

If you haven't already, you should seriously consider unsubscribing from GitHub notifications for this (https://github.com/openjournals/joss-reviews) repository. As a reviewer, you're probably currently watching this repository which means for GitHub's default behaviour you will receive notifications (emails) for all reviews 😿

To fix this do the following two things:

  1. Set yourself as 'Not watching' at https://github.com/openjournals/joss-reviews
  2. You may also like to change your default settings for watching repositories in your GitHub profile here: https://github.com/settings/notifications

For a list of things I can do to help you, just type:

@whedon commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@whedon generate pdf

whedon commented 5 years ago

Attempting PDF compilation. Reticulating splines etc...

whedon commented 5 years ago

:point_right: Check article proof :page_facing_up: :point_left:

vc1492a commented 5 years ago

Overview: Of the JOSS reviews I have been a part of, this project has (to date) been the most well-documented and polished in terms of code readability and test coverage, which is great to see. I think the author(s) did a good job on the software and in providing examples and documentation that make it easy to get started. The software is easy to use, and I have even shared it with some of my colleagues, who will be exploring its usability in some of our team's projects - I think it could have some utility! I very much appreciated the thorough documentation in the code, the tests written to verify functionality, and the guidance on how to contribute; this will go a long way towards communicating the work and inviting open-source contributions. I have included some commentary on checkboxes below.

Installation and Installation Instructions: The project documentation provides instructions for installation via PyPI or by cloning the GitHub repository. I examined setup.py, requirements.txt, and the source code from the project repository and attempted an installation in my environment. The installation completed successfully; however, import graphtransliterator then failed because marshmallow could not be imported. It is one of the project's dependencies and is noted in requirements.txt, but it was not installed because it was missing from the list of requirements in setup.py. I have corrected this minor error and have provided the fix in this pull request. It may help to also have the project's dependencies noted somewhere in the documentation beyond requirements.txt in the project repository, but I don't think it is a high priority. It may also be nice to list which specific versions of Python 3 are supported and tested in readme.md and/or in the documentation - in this case, Python versions 3.5 through 3.7. The version numbers are listed in the contribution guidelines, but that's the only location where I see the Python versions mentioned outside of configuration files.
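For readers unfamiliar with this kind of mismatch, here is a minimal sketch of one common way to keep the two lists in sync: derive the install requirements from requirements.txt-style text instead of maintaining them by hand. This is not the fix from the pull request, whose contents are not shown in this thread. The helper name `parse_requirements` and the example file contents are hypothetical.

```python
def parse_requirements(text):
    """Return requirement specifiers from requirements.txt-style text."""
    requirements = []
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        # Skip blank lines and pip options such as "-r other.txt" or "-e .".
        if line and not line.startswith("-"):
            requirements.append(line)
    return requirements


# Hypothetical file contents, for illustration only.
EXAMPLE = """
# runtime dependencies
click
marshmallow  # needed at import time
pyyaml
"""

# In a setup.py, a list like this could be passed as install_requires=...
install_requires = parse_requirements(EXAMPLE)
print(install_requires)
```

With a helper like this, a dependency added to requirements.txt is picked up at install time without a second edit in setup.py. The trade-off is that pinned development requirements and loose install requirements often need to differ, so some projects deliberately keep the two lists separate.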

Functionality: I have verified the functionality through the examples presented in the documentation, and the software operates as the author describes. The ambiguity checking is a particularly nice feature, which could perhaps be used as a verification check for automatically generated transliteration rules. I also liked how the author provided a means to expose the underlying graph of the transliteration rules to the user for examination - whether through visualization or outside analyses. I currently do not have my own use case in which to test the software outside of the examples presented in the documentation - the editor is free to determine whether a more in-depth look at the functionality would be beneficial as part of the review.

Automated Tests: I was able to run the automated tests easily, and the tests are well documented, which allowed me to quickly understand their purpose and which functionality in the software they are designed to test. I did notice a few unused tests, which I removed in the aforementioned pull request just to clean things up a little bit.

References: The author did a good job of pointing out related software with citations and noting the differences between GraphTransliterator and those tools. I did, however, expect some citations for the following sentences:

It [transliteration] enables the standardized organization and search of resources, as in library
systems. It also permits the encoding of additional information, which enables disambiguation
and advanced linguistic analysis, including natural language processing tasks, that are often
not possible in the original script.

A reader who is new to the field may want to explore works in which transliteration is used in these areas - citations would provide this background and ground the paper/software under review in some real-world applications. Providing some references here would make for a more complete paper around the software and its use cases.

seanpue commented 5 years ago

@whedon generate pdf

whedon commented 5 years ago

Attempting PDF compilation. Reticulating splines etc...

whedon commented 5 years ago

:point_right: Check article proof :page_facing_up: :point_left:

seanpue commented 5 years ago

Thanks @vc1492a for your generous and helpful review!

I have made the following changes to the code and documentation:

I have also added references and made changes to the first paragraph of the paper following your suggestion to better elaborate the use cases of transliteration.

I hope that resolves the issues. Thanks so much. (P.S. Would love to hear later about the potential usages!)

rlskoeser commented 5 years ago

I have reviewed the software, documentation, and paper. The only checklist item I'm not able to sign off on is a statement of need included with the project documentation. As I understand it, a statement of need should be included in the readme or project documentation, but the only place I could find this was in the article. I suggest adapting the summary information from the beginning of the article.

Other things I noted:

rlskoeser commented 5 years ago

Forgot to say: this is a very well-documented and interesting tool! I haven't had the chance to work much with languages that require transliteration, but we're starting to work on a project involving Ethiopian Ge`ez. I haven't learned much about the language yet, but will keep this in mind if a use case comes up. It's important to put a statement of need on the readme so that people like me, who don't usually work with languages that require transliteration, will be able to understand immediately the use and power of your tool.

seanpue commented 5 years ago

Thank you for these excellent corrections and comments, @rlskoeser! I will work on fixing them over the weekend and will let you know when I'm done.

Do let me know how it goes with Ethiopian Ge`ez. I am working on developing a standard for Urdu and some other South Asian languages that will generate the Perso-Arabic text, with or without the Arabic diacritics. It will also output the scholarly transliteration that we use in academic writing, as well as the Library of Congress romanization format, which I need for an archive project.

I also want to figure out a good way to bundle YAML files with the module, so that people can contribute them. I might add a transliterators module that reads from a directory of YAML (or JSON) files and can produce an accessible list and GraphTransliterator objects, maybe by setting __all__ programmatically in __init__.py. That will require setting some standards for the metadata field. I may model it on the fields in setup.py. I am not sure if there is any standard for that sort of thing. If you have any thoughts about that, please let me know. Thanks!
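A minimal sketch of the discovery idea described above, assuming JSON files so that only the standard library is needed. It is not the design that was eventually adopted. The function names, the `metadata` layout, and the `name` field are all hypothetical.

```python
import json
from pathlib import Path


def discover_bundled(directory):
    """Map a name to the parsed contents of each JSON config in directory."""
    registry = {}
    for path in sorted(Path(directory).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            config = json.load(handle)
        # Hypothetical metadata field; fall back to the file name stem.
        name = config.get("metadata", {}).get("name", path.stem)
        registry[name] = config
    return registry


def export_names(registry):
    """Names that a package's __init__.py could assign to __all__."""
    return sorted(registry)
```

In an `__init__.py`, something like `__all__ = export_names(discover_bundled(Path(__file__).parent / "configs"))` would then expose one name per bundled file, where `configs` is again an assumed directory name.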

rlskoeser commented 5 years ago

Bundled transliterator configurations sound valuable. Maybe you could define a subclass of the GraphTransliterator object that looks for a named config file in a specified path (or possibly multiple paths).

If/when you add that, you'll probably also want to update your contributing document to explain how to add a new transliterator configuration. You'll also want to think about what kind of testing a bundled transliterator needs (probably at least some sanity checking?).
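The lookup suggested here, a named config file searched for across one or more paths, could be sketched roughly as follows. The function name `find_config` and the list of extensions are assumptions for illustration, and nothing here is part of Graph Transliterator's API.

```python
from pathlib import Path


def find_config(name, search_paths, extensions=(".yaml", ".yml", ".json")):
    """Return the first existing file named name+extension, or None.

    Directories are tried in the order given, so earlier entries in
    search_paths take precedence over later ones.
    """
    for directory in search_paths:
        for extension in extensions:
            candidate = Path(directory) / (name + extension)
            if candidate.is_file():
                return candidate
    return None
```

A subclass could call a helper like this in its constructor and hand the file contents to the parent class. Putting a user directory ahead of the packaged one in `search_paths` would let a local file override a bundled configuration.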

seanpue commented 5 years ago

I'm close to finishing these corrections. It will probably take at least another week.

I have added bundled transliterators requiring tests that cover all nodes and edges of graphs, as well as OnMatchRules, if applicable. I will have to update the essay, as well.
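To illustrate what "tests that cover all nodes and edges" can mean in practice, here is a toy sketch. It uses a simple trie-based longest-match transliterator, not Graph Transliterator's actual graph or API, and every name in it is hypothetical. The idea is to record each edge walked while transliterating the test inputs, then compare against the full edge set.

```python
def build_trie(rules):
    """Build a trie from {source: target} rules.

    children[node_id] maps a character to a child node id; node 0 is the root.
    output maps a node id to the replacement for the rule ending there.
    """
    children = [{}]
    output = {}
    for source, target in rules.items():
        node = 0
        for char in source:
            if char not in children[node]:
                children.append({})
                children[node][char] = len(children) - 1
            node = children[node][char]
        output[node] = target
    return children, output


def transliterate(text, children, output, visited_edges):
    """Longest-match transliteration that records every edge it walks."""
    result = []
    position = 0
    while position < len(text):
        node, best, cursor = 0, None, position
        while cursor < len(text) and text[cursor] in children[node]:
            child = children[node][text[cursor]]
            visited_edges.add((node, child))
            node = child
            cursor += 1
            if node in output:
                best = (cursor, output[node])  # remember the longest match so far
        if best is None:
            raise ValueError(f"no rule matches at position {position}")
        position = best[0]
        result.append(best[1])
    return "".join(result)


def all_edges(children):
    """Every (parent, child) edge in the trie."""
    return {(parent, child)
            for parent, kids in enumerate(children)
            for child in kids.values()}


# Hypothetical rules and test inputs.
rules = {"a": "a", "aa": "ā", "b": "b"}
children, output = build_trie(rules)
visited = set()
for sample in ["aab"]:
    transliterate(sample, children, output, visited)
print("uncovered edges:", sorted(all_edges(children) - visited))
```

In a trie, every non-root node is reached by exactly one edge, so covering all edges also covers all nodes. A real coverage check for a bundled transliterator would additionally need to account for context-dependent rules such as the OnMatchRules mentioned above, which this sketch does not model.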

I will also be cleaning up the documentation and other points following the excellent recommendations of @rlskoeser.

This review process has been so helpful! Thanks, @vc1492a and @rlskoeser!

gkthiruvathukal commented 5 years ago

@seanpue: Thanks for your follow-up and willingness to address all review feedback.

@vc1492a and @rlskoeser: Thanks for your thorough reviews. This feels like a model of how reviews should be done. Even though the work is clearly well-done and polished, there were minor issues raised to ensure the submission leads to a great outcome.

I feel optimistic this is heading toward acceptance and look forward to recommending same when @seanpue finishes making the minor revisions. Separately, I rarely jump in with any feedback before the reviewers do, but I felt pretty optimistic about this submission when I first saw it. It's nice to see this research area represented within JOSS.

kthyng commented 5 years ago

Hi @seanpue how are your corrections coming along?

seanpue commented 5 years ago

Hi @kthyng Almost done! Hopefully today.

seanpue commented 5 years ago

@whedon generate pdf

whedon commented 5 years ago

Attempting PDF compilation. Reticulating splines etc...

whedon commented 5 years ago

:point_right: Check article proof :page_facing_up: :point_left:

seanpue commented 5 years ago

Hi @vc1492a @rlskoeser @gkthiruvathukal

Ok, I think it's ready to go. I have added the statement of need, updated the article, and also added bundled transliterators, a tutorial on adding bundled transliterators, and a command line interface. I have used jupyter-sphinx so that the docs automatically generate the sample code, and the code can now be cut and pasted or downloaded.

ooo[bot] commented 5 years ago

:wave: Hey @seanpue...

Letting you know, @gkthiruvathukal is currently OOO until Sunday, October 27th 2019. :heart:

seanpue commented 4 years ago

Hi @vc1492a @rlskoeser @gkthiruvathukal Wanted to check to see if you've had a chance to review the updates. Thanks.

gkthiruvathukal commented 4 years ago

@seanpue It looks like we are nearing a conclusion here! It would be great if @vc1492a and @rlskoeser can chime in and take a final look. If I don't hear any new issues, say, by Monday, I think we are ready to move toward acceptance.

vc1492a commented 4 years ago

@seanpue must have missed the notification previously, apologies! I took a look at the repository and the updates look good to me. I feel that this work is ready for publication in JOSS.

seanpue commented 4 years ago

Thanks again @vc1492a @rlskoeser Do you think you'll get a chance to review the changes? I believe I have addressed your concerns.

vc1492a commented 4 years ago

@seanpue See the above comment, I reviewed the changes 13 days ago and they look good to me.

seanpue commented 4 years ago

Saw that but forgot the !. Thanks @vc1492a!

rlskoeser commented 4 years ago

I apologize for the delay. Changes look good to me.

seanpue commented 4 years ago

Hi @gkthiruvathukal I think this is all set!

gkthiruvathukal commented 4 years ago

@seanpue Thanks so much to you and our wonderful reviewers @rlskoeser and @vc1492a.

@openjournals/joss-eics I am ready to move on accepting this contribution!

Kevin-Mattheus-Moerman commented 4 years ago

@vc1492a it looks like you are happy with this submission, can you tick those last boxes?

Kevin-Mattheus-Moerman commented 4 years ago

@whedon check references

whedon commented 4 years ago

Attempting to check references...

whedon commented 4 years ago

OK DOIs

- None

MISSING DOIs

- https://doi.org/10.1007/s12046-018-0828-8 may be missing for title: Machine transliteration and transliterated text retrieval: a survey
- https://doi.org/10.18653/v1/w18-2409 may be missing for title: Report of NEWS 2018 Named Entity Transliteration Shared Task

INVALID DOIs

- None

Kevin-Mattheus-Moerman commented 4 years ago

@seanpue can you check those DOIs :point_up:

Kevin-Mattheus-Moerman commented 4 years ago

@seanpue this work is about to be processed for acceptance. Can you give the paper a final check yourself as well (in particular the author names and affiliations)?

Kevin-Mattheus-Moerman commented 4 years ago

@seanpue once you have fixed/checked the above, can you post an archived version of the software on Zenodo and report back the DOI of the archived version?

seanpue commented 4 years ago

@whedon check references

whedon commented 4 years ago

Attempting to check references...

whedon commented 4 years ago

OK DOIs

- None

MISSING DOIs

- https://doi.org/10.1007/s12046-018-0828-8 may be missing for title: Machine transliteration and transliterated text retrieval: a survey
- https://doi.org/10.18653/v1/w18-2409 may be missing for title: Report of NEWS 2018 Named Entity Transliteration Shared Task

INVALID DOIs

- None

seanpue commented 4 years ago

@whedon generate pdf

whedon commented 4 years ago

Attempting PDF compilation. Reticulating splines etc...

whedon commented 4 years ago

:point_right: Check article proof :page_facing_up: :point_left:

seanpue commented 4 years ago

@whedon check references

whedon commented 4 years ago

Attempting to check references...

whedon commented 4 years ago

OK DOIs

- 10.1007/s12046-018-0828-8 is OK
- 10.18653/v1/w18-2409 is OK

MISSING DOIs

- None

INVALID DOIs

- None

seanpue commented 4 years ago

@Kevin-Mattheus-Moerman The PDF is not showing my initials in the footer, but otherwise it looks good. I have uploaded to Zenodo and the DOI is: 10.5281/zenodo.3558365 Thanks!

vc1492a commented 4 years ago

@Kevin-Mattheus-Moerman sure thing, and done!

Kevin-Mattheus-Moerman commented 4 years ago

@whedon set 10.5281/zenodo.3558365 as archive

whedon commented 4 years ago

OK. 10.5281/zenodo.3558365 is the archive.