erictsai1208 opened this issue 1 year ago
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide.

The package includes all the following forms of documentation, with package metadata in a pyproject.toml file or elsewhere.

Readme file requirements: the package meets the readme requirements below. The README should include, from top to bottom:

NOTE: If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be wider than it is tall. (Note that a badge for pyOpenSci peer review will be provided upon acceptance.)

Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole. Package structure should follow general community best practices. In general, please consider whether:

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted. The package contains a paper.md matching JOSS's requirements with:

Estimated hours spent reviewing:
An amazing package, guys! Great job. It truly simplifies assessing text sentiment and is something we can actually use in practice. I do have some comments that I hope will improve the package even further and make it more robust.
Since nltk is a very active repository, with its machine learning models and codebase constantly being updated, you might want to rethink the test_aggregated_sentiment_score_output_value function. The model used by SentimentIntensityAnalyzer may be updated in the future and return a different sentiment score for the test string used by the function, which would break your test cases.
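A minimal sketch of a less brittle test, assuming aggregate_sentiment_score takes a dataframe and a column name and returns a single VADER-style compound score (the import path and signature here are assumptions, not the package's confirmed API):

```python
import pandas as pd

# Assumed import path; adjust to the package's actual module layout.
from pysentimentanalyzer import aggregate_sentiment_score


def test_aggregated_sentiment_score_output_value():
    df = pd.DataFrame({"text": ["I love this product!", "This is absolutely terrible."]})
    score = aggregate_sentiment_score(df, "text")  # assumed signature

    # VADER compound scores are always bounded in [-1, 1], so checking the type
    # and range survives lexicon/model updates that would change the exact value.
    assert isinstance(score, float)
    assert -1.0 <= score <= 1.0
```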
You could add a parameter to the generate_wordcloud function that specifies for which sentiment to generate a word cloud (positive, negative or neutral). This would speed up execution and return only the necessary word clouds.
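One possible shape for that parameter, sketched under the assumption that an internal helper labels each row with a sentiment (the helper name and data layout are assumptions):

```python
from wordcloud import WordCloud


def generate_wordcloud(df, column, sentiment="all"):
    """Generate word clouds for the requested sentiment(s).

    sentiment: "positive", "negative", "neutral", or "all" (default).
    """
    labelled = label_sentiments(df, column)  # hypothetical helper that adds a "sentiment" column
    targets = ["positive", "negative", "neutral"] if sentiment == "all" else [sentiment]

    clouds = {}
    for label in targets:
        text = " ".join(labelled.loc[labelled["sentiment"] == label, column])
        clouds[label] = WordCloud().generate(text)
    return clouds
```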
It would also help to explain what the sentiment score actually means in the aggregate_sentiment_score function. You could specify the range of values one can expect as a response from the function and how to interpret those values qualitatively.
Instead of returning a pyplot object from the sentiment_score_plot function, you could return a PIL object. This is better practice because it cannot be overwritten by a pyplot object that might already exist in the environment the user is working in.
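One way to do this (a sketch, not the package's current implementation; the plotting details and signature are placeholders) is to render the figure into an in-memory buffer and return a PIL image:

```python
import io

import matplotlib.pyplot as plt
from PIL import Image


def sentiment_score_plot(scores):  # signature is an assumption
    fig, ax = plt.subplots()
    ax.bar(range(len(scores)), scores)  # placeholder plotting logic
    ax.set_ylabel("Sentiment score")

    buffer = io.BytesIO()
    fig.savefig(buffer, format="png")  # render the figure to an in-memory PNG
    plt.close(fig)                     # release the figure so global pyplot state is untouched
    buffer.seek(0)
    return Image.open(buffer)          # detached image, safe from later pyplot calls
```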
Finally, consider adding an __init__.py file in the package that imports all the important functions. This will ensure more intuitive access to the package's functionality while hiding the helper functions from direct usage.
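For example (the module names below are assumptions about the package layout, not its actual structure):

```python
# pysentimentanalyzer/__init__.py -- a sketch; the per-function module names are assumptions
from pysentimentanalyzer.aggregate_sentiment_score import aggregate_sentiment_score
from pysentimentanalyzer.convert_to_likert import convert_to_likert
from pysentimentanalyzer.generate_wordcloud import generate_wordcloud
from pysentimentanalyzer.sentiment_score_plot import sentiment_score_plot

__all__ = [
    "aggregate_sentiment_score",
    "convert_to_likert",
    "generate_wordcloud",
    "sentiment_score_plot",
]
```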
Estimated hours spent reviewing: 1.5
Overall, the package is very well thought out and put together nicely. Very good job! I like the idea of incorporating visualization into sentiment analysis, which gives people a more straightforward way of understanding their analysis output.
In the documentation, it might be worth explaining what the "sentiment score" actually means in the aggregate_sentiment_score function, i.e., what its scale/range is and what positive/negative results mean. For people who do not have much knowledge of sentiment analysis, numerical results with no detailed description could be a little confusing. Likewise, it might also be a good idea to describe what the Likert scale of the sentiment score means in the convert_to_likert function, e.g., "1 means strongly disagree and 5 means strongly agree".
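As one illustration of how that documentation could read: nltk's VADER compound score is bounded in [-1, 1], and the mapping onto a 1-5 scale could be spelled out in the convert_to_likert docstring. The bin edges below are hypothetical, not the package's actual thresholds:

```python
def convert_to_likert(compound_score):
    """Map a VADER compound score in [-1, 1] to a 1-5 Likert value.

    1 = strongly negative, 3 = neutral, 5 = strongly positive.
    The thresholds below are illustrative only.
    """
    if compound_score <= -0.6:
        return 1
    if compound_score <= -0.2:
        return 2
    if compound_score < 0.2:
        return 3
    if compound_score < 0.6:
        return 4
    return 5
```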
Since the input for all the functions includes a dataframe and a column, it might also be worth explaining to users in the README/usage how to preprocess the dataframe (for example, the input column used in the functions can only contain strings).
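Something like the following in the usage docs would make that expectation explicit (the file and column names are illustrative):

```python
import pandas as pd

df = pd.read_csv("survey_responses.csv")      # illustrative input
df = df.dropna(subset=["feedback"])           # drop missing responses
df["feedback"] = df["feedback"].astype(str)   # the analyzed column must contain strings
```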
Building on the previous point, more tests could be added to improve the 78% code coverage. For example, for the aggregate_sentiment_score function, a test could be added to check the datatype of the input column.
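For instance (this assumes the function raises TypeError on a non-string column; the import path and error type are assumptions to adapt to the package's actual behaviour):

```python
import pandas as pd
import pytest

from pysentimentanalyzer import aggregate_sentiment_score  # assumed import path


def test_aggregate_sentiment_score_rejects_non_string_column():
    df = pd.DataFrame({"text": [1, 2, 3]})   # numeric column instead of free text
    with pytest.raises(TypeError):           # assumes a TypeError is raised for bad dtypes
        aggregate_sentiment_score(df, "text")
```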
It might be a good idea to clarify whether the sentiment analyzer can analyze any language or just English. I saw you used SentimentIntensityAnalyzer() from nltk in the helper function, so it might be a good idea to clarify which languages that function can accept. I mention this only because I saw some Korean characters in your raw input text, so I was wondering whether those characters were analyzed too or simply ignored.
Estimated hours spent reviewing: 1 hour
The generate_wordcloud function is amazing! However, right now, when I run the function it does not itself display any output; to actually see the word cloud, I have to write another piece of code to index the list generated by the function. It may be useful to print a message telling users that, to actually see the word cloud, they need to run an additional piece of code. It would also help if, in the output of generate_wordcloud, the user knows for which sentiment each word cloud has been generated (positive, negative or neutral).
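A short usage example along these lines in the README would make the extra step obvious (it assumes the function returns a list of wordcloud.WordCloud objects, one per sentiment, which is an assumption about the current return type):

```python
import matplotlib.pyplot as plt

clouds = generate_wordcloud(df, "feedback")  # assumed to return one word cloud per sentiment

# Index the returned list to display a specific sentiment's word cloud.
plt.imshow(clouds[0], interpolation="bilinear")  # e.g., the positive-sentiment cloud
plt.axis("off")
plt.show()
```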
Submitting Author: Eric Tsai (@erictsai1208), Ranjitprakash Sundaramurthi (@ranjitprakash1986), Ziyi Chen (@zchen156), Tanmay Agarwal (@tanmayag97)
Package Name: pysentimentanalyzer
One-Line Description of Package: Perform sentiment analysis on the given texts and summarize information from the text.
Repository Link: https://github.com/UBC-MDS/py-sentimentanalyzer
Version submitted: v0.2.3
Editor: TBD
Reviewer 1: Roan Raina
Reviewer 2: Jenit Jain
Reviewer 3: Manvir Kohli
Reviewer 4: Crystal Geng
Archive: TBD
Version accepted: TBD
Date accepted (month/day/year): TBD
Description
Scope
For all submissions, explain how and why the package falls under the categories you indicated above. In your explanation, please address the following points (briefly, 1-2 sentences for each):
Who is the target audience and what are the scientific applications of this package?
People analyzing feedback and survey results, such as PR teams or management teams.
Are there other Python packages that accomplish the same thing? If so, how does yours differ? There exist other packages that perform sentiment analysis, but none that, like ours, summarize the results and present them in a simple manner through visualizations.
If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.

Technical checks
For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:
Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?
This option will allow reviewers to open smaller issues that can then be linked to PRs, rather than submitting a denser, text-based review. It will also allow you to demonstrate addressing the issues via PR links.
Code of conduct
Editor and Review Templates
The editor template can be found here.
The review template can be found here.