UBC-MDS / software-review-2022

Submission Group 13: tweepypoll (Python) #25

Open xiangwxt opened 2 years ago

xiangwxt commented 2 years ago

Submitting Author: Wenxin Xiang (@xiangwxt), Rada Rudyak (@Radascript), Linh Giang Nguyen (@gn385x)
Package Name: tweepypoll
One-Line Description of Package: A Python package that allows users to extract and visualize poll data!
Repository Link: https://github.com/UBC-MDS/tweepypoll
Version submitted: v2.0.0
Editor: TBD
Reviewer 1: Luming Yang
Reviewer 2: Masha Sarafrazi
Reviewer 3: Julien Gordon
Reviewer 4: Arlin Cherian
Archive: TBD
Version accepted: TBD

Description

  • tweepypoll is a Python package that allows users to extract and visualize poll data (poll questions, poll options, poll responses, etc.) from Twitter. Our goal is to make tweepypoll helpful and user-friendly; any Python beginner can effectively gain access to the data and make their own data-driven decisions.

Scope

  • Please indicate which category or categories this package falls under:

    • [x] Data retrieval
    • [ ] Data extraction
    • [ ] Data munging
    • [ ] Data deposition
    • [ ] Reproducibility
    • [ ] Geospatial
    • [ ] Education
    • [x] Data visualization*
  • Please fill out a pre-submission inquiry before submitting a data visualization package. For more info, see notes on categories of our guidebook.

  • Explain how and why the package falls under these categories (briefly, 1-2 sentences):

    This package is designed to retrieve poll data from Twitter. The functions inside the package access and download data from online sources using an API. The package also visualizes the data retrieved from the internet.

  • Who is the target audience and what are scientific applications of this package?

    The package is created for people interested in social media poll studies and social media interaction. It is also a convenient and helpful tool for Python beginners to gain access to poll data and make their own data-driven decisions.

  • Are there other Python packages that accomplish the same thing? If so, how does yours differ?

There are existing Python packages that have similar functionality for tweets from Twitter. For example, pytweet is a package that helps extract tweets, visualize user habit on tweet posting, and apply sentiment analysis to the data. However, there are no available packages that work specifically on polls from Twitter.

  • If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted:

Technical checks

For details about the tweepypoll packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:

  • [x] does not violate the Terms of Service of any service it interacts with.
  • [x] has an OSI approved license.
  • [x] contains a README with instructions for installing the development version.
  • [x] includes documentation with examples for all functions.
  • [x] contains a vignette with examples of its essential functions and uses.
  • [x] has a test suite.
  • [x] has continuous integration, such as Travis CI, AppVeyor, CircleCI, and/or others.

Publication options

JOSS Checks

Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?

This option will allow reviewers to open smaller issues that can then be linked to PRs rather than submitting a denser, text-based review. It will also allow you to demonstrate addressing the issue via PR links.

  • [x] Yes I am OK with reviewers submitting requested changes as issues to my repo. Reviewers will then link to the issues in their submitted review.

Code of conduct

P.S. *Have feedback/comments about our review process? Leave a comment here

Editor and Review Templates

Editor and review templates can be found here

arlincherian commented 2 years ago

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

Documentation

The package includes all the following forms of documentation:

Readme requirements The package meets the readme requirements below:

The README should include, from top to bottom:

Usability

Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole. Package structure should follow general community best-practices. In general please consider:

Functionality

For packages co-submitting to JOSS

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

Final approval (post-review)

Estimated hours spent reviewing:

~ 1 hour

Review Comments

Great job team! This seems like a very interesting package and is definitely valuable to the intended audience as you have mentioned in your documentation. I hope you have used this package to find some interesting analytics from Twitter polls. Overall you have done an excellent job of creating this package. The documentation seems to be clear and concise on the usage of the functions and I was able to follow it well.

Further comments:

  1. A few components are missing from the README file:
    • Add a URL to the pytweet package so users can refer to the related packages you have mentioned. This is why I haven't checked off the citation requirement above.
    • The Read the Docs documentation link is not mentioned in the README file. This is why this requirement was also not checked off on the list.
    • You could also add code coverage, Read the Docs, and CI/CD badges to your README to show that these checks are passing. The markdown code for all of these is available on their respective websites or in the GitHub settings.

Minor suggestions:

  1. Function get_polls_from_user:

    • In the example usage file you mentioned the second argument for the function defaults to 5, whereas in your code file this value is mentioned as 10.
  2. Functions get_poll_by_id and visualize_poll

    • I cloned your repo and tried to run the examples in your docs folder. However, I keep getting the error "Invalid argument type: input tweet_id must be a list of numeric IDs." when running this function. The same error appears on the example usage page on Read the Docs: https://tweepypoll.readthedocs.io/en/latest/example.html . Solution: the documentation gives the data type of the ID as string, so changing get_poll_by_id(1239677278193438722) to get_poll_by_id(['1239677278193438722']) should fix it!
    • Again, an exception was raised with the third function: "Exception: The type of the argument 'poll_obj' mush be a dictionary." There also appears to be a typo in the exception message: "mush" should be "must".
    • I could be missing something here, if so, please advise!
  3. Consistency with the naming of the package

    • I noticed that the package name is all lower case on GitHub. However, in the contributing file, it is written with an uppercase T, Tweepypoll. I would suggest keeping the name consistent.
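
The type confusion around get_poll_by_id described in point 2 could be reduced by normalizing the input before validating it. The following is only a minimal sketch under stated assumptions, not tweepypoll's actual implementation: normalize_tweet_ids is a hypothetical helper that accepts a single ID or a list of IDs, given as integers or numeric strings, and always returns a list of strings.

```python
def normalize_tweet_ids(tweet_id):
    """Return tweet IDs as a list of numeric strings.

    Hypothetical helper for illustration; not part of tweepypoll.
    """
    # Wrap a single ID so the rest of the code only handles sequences.
    if isinstance(tweet_id, (int, str)):
        tweet_id = [tweet_id]
    if not isinstance(tweet_id, (list, tuple)):
        raise TypeError("tweet_id must be an ID or a list of IDs")

    ids = []
    for item in tweet_id:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(item, bool) or not isinstance(item, (int, str)):
            raise TypeError(f"invalid tweet ID: {item!r}")
        text = str(item)
        # Negative numbers and non-digit strings are not valid IDs.
        if not (text.isascii() and text.isdigit()):
            raise ValueError(f"tweet ID must be numeric: {item!r}")
        ids.append(text)
    return ids
```

With a helper like this, both get_poll_by_id(1239677278193438722) and get_poll_by_id(['1239677278193438722']) could be supported, provided the package calls it first.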

I am excited to see future developments on this package and looking forward to seeing the R version of tweepypoll as well. Once again, great job with this package!

mahsasarafrazi commented 2 years ago

Package Review

Estimated hours spent reviewing:

1 hour

Review Comments

General Comments

Great job. The package is useful and can give insight. I appreciated the note stating that you assume the user has a Twitter API Developer account and a bearer token, along with the link for those who are not familiar with the token. The package installed on my system with pip install tweepypoll without any errors. I liked your rationale about the usefulness of this package and its place in the Python ecosystem, as it seems that existing Python packages that perform tweet text analysis and sentiment analysis do not offer poll analysis. Good job!

Issues (documentation, functions and tests)

Documentation:

  1. In the README file, you can add badges such as a CI/CD badge or a license badge, and also the link to the pytweet package. This makes your README look more professional.

  2. There are 9 branches in your repo. Once you are done with a branch and have merged it into main, I suggest deleting it.

  3. The functions are fine, but the documentation is somewhat confusing. In get_poll_by_id, the parameter is documented as a string, but it is actually a list. The list should contain numeric IDs, yet the example passes strings. You could revise the documentation to avoid mistakes.

Tests:

  1. Your tests should cover some edge cases, but these are not currently addressed. I recommend adding more complex tests.
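
To illustrate the kind of edge cases meant here, the sketch below tests a stand-in validator. validate_tweet_ids is hypothetical and only mirrors the documented requirement that the input be a list of numeric IDs; real tests would call the package's own functions instead. The test functions use plain assert statements, so a runner such as pytest could collect them.

```python
def validate_tweet_ids(tweet_ids):
    """Stand-in validator (hypothetical; not tweepypoll's code)."""
    message = "input tweet_id must be a list of numeric IDs"
    if not isinstance(tweet_ids, list):
        raise TypeError(message)
    for item in tweet_ids:
        if not str(item).isdigit():
            raise TypeError(message)
    return True


def raises_type_error(value):
    """Return True if validating `value` raises TypeError."""
    try:
        validate_tweet_ids(value)
    except TypeError:
        return True
    return False


def test_rejects_bare_number():
    # A single ID that is not wrapped in a list.
    assert raises_type_error(1239677278193438722)


def test_rejects_non_numeric_entry():
    assert raises_type_error(["not-an-id"])


def test_accepts_empty_list():
    # Whether an empty list should pass is a design decision;
    # this stand-in accepts it.
    assert validate_tweet_ids([]) is True


def test_accepts_duplicate_ids():
    assert validate_tweet_ids(["1484375486473986049"] * 2) is True
```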

Functions:

  1. The Twitter bearer token is confidential, but you write it directly in the function call and use it. I think there are ways to keep it more secure, such as GitHub secrets.
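
One common way to keep a token out of source code is to read it from an environment variable. The sketch below assumes a variable named TWITTER_BEARER_TOKEN and a helper named load_bearer_token; both names are made up for illustration and are not part of tweepypoll.

```python
import os


def load_bearer_token(env_var="TWITTER_BEARER_TOKEN"):
    """Read the bearer token from an environment variable.

    The variable name is an assumption made for this sketch.
    """
    token = os.environ.get(env_var)
    if not token:
        # Fail early with a clear message instead of sending an
        # unauthenticated request.
        raise RuntimeError(
            f"Set the {env_var} environment variable before calling the API."
        )
    return token
```

In a GitHub Actions workflow, a repository secret can be exposed to the job as an environment variable, so the same function would work locally and in CI.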

BooleanJulien commented 2 years ago

Package Review

Estimated hours spent reviewing:

1 hr

Review Comments

Hi guys, great work on this package. I can really see this making it easy to automate the process of parsing Twitter data! It's very user-friendly, and I enjoyed testing it with some soccer-related Twitter users' polls even though I don't use Twitter.

Some constructive feedback I have:

Luming-ubc commented 2 years ago

Package Review

Estimated hours spent reviewing: 1 hour


Review Comments

I enjoy this package idea, and I think it is very creative and could be of great help to visualize the poll results especially for someone who checks for poll results frequently. Here are some observations from my review and I hope these will help with further improvement of the package.

  1. In the usage section of the README file, it would be better to include real, runnable examples instead of leaving the user to figure out an appropriate input for the function. For example, get_polls_from_user('ElonMusk') is better than get_polls_from_user('username'). However, running get_polls_from_user('ElonMusk') as instructed in the descriptions returns an empty list, so 'ElonMusk' might not be a good example to illustrate the purpose of the function.
  2. The usage for the second function in the README file is also somewhat confusing. Again, it would be more convenient for the user if the usage were runnable code rather than a generic placeholder for the input variable. Also, tweet_id is documented as a numeric type, while in reality it should be a list of numeric values. A user following the instructions would run into a type error.
  3. Again, the example for the third function is not runnable. If it takes the output of the previous function, it might be a good idea to provide a complete code section in which every example is runnable. The input and output types could be documented in the functions section of the README, or in a separate documentation file or website.
  4. I believe one issue has been left open unintentionally.
  5. Although I noticed there is an example.ipynb file in the docs directory, it is not mentioned anywhere in the README file, so it might not be easy to find. Also, I did not find any links to the vignettes. It might be helpful to include links to the documentation or examples in the README file.
  6. The example usage in the example.ipynb file under the docs directory seems out of date. The output of the code that checks the package version is still 0.1.0, which is a previously released version. The visualizations in the example usage are not rendered.
  7. The line poll = get_poll_by_id(1239677278193438722) in example.ipynb needs to be updated to poll = get_poll_by_id([1239677278193438722]). Again, the input should be a list of numeric IDs, not a single number.
  8. In the function-level documentation (docstrings), there is an inconsistency. The input parameter of the get_poll_by_id function is documented as str, while the example shows a list of strings: get_poll_by_id(['1484375486473986049','1484375486473986049'])
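
For point 8, a docstring whose declared parameter type matches its own example might look like the sketch below. get_poll_by_id_stub is a stand-in written for this illustration: it is not the package's function, and its return value is a placeholder rather than real poll data.

```python
def get_poll_by_id_stub(tweet_ids):
    """Stand-in whose documented type and example agree.

    Parameters
    ----------
    tweet_ids : list of str
        Tweet IDs, each given as a numeric string.

    Returns
    -------
    int
        Number of IDs received (placeholder for the real poll data).

    Examples
    --------
    >>> get_poll_by_id_stub(['1484375486473986049'])
    1
    """
    return len(tweet_ids)
```

Because the example is written in doctest format, it can be executed with Python's built-in doctest module (for instance, python -m doctest on the file), which helps keep the documented usage from drifting away from the code.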