Submission Group 13: tweepypoll (Python)

Submitting Author: Wenxin Xiang (@xiangwxt), Rada Rudyak (@Radascript), Linh Giang Nguyen (@gn385x)
Package Name: tweepypoll
One-Line Description of Package: A Python package that allows users to extract and visualize poll data!
Repository Link: https://github.com/UBC-MDS/tweepypoll
Version submitted: v2.0.0
Editor: TBD
Reviewer 1: Luming Yang
Reviewer 2: Masha Sarafrazi
Reviewer 3: Julien Gordon
Reviewer 4: Arlin Cherian
Archive: TBD
Version accepted: TBD

Description

tweepypoll is a Python package that allows users to extract and visualize poll data (poll questions, poll options, poll responses, etc.) from Twitter. Our goal is to make tweepypoll helpful and user-friendly; any Python beginner can effectively gain access to the data and make their own data-driven decisions.

Scope

Please indicate which category or categories this package falls under:

[x] Data retrieval

[ ] Data extraction

[ ] Data munging

[ ] Data deposition

[ ] Reproducibility

[ ] Geospatial

[ ] Education

[x] Data visualization*

Please fill out a pre-submission inquiry before submitting a data visualization package. For more info, see notes on categories of our guidebook.

Explain how and why the package falls under these categories (briefly, 1-2 sentences):

This package is designed to retrieve poll data from Twitter. Its functions access and download data from online sources through the Twitter API, and the package visualizes the data it retrieves.

Who is the target audience and what are scientific applications of this package?

The package is intended for people interested in studying social media polls and social media interaction. It is also a convenient tool for Python beginners who want to access poll data and make their own data-driven decisions.

Are there other Python packages that accomplish the same thing? If so, how does yours differ?

There are existing Python packages that offer similar functionality for tweets from Twitter. For example, pytweet is a package that helps extract tweets, visualize users' tweet-posting habits, and apply sentiment analysis to the data. However, there are no available packages that work specifically with Twitter polls.

If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted:

Technical checks

For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:

[x] does not violate the Terms of Service of any service it interacts with.

[x] has an OSI approved license.

[x] contains a README with instructions for installing the development version.

[x] includes documentation with examples for all functions.

[x] contains a vignette with examples of its essential functions and uses.

[x] has a test suite.

[x] has continuous integration, such as Travis CI, AppVeyor, CircleCI, and/or others.

Publication options

[ ] Do you wish to automatically submit to the Journal of Open Source Software? If so:

JOSS Checks

Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?

This option will allow reviewers to open smaller issues that can then be linked to PRs rather than submitting a denser text-based review. It will also allow you to demonstrate addressing the issue via PR links.

[x] Yes I am OK with reviewers submitting requested changes as issues to my repo. Reviewers will then link to the issues in their submitted review.

Code of conduct

[x] I agree to abide by pyOpenSci's Code of Conduct during the review process and in maintaining my package should it be accepted.

P.S. *Have feedback/comments about our review process? Leave a comment here

Editor and Review Templates

Editor and review templates can be found here

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

[x] As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

[x] A statement of need clearly stating problems the software is designed to solve and its target audience in README
[x] Installation instructions: for the development version of the package and any non-standard dependencies in README
[ ] Vignette(s) demonstrating major functionality that runs successfully locally
[x] Function Documentation: for all user-facing functions
[x] Examples for all user-facing functions
[x] Community guidelines including contribution guidelines in the README or CONTRIBUTING.
[ ] Metadata including author(s), author e-mail(s), a url, and any other relevant metadata e.g., in a setup.py file or elsewhere.

Readme requirements The package meets the readme requirements below:

[x] Package has a README.md file in the root directory.

The README should include, from top to bottom:

[x] The package name
[ ] Badges for continuous integration and test coverage, a repostatus.org badge, and any other badges. If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the badge for pyOpenSci peer-review will be provided upon acceptance.)
[x] Short description of goals of package, with descriptive links to all vignettes (rendered, i.e. readable, cf the documentation website section) unless the package is small and there’s only one vignette repeating the README.
[x] Installation instructions
[x] Any additional setup required (authentication tokens, etc)
[x] Brief demonstration usage
[ ] Direction to more detailed documentation (e.g. your documentation files or website).
[x] If applicable, how the package compares to other similar packages and/or how it relates to other packages
[ ] Citation information

Usability

Reviewers are encouraged to submit suggestions (or pull requests) that will improve the usability of the package as a whole. Package structure should follow general community best-practices. In general please consider:

[x] The documentation is easy to find and understand
[x] The need for the package is clear
[x] All functions have documentation and associated examples for use

Functionality

[x] Installation: Installation succeeds as documented.
[x] Functionality: Any functional claims of the software have been confirmed.
[x] Performance: Any performance claims of the software have been confirmed.
[x] Automated tests: Tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
[x] Continuous Integration: Has continuous integration, such as Travis CI, AppVeyor, CircleCI, and/or others.
[x] Packaging guidelines: The package conforms to the pyOpenSci packaging guidelines.

For packages co-submitting to JOSS

[ ] The package has an obvious research application according to JOSS's definition in their submission requirements.

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

[ ] A short summary describing the high-level functionality of the software
[ ] Authors: A list of authors with their affiliations
[ ] A statement of need clearly stating problems the software is designed to solve and its target audience.
[ ] References: with DOIs for all those that have one (e.g. papers, datasets, software).

Final approval (post-review)

[ ] The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing:

~ 1 hour

Review Comments

Great job team! This seems like a very interesting package and is definitely valuable to the intended audience as you have mentioned in your documentation. I hope you have used this package to find some interesting analytics from Twitter polls. Overall you have done an excellent job of creating this package. The documentation seems to be clear and concise on the usage of the functions and I was able to follow it well.

Further comments:

A few components are missing from the README file:
- Add a URL to the pytweet package so users can refer to the related packages you have mentioned. This is why I haven't checked off the citation requirement above.
- The Read the Docs documentation link is not mentioned in the README file. This is why this requirement was also not checked off on the list.
- You could also add code coverage, Read the Docs, and CI/CD badges to your README to show that these checks are passing. The markdown code for all of these is available on their respective websites or in the GitHub settings.

Minor suggestions:

The short form “dicts” is used to refer to poll_obj; it could be written out in full.
A period is missing after the word ‘respectively’ in the usage section.
Use consistent formatting when mentioning the package functions, e.g. always get_poll_by_id or always get_poll_by_id().

Function get_polls_from_user:
- In the example usage file you mention that the second argument of the function defaults to 5, whereas in your code file this value is given as 10.
Functions get_poll_by_id(id), visualize_poll:
- I cloned your repo and tried to run the examples in your docs folder. However, I keep getting the error Invalid argument type: input tweet_id must be a list of numeric IDs. when running this function. The same error appears on the example usage page on Read the Docs: https://tweepypoll.readthedocs.io/en/latest/example.html . Solution: when I checked the documentation I realized the data type for the ID is string, so changing get_poll_by_id(1239677278193438722) to get_poll_by_id(['1239677278193438722']) should do it!
- Again, an exception was raised by the third function: Exception: The type of the argument 'poll_obj' mush be a dictionary. There also appears to be a typo in the exception message: "mush" should be "must".
- I could be missing something here; if so, please advise!
Consistency in the naming of the package:
- I noticed that the package name is all lower case on GitHub. However, in the contributing file it is written with an uppercase T, Tweepypoll. I would suggest keeping the name consistent.
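One way to avoid the type error described above is to accept either a single ID or a list and normalize the input before it is used. The following is only a sketch of that idea: normalize_tweet_ids is a hypothetical helper, not part of tweepypoll, and the real get_poll_by_id may validate its input differently.

```python
from typing import Iterable, List, Union

TweetId = Union[int, str]


def normalize_tweet_ids(tweet_id: Union[TweetId, Iterable[TweetId]]) -> List[str]:
    """Return tweet IDs as a list of digit-only strings.

    Hypothetical helper for illustration; not part of tweepypoll.
    """
    # A bare int or str is treated as a one-element list.
    if isinstance(tweet_id, (int, str)):
        tweet_id = [tweet_id]

    ids = []
    for item in tweet_id:
        text = str(item)
        is_valid = (
            isinstance(item, (int, str))
            and not isinstance(item, bool)  # bool is a subclass of int
            and text.isascii()
            and text.isdigit()
        )
        if not is_valid:
            raise TypeError(
                "tweet_id must be a numeric ID or a list of numeric IDs, "
                f"got {item!r}"
            )
        ids.append(text)
    return ids
```

With a helper like this, both get_poll_by_id(1239677278193438722) and get_poll_by_id(['1239677278193438722']) could be handled the same way, and the docstring would only need to document one normalized form.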

I am excited to see future developments on this package and looking forward to seeing the R version of tweepypoll as well. Once again, great job with this package!

Package Review

[x] As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

[x] A statement of need clearly stating problems the software is designed to solve and its target audience in README
[x] Installation instructions: for the development version of package and any non-standard dependencies in README
[ ] Vignette(s) demonstrating major functionality that runs successfully locally
[x] Function Documentation: for all user-facing functions
[x] Examples for all user-facing functions
[x] Community guidelines including contribution guidelines in the README or CONTRIBUTING.
[ ] Metadata including author(s), author e-mail(s), a url, and any other relevant metadata e.g., in a setup.py file or elsewhere.

Readme requirements The package meets the readme requirements below:

[x] Package has a README.md file in the root directory.

The README should include, from top to bottom:

[x] The package name
[ ] Badges for continuous integration and test coverage, a repostatus.org badge, and any other badges. If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the badge for pyOpenSci peer-review will be provided upon acceptance.)
[x] Short description of goals of the package, with descriptive links to all vignettes (rendered, i.e. readable, cf the documentation website section) unless the package is small and there’s only one vignette repeating the README.
[x] Installation instructions
[x] Any additional setup required (authentication tokens, etc)
[x] Brief demonstration usage
[ ] Direction to more detailed documentation (e.g. your documentation files or website).
[x] If applicable, how the package compares to other similar packages and/or how it relates to other packages
[ ] Citation information

Usability

[x] The documentation is easy to find and understand
[x] The need for the package is clear
[x] All functions have documentation and associated examples for use

Functionality

[x] Installation: Installation succeeds as documented.
[x] Functionality: Any functional claims of the software have been confirmed.
[x] Performance: Any performance claims of the software have been confirmed.
[x] Automated tests: Tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
[x] Continuous Integration: Has continuous integration, such as Travis CI, AppVeyor, CircleCI, and/or others.
[x] Packaging guidelines: The package conforms to the pyOpenSci packaging guidelines.

For packages co-submitting to JOSS

[ ] The package has an obvious research application according to JOSS's definition in their submission requirements.

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

[ ] A short summary describing the high-level functionality of the software
[ ] Authors: A list of authors with their affiliations
[ ] A statement of need clearly stating problems the software is designed to solve and its target audience.
[ ] References: with DOIs for all those that have one (e.g. papers, datasets, software).

Final approval (post-review)

[ ] The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing:

1 hour

Review Comments

General Comments

Great job. The package is useful and can give insight. It was good to see that you included a note stating that the package assumes the user has a Twitter API Developer account and a bearer token, with a link for those who are not familiar with the token. It installed on my system with pip install tweepypoll without any errors. I liked your rationale about the usefulness of this package and its place in the Python ecosystem, as it seems that existing Python packages that perform tweet text analysis and sentiment analysis do not cover poll analysis. Good job!

Issues (documentation, functions and tests)

Documentation:

In the README file you can add badges such as a CI/CD badge or a license badge, and also a link to the pytweet package. This would make your README look more professional.
There are 9 branches in your repo. Once you are done with a branch and have merged it into main, I suggest deleting it.
The functions are OK, but the documentation is somewhat confusing. In get_poll_by_id you define the parameter as a string, but it is actually a list. The list should be numeric, but in the example you pass strings. You could fix this to avoid mistakes.

Tests:

There should be some edge cases in your tests, but they are not addressed. I recommend adding more complicated tests.
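As an illustration of the kind of edge cases meant here, the sketch below tests the argument checks for a user-lookup function. validate_user_args is a hypothetical stand-in written for this example; it is not tweepypoll's code, and the default of 10 and the choice of exception types are assumptions.

```python
import unittest


def validate_user_args(username, tweet_num=10):
    """Hypothetical stand-in for the argument checks of a user-lookup function."""
    if not isinstance(username, str):
        raise TypeError("username must be a string")
    name = username.strip().lstrip("@")
    if not name:
        raise ValueError("username must not be empty")
    # bool is a subclass of int, so it is rejected explicitly.
    if isinstance(tweet_num, bool) or not isinstance(tweet_num, int):
        raise TypeError("tweet_num must be an integer")
    if tweet_num < 1:
        raise ValueError("tweet_num must be at least 1")
    return name, tweet_num


class ValidateUserArgsEdgeCases(unittest.TestCase):
    def test_leading_at_sign_is_stripped(self):
        self.assertEqual(validate_user_args("@ElonMusk"), ("ElonMusk", 10))

    def test_empty_or_blank_username(self):
        for bad in ("", "   ", "@"):
            with self.subTest(username=bad):
                with self.assertRaises(ValueError):
                    validate_user_args(bad)

    def test_non_string_username(self):
        for bad in (123, None, ["ElonMusk"]):
            with self.subTest(username=bad):
                with self.assertRaises(TypeError):
                    validate_user_args(bad)

    def test_non_positive_tweet_num(self):
        for bad in (0, -1):
            with self.subTest(tweet_num=bad):
                with self.assertRaises(ValueError):
                    validate_user_args("ElonMusk", bad)

    def test_non_integer_tweet_num(self):
        for bad in (2.5, "5", True):
            with self.subTest(tweet_num=bad):
                with self.assertRaises(TypeError):
                    validate_user_args("ElonMusk", bad)
```

Saved in a test module, these cases can be run with python -m unittest. Similar cases for a user with no polls, or for a tweet ID that does not contain a poll, would exercise the API-facing paths.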

Functions:

The Twitter bearer token is confidential, but you write it directly in the function call and use it. I think there might be ways to keep it safer and more secure, such as GitHub secrets.
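One common option is to read the token from an environment variable instead of writing it in the code. The sketch below assumes a variable named TWITTER_BEARER_TOKEN; both that name and the load_bearer_token helper are hypothetical and not part of tweepypoll.

```python
import os


def load_bearer_token(env_var: str = "TWITTER_BEARER_TOKEN") -> str:
    """Read the bearer token from the environment instead of hard-coding it.

    The variable name is an assumption made for this example.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Twitter bearer token."
        )
    return token
```

In continuous integration, the same variable could be populated from a GitHub Actions secret, so the token never appears in the repository or in the example notebooks.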

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

[x] As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

[x] A statement of need clearly stating problems the software is designed to solve and its target audience in README
[x] Installation instructions: for the development version of package and any non-standard dependencies in README
[x] Vignette(s) demonstrating major functionality that runs successfully locally
[x] Function Documentation: for all user-facing functions
[x] Examples for all user-facing functions
[x] Community guidelines including contribution guidelines in the README or CONTRIBUTING.
[ ] Metadata including author(s), author e-mail(s), a url, and any other relevant metadata e.g., in a setup.py file or elsewhere.

Readme requirements The package meets the readme requirements below:

[x] Package has a README.md file in the root directory.

The README should include, from top to bottom:

[x] The package name
[ ] Badges for continuous integration and test coverage, a repostatus.org badge, and any other badges. If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the badge for pyOpenSci peer-review will be provided upon acceptance.)
[ ] Short description of goals of package, with descriptive links to all vignettes (rendered, i.e. readable, cf the documentation website section) unless the package is small and there’s only one vignette repeating the README.
[x] Installation instructions
[x] Any additional setup required (authentication tokens, etc)
[x] Brief demonstration usage
[ ] Direction to more detailed documentation (e.g. your documentation files or website).
[x] If applicable, how the package compares to other similar packages and/or how it relates to other packages
[ ] Citation information

Usability

[x] The documentation is easy to find and understand
[x] The need for the package is clear
[x] All functions have documentation and associated examples for use

Functionality

[x] Installation: Installation succeeds as documented.
[x] Functionality: Any functional claims of the software have been confirmed.
[x] Performance: Any performance claims of the software have been confirmed.
[x] Automated tests: Tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
[x] Continuous Integration: Has continuous integration, such as Travis CI, AppVeyor, CircleCI, and/or others.
[x] Packaging guidelines: The package conforms to the pyOpenSci packaging guidelines.

Final approval (post-review)

[ ] The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing:

1 hr

Review Comments

Hi guys, great work on this package. I can really see this making it so easy to automate the process of parsing Twitter data! It's very user-friendly and I enjoyed testing it with some soccer-related Twitter users' polls even though I don't even use Twitter.

Some constructive feedback I have:

The statement of need could go a bit further in highlighting the value of the package. I would mention that it can increase the reproducibility of social media analytics and the efficiency of batch processing Twitter polls.
(Previously mentioned by other reviewers) You should probably include a link to your Read the Docs page in the README to highlight your vignette: https://tweepypoll.readthedocs.io/en/latest/
The branches of your project could use a naming convention for clarity and ease of use if someone new wants to contribute, e.g. starting them with "feature/" or "bug-fix/".
(Previously mentioned by other reviewers) Your documentation contains an error; this should be fixed.
(Previously mentioned by other reviewers) Many of your functions expect lists as inputs, but the documentation and usage examples provide ints.
It would be nice to have some functionality that inserts '\n' (or uses a similar method) into the poll title at a certain character limit, so that the title wraps automatically instead of extending to the right. Even in the documentation example the title is awkwardly long.
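A minimal sketch of the title wrapping suggested above, using Python's standard textwrap module. wrap_title is a hypothetical helper and the 40-character default is an arbitrary choice; how the wrapped title is passed to the chart depends on the plotting library the package uses.

```python
import textwrap


def wrap_title(title: str, width: int = 40) -> str:
    """Insert line breaks so that no line of the title exceeds `width` characters.

    Hypothetical helper for illustration; not part of tweepypoll.
    """
    return textwrap.fill(title, width=width)
```

Some plotting libraries expect a multi-line title as a list of lines rather than a string containing '\n'; in that case wrap_title(title).splitlines() (or textwrap.wrap directly) gives the same breaks as a list.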

Package Review

Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide

[x] As the reviewer I confirm that there are no conflicts of interest for me to review this work (If you are unsure whether you are in conflict, please speak to your editor before starting your review).

Documentation

The package includes all the following forms of documentation:

[x] A statement of need clearly stating problems the software is designed to solve and its target audience in README
[x] Installation instructions: for the development version of package and any non-standard dependencies in README
[x] Vignette(s) demonstrating major functionality that runs successfully locally
[x] Function Documentation: for all user-facing functions
[x] Examples for all user-facing functions
[x] Community guidelines including contribution guidelines in the README or CONTRIBUTING.
[x] Metadata including author(s), author e-mail(s), a url, and any other relevant metadata e.g., in a setup.py file or elsewhere.

Readme requirements The package meets the readme requirements below:

[x] Package has a README.md file in the root directory.

The README should include, from top to bottom:

[x] The package name
[ ] Badges for continuous integration and test coverage, a repostatus.org badge, and any other badges. If the README has many more badges, you might want to consider using a table for badges: see this example. Such a table should be more wide than high. (Note that the badge for pyOpenSci peer-review will be provided upon acceptance.)
[ ] Short description of goals of package, with descriptive links to all vignettes (rendered, i.e. readable, cf the documentation website section) unless the package is small and there’s only one vignette repeating the README.
[x] Installation instructions
[ ] Any additional setup required (authentication tokens, etc)
[x] Brief demonstration usage
[ ] Direction to more detailed documentation (e.g. your documentation files or website).
[ ] If applicable, how the package compares to other similar packages and/or how it relates to other packages
[x] Citation information

Usability

[x] The documentation is easy to find and understand
[x] The need for the package is clear
[x] All functions have documentation and associated examples for use

Functionality

[x] Installation: Installation succeeds as documented.
[x] Functionality: Any functional claims of the software have been confirmed.
[x] Performance: Any performance claims of the software have been confirmed.
[x] Automated tests: Tests cover essential functions of the package and a reasonable range of inputs and conditions. All tests pass on the local machine.
[x] Continuous Integration: Has continuous integration, such as Travis CI, AppVeyor, CircleCI, and/or others.
[x] Packaging guidelines: The package conforms to the pyOpenSci packaging guidelines.

For packages co-submitting to JOSS

[ ] The package has an obvious research application according to JOSS's definition in their submission requirements.

Note: Be sure to check this carefully, as JOSS's submission requirements and scope differ from pyOpenSci's in terms of what types of packages are accepted.

The package contains a paper.md matching JOSS's requirements with:

[ ] A short summary describing the high-level functionality of the software
[ ] Authors: A list of authors with their affiliations
[ ] A statement of need clearly stating problems the software is designed to solve and its target audience.
[ ] References: with DOIs for all those that have one (e.g. papers, datasets, software).

Final approval (post-review)

[ ] The author has responded to my review and made changes to my satisfaction. I recommend approving this package.

Estimated hours spent reviewing: 1 hour

Review Comments

I like the idea behind this package. I think it is very creative and could be of great help in visualizing poll results, especially for someone who checks poll results frequently. Here are some observations from my review; I hope they help with further improvement of the package.

In the usage section of the README file, it would be better to include real, runnable examples instead of leaving the user to figure out an appropriate input for the function. For example, get_polls_from_user('ElonMusk') is better than get_polls_from_user('username'). However, running get_polls_from_user('ElonMusk') as instructed in the description returns an empty list, so 'ElonMusk' might not be a good example to demonstrate the purpose of the function.
The usage for the second function in the README file is also somewhat confusing. Again, it would be more convenient for the user if the usage were runnable code instead of containing a generic placeholder for the input variable. Also, tweet_id is documented as a numeric type, while in reality it should be a list of numeric values. The user would run into a type error when following the instructions.
Again, the example for the third function is not runnable. If it takes the output of the previous function, it might be a good idea to provide a complete code section in which every example is runnable. The input and output types could be documented in the functions section of the README, or in a separate documentation file or website.
I believe one issue has been left open unintentionally.
Although I noticed there is an example.ipynb file in the docs directory, it is not mentioned anywhere in the README file, so it might not be easy to find. Also, I did not find any links to the vignettes. It might be helpful to include links to the documentation or examples in the README file.
The example usage in the example.ipynb file under the docs directory seems out of date. The output of the code that checks the package version is still 0.1.0, which is a previously released version. The visualizations in the example usage are not rendered.
The line poll = get_poll_by_id(1239677278193438722) in example.ipynb needs to be updated to poll = get_poll_by_id([1239677278193438722]). Again, the input should be a list of numeric IDs, not a single numeric ID.
The function-level documentation (docstrings) is inconsistent. The input parameter of the get_poll_by_id function is documented as str, while the example shows a list of strings: get_poll_by_id(['1484375486473986049','1484375486473986049'])

UBC-MDS / software-review-2022