pyOpenSci / software-submission

Submit your package for review by pyOpenSci here! If you have questions please post them here: https://pyopensci.discourse.group/

tweepyclean (Python) #33

Closed calsvein closed 3 years ago

calsvein commented 3 years ago

Submitting Author: Nash Makhija (@nashmakh), Matt (@MattTPin), Syad Khan (@syadk), Cal Schafer (@calsvein)
Package Name: tweepyclean
One-Line Description of Package: add-on functions to the tweepy package for Twitter data processing, word counts, and sentiment analysis
Repository Link: https://github.com/UBC-MDS/tweepyclean
Version submitted: https://github.com/UBC-MDS/tweepyclean/tree/0.3
Editor: Tiffany Timbers (@ttimbers)
Reviewer 1: TBD
Reviewer 2: TBD
Archive: TBD
Version accepted: TBD


Description

tweepyclean is a Python package that processes data generated by the existing Tweepy package: it produces clean data frames, summarizes data, and generates new features.

Tweepy is a package built around Twitter's API and is used to retrieve tweet information from Twitter's servers.

Our package provides functions that process the raw data from Tweepy into a more understandable format by extracting and organizing the contents of a user's tweets. tweepyclean is built specifically for analyzing a single user's timeline (generated using tweepy's api.user_timeline function). Users can visualize average engagement by the time of day a tweet was posted, view basic summary statistics of word contents and tweet sentiment, and obtain a processed dataset for use in machine learning models.
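To make the kind of processing described above concrete, here is a minimal sketch using only the Python standard library. It is not tweepyclean's actual API: the function names `flatten_tweets` and `mean_engagement_by_hour` are hypothetical, the tweets are plain dicts standing in for Tweepy status objects, and "engagement" is assumed to mean likes plus retweets.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def flatten_tweets(tweets):
    """Turn raw tweet records into flat rows with a few derived features.

    Hypothetical helper: each tweet is assumed to be a dict with the keys
    created_at, text, favorite_count and retweet_count.
    """
    rows = []
    for tweet in tweets:
        words = tweet["text"].split()
        rows.append({
            "created_at": tweet["created_at"],
            "text": tweet["text"],
            "word_count": len(words),
            "hashtags": [w for w in words if w.startswith("#")],
            # Assumption: engagement is likes plus retweets.
            "engagement": tweet["favorite_count"] + tweet["retweet_count"],
        })
    return rows


def mean_engagement_by_hour(rows):
    """Average the engagement of the rows, grouped by hour of day posted."""
    by_hour = defaultdict(list)
    for row in rows:
        by_hour[row["created_at"].hour].append(row["engagement"])
    return {hour: mean(values) for hour, values in sorted(by_hour.items())}


# Made-up sample data for illustration only.
sample_tweets = [
    {"created_at": datetime(2021, 3, 1, 9, 15), "text": "Hello #python world",
     "favorite_count": 4, "retweet_count": 2},
    {"created_at": datetime(2021, 3, 1, 9, 45), "text": "Another morning post",
     "favorite_count": 1, "retweet_count": 1},
    {"created_at": datetime(2021, 3, 2, 18, 5), "text": "Evening #data #science thoughts",
     "favorite_count": 10, "retweet_count": 0},
]

rows = flatten_tweets(sample_tweets)
summary = mean_engagement_by_hour(rows)
```

In practice the flat rows would likely be loaded into a data frame; the sketch keeps them as dicts so that it runs without third-party dependencies.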

Scope

* Please fill out a pre-submission inquiry before submitting a data visualization package. For more info, see notes on categories of our guidebook.

The tweepy package extracts tweet data, but not in a format that is ready for analysis. tweepyclean converts that data into a clean dataframe, performs feature engineering, and creates summary statistics and basic visualizations.

The package is intended strictly for those who are already using the tweepy package and have a Twitter API key.

Not that we are aware of.

Technical checks

For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:

Publication options

JOSS Checks

- [ ] The package has an **obvious research application** according to JOSS's definition in their [submission requirements][JossSubmissionRequirements]. Be aware that completing the pyOpenSci review process **does not** guarantee acceptance to JOSS. Be sure to read their submission requirements (linked above) if you are interested in submitting to JOSS.
- [ ] The package is not a "minor utility" as defined by JOSS's [submission requirements][JossSubmissionRequirements]: "Minor ‘utility’ packages, including ‘thin’ API clients, are not acceptable." pyOpenSci welcomes these packages under "Data Retrieval", but JOSS has slightly different criteria.
- [ ] The package contains a `paper.md` matching [JOSS's requirements][JossPaperRequirements] with a high-level description in the package root or in `inst/`.
- [ ] The package is deposited in a long-term repository with the DOI:

*Note: Do not submit your package separately to JOSS*

Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?

This option will allow reviewers to open smaller issues that can then be linked to PRs rather than submitting a denser text-based review. It will also allow you to demonstrate addressing the issue via PR links.

Code of conduct

P.S. Have feedback/comments about our review process? Leave a comment here

Editor and Review Templates

Editor and review templates can be found here

calsvein commented 3 years ago

Please disregard this submission