Information on how these tools work is provided in the corresponding section of the README.
Directory creation methods have been overhauled - the analytics directory now sorts exported files into their respective scrape types. For example, if you were to generate word frequencies for a Subreddit scrape, the JSON file will be saved in the analytics/frequencies/subreddits directory.
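As a rough illustration of the new layout, the export directory for an analytical tool could be assembled as shown below. This is only a sketch: the helper name `analytics_dir` and its parameters are hypothetical and are not taken from the URS source. Only the `analytics/frequencies/subreddits` layout comes from the description above.

```python
from pathlib import PurePosixPath


def analytics_dir(tool: str, scrape_type: str, root: str = "analytics") -> PurePosixPath:
    """Build the export directory for an analytical tool and a scrape type.

    Hypothetical helper that mirrors the layout described above; it is not
    part of the URS API.
    """
    return PurePosixPath(root) / tool / scrape_type


print(analytics_dir("frequencies", "subreddits"))  # analytics/frequencies/subreddits
```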
This release also fulfills a few enhancement requests and fixes a few reported bugs, which are listed in the Issue Fix or Enhancement Request section.
A livestream demo GIF will be added to the demo-gifs branch shortly after this branch is merged.
Motivation/Context
I came across the stream() method for Subreddits and Redditors when I was digging through the PRAW docs a few months ago. I think it is a very cool feature and something that I should add into this project to, again, enhance its capabilities.
New Dependencies
None
Issue Fix or Enhancement Request
Fixes #29.
Bug: Beginners do not know about the PYTHONPATH.
Fix: Added an "Installation" section at the top of the README.
Also included a "Troubleshooting" section describing the ModuleNotFoundError raised if the URS directory is not on the PYTHONPATH.
Fulfills #34.
Enhancement Request: If the submission contains a gallery, include that information in submission scrapes.
Enhancement: The submission comments scraper will pull gallery data by checking if the submission contains a gallery. If it does, the data contained in the submission's gallery_data and media_metadata fields will be written to the submission_metadata field in the exported JSON file.
Fixes a KeyError exception that is raised when trying to run analytical tools against Redditor scrapes.
Bug: moderated objects within the interactions field do not include the type field. The value assigned to type determines which target fields to extract text from, and the field is present in all other Redditor interactions. Error handling was not implemented for moderated objects, which would cause URS to break.
Fix: Added additional error handling to account for moderated objects.
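The kind of error handling described for moderated objects might look roughly like the sketch below. The mapping `TEXT_FIELDS`, the function `extract_text`, and the field names `body`, `title`, and `selftext` are assumptions made for illustration; the actual URS implementation may differ.

```python
# Hypothetical mapping from an interaction's "type" to the fields holding text.
TEXT_FIELDS = {
    "comment": ("body",),
    "submission": ("title", "selftext"),
}


def extract_text(interaction: dict) -> list:
    """Collect text from one interaction object.

    Objects without a "type" key (such as moderated objects) or with an
    unrecognized type yield no text instead of raising a KeyError.
    """
    try:
        fields = TEXT_FIELDS[interaction["type"]]
    except KeyError:
        return []
    # Skip target fields that are missing or empty.
    return [interaction[field] for field in fields if interaction.get(field)]
```

Returning an empty list keeps the calling loop simple, since every interaction can be processed the same way regardless of whether it carries a type.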
Type of Change
[x] Bug Fix (non-breaking change which fixes an issue)
[x] Code Refactor
[x] New Feature (non-breaking change which adds functionality)
[x] This change requires a documentation update
Breaking Change
N/A
List All Changes That Have Been Made
Added
User interface
Added livestream scraper flags:
-lr - livestream a Subreddit
-lu - livestream a Redditor
Added a livestream control flag to limit the stream exclusively to submissions (the default is streaming comments):
--stream-submissions
Added a flag -v/--version to display the version number.
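For readers unfamiliar with how such flags are typically declared, here is a minimal `argparse` sketch. It assumes that `-lr` and `-lu` each take a single name as their argument; the `dest` names, help strings, and the `build_parser` function are hypothetical and not copied from URS.

```python
import argparse

# Placeholder version string; the PR single-sources the real one in Version.py.
__version__ = "3.3.0"


def build_parser() -> argparse.ArgumentParser:
    """Declare the livestream flags listed above (illustrative only)."""
    parser = argparse.ArgumentParser(prog="Urs.py")
    parser.add_argument("-lr", dest="live_subreddit", metavar="SUBREDDIT",
                        help="livestream a Subreddit")
    parser.add_argument("-lu", dest="live_redditor", metavar="REDDITOR",
                        help="livestream a Redditor")
    parser.add_argument("--stream-submissions", action="store_true",
                        help="stream submissions instead of comments")
    parser.add_argument("-v", "--version", action="version",
                        version=f"%(prog)s {__version__}")
    return parser


args = build_parser().parse_args(["-lr", "askreddit", "--stream-submissions"])
print(args.live_subreddit, args.stream_submissions)  # askreddit True
```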
Source code
Added a new sub-module live_scrapers within praw_scrapers for livestream functionality:
Livestream.py
utils/DisplayStream.py
utils/StreamGenerator.py
Added a new file Version.py to single-source the package version.
Added a gallery_data and media_metadata check in Comments.py, which includes the above fields if the submission contains a gallery.
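A check of this kind could be sketched as follows. The function name `add_gallery_data` is hypothetical, and the stand-in object only mimics the two attributes named above; how URS actually detects a gallery is not stated in this description.

```python
from types import SimpleNamespace


def add_gallery_data(submission, submission_metadata: dict) -> dict:
    """Copy gallery fields into the metadata dict when the submission has them.

    Hypothetical helper: it looks only at the gallery_data and media_metadata
    attributes and leaves the dict untouched if either one is missing.
    """
    gallery_data = getattr(submission, "gallery_data", None)
    media_metadata = getattr(submission, "media_metadata", None)
    if gallery_data is not None and media_metadata is not None:
        submission_metadata["gallery_data"] = gallery_data
        submission_metadata["media_metadata"] = media_metadata
    return submission_metadata


# Stand-in objects with placeholder values; a real run would pass a PRAW submission.
with_gallery = SimpleNamespace(gallery_data={"items": []}, media_metadata={})
without_gallery = SimpleNamespace()

print(add_gallery_data(with_gallery, {"title": "example"}))
print(add_gallery_data(without_gallery, {"title": "example"}))
```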
README
Added a new "Installation" section with updated installation procedures.
Added a new section "Livestreaming Subreddits and Redditors" with sub-sections containing details for each flag.
Updated the Table of Contents accordingly.
Tests
Added additional unit tests for the live_scrapers module. These tests are located in tests/test_praw_scrapers/test_live_scrapers.
Overview
Summary
This is a major release.
URS v3.3.0 introduces a new suite of tools - the ability to livestream comments or submissions submitted in a Subreddit or by a Redditor.
The new flags for livestreaming are -lr (livestream a Subreddit) and -lu (livestream a Redditor).
Unit test files added for the live_scrapers module:
tests/test_praw_scrapers/test_live_scrapers/test_Livestream.py
tests/test_praw_scrapers/test_live_scrapers/test_utils/test_DisplayStream.py
tests/test_praw_scrapers/test_live_scrapers/test_utils/test_StreamGenerator.py
The Forest.md
Changed
[...] praw_scrapers module:
[...] static_scrapers sub-module:
Basic.py
Comments.py
Redditor.py
Subreddit.py
[...] confirm_options(), previously located in Subreddit.py, to Global.py.
[...] PrepRedditor.prep_redditor() algorithm to its own class method PrepMutts.prep_mutts().
[...] KeyError exception mentioned in the Issue Fix or Enhancement Request section.
[...] init() method from many modules - it only needs to be called once and is now located in Urs.py.
[...] requirements.txt.
README
[...] DirInit.py since the make_directory() and make_type_directory() methods have been deprecated.
Deprecated
[...] InitializeDirectory class in DirInit.py:
LogMissingDir.log()
create()
make_directory()
make_type_directory()
make_analytics_directory()
[...] create_dirs() method.
How Has This Been Tested?
[...] pytest locally - all tests have passed.
Test Configuration
[...] .travis.yml for full test configuration.
Dependencies
Checklist