Closed. xDido closed this issue 3 months ago.
Most listings on Reddit are limited to 1000 items. This is a hard limit, and there isn't a way around it using Reddit's API. You are likely getting fewer items than that because some posts were removed by Reddit's spam systems or AutoModerator, or were made by shadowbanned users. continue_after_id only tells the generators behind the stream functions where in the listing to start yielding items from.
Re unreleased features: We're planning on making a release soon. However, those features are unlikely to resolve your issue.
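For what it's worth, here is a rough sketch of one thing that can sometimes surface a few more posts. Each listing (new, hot, top, controversial) is capped on its own, so merging several of them by submission ID may turn up posts that a single listing misses. It does not lift the cap on any one listing, and there is no guarantee about how many extra posts appear. merge_listings and collect_from_subreddit are hypothetical helper names, not part of PRAW; the listing methods and time_filter values are the ones PRAW provides on a Subreddit object.

```python
from typing import Dict, Iterable


def merge_listings(listings: Iterable[Iterable]) -> Dict[str, object]:
    """Merge several listings into one dict keyed by submission ID.

    The first occurrence of each ID wins; later duplicates are ignored.
    """
    unique: Dict[str, object] = {}
    for listing in listings:
        for submission in listing:
            unique.setdefault(submission.id, submission)
    return unique


def collect_from_subreddit(subreddit) -> Dict[str, object]:
    """Hypothetical helper: pull several capped listings and merge them.

    `subreddit` is assumed to be a praw.models.Subreddit instance.
    """
    listings = [
        subreddit.new(limit=None),
        subreddit.hot(limit=None),
        subreddit.top(time_filter="all", limit=None),
        subreddit.top(time_filter="year", limit=None),
        subreddit.controversial(time_filter="all", limit=None),
    ]
    return merge_listings(listings)
```

Calling collect_from_subreddit(reddit.subreddit("Egypt")) would return a dict of unique submissions. Running it again later returns mostly the same IDs, which matches the behaviour described in the issue.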
Thanks for the speedy reply, truly appreciated!
I will check whether I can get posts starting from a "certain date" or "before a certain date".
If I can't, I will try to use Selenium.
I would also appreciate suggestions if you have any. Thanks LilSpazJoekp,
I will check whether I can get posts starting from a "certain date" or "before a certain date".
This isn't possible either. It used to be but was ultimately removed by Reddit.
If I can't, I will try to use Selenium.
Browsers have the same limitation.
I would also appreciate suggestions if you have any.
If you're a moderator or researcher, you can request access to Pushshift. Otherwise, there isn't much else you can do unless you capture the posts yourself as they are posted.
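As a minimal sketch of capturing posts as they are posted: the helper below appends each unseen submission to a JSON Lines file, and main() feeds it from subreddit.stream.submissions(), which yields new submissions as they arrive. archive_submissions, main, the file name, and the "bot1" praw.ini site name are all hypothetical choices for illustration. The stored fields (id, title, selftext, created_utc) are standard submission attributes, but adjust them to whatever your dataset needs.

```python
import json
from typing import Iterable, Set


def archive_submissions(submissions: Iterable, path: str, seen_ids: Set[str]) -> int:
    """Append unseen submissions to a JSON Lines file; return how many were written."""
    written = 0
    with open(path, "a", encoding="utf-8") as handle:
        for submission in submissions:
            if submission.id in seen_ids:
                continue  # already archived in this run
            record = {
                "id": submission.id,
                "title": submission.title,
                "selftext": submission.selftext,
                "created_utc": submission.created_utc,
            }
            # ensure_ascii=False keeps Arabic text readable in the file
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")
            handle.flush()
            seen_ids.add(submission.id)
            written += 1
    return written


def main() -> None:
    # Imported here so the helper above can be used without PRAW installed.
    import praw

    reddit = praw.Reddit("bot1")  # assumes a praw.ini site named "bot1" (hypothetical)
    subreddit = reddit.subreddit("Egypt")
    # The stream blocks indefinitely, yielding submissions as they are posted.
    archive_submissions(subreddit.stream.submissions(), "egypt_posts.jsonl", set())
```

Call main() from a long-running process to start archiving. Note that seen_ids only lives in memory here, so after a restart you would need to reload the IDs already in the file to avoid writing duplicates.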
Describe the Documentation Issue
Hello, PRAW community,
Thank you for the effort you have put into this project.
I'm trying to scrape as much as I can from r/Egypt to collect Arabic text for a custom dataset for a university project. When I scrape the subreddit's newest posts using
for submission in subreddit.new(limit=None):
it gives me the same 673 posts with their respective comments, and then the listing generator ends.
I make a new call after 1 minute to try to fetch more posts, but I end up with the same ones.
Is there a way to start scraping from a certain point in the subreddit instead of scraping the same posts over and over?
I have seen in the documentation for the unreleased version that the stream_generator() function accepts a parameter called continue_after_id. I am wondering whether this might be helpful in my case and, if so, how I can access this version, because the feature is not available in 7.7.1.
Thanks in advance,
Attributes
Location of the issue
Unreleased, Inquiry
What did you expect to see?
Helpful advice, and explanation regarding the unreleased changelog.
What did you actually see?
unreleased changelog.
Proposed Fix
Helpful advice, and explanation regarding the unreleased changelog.
Operating System/Web Browser
Windows, Chrome
Anything else?
Thanks