pjachim closed this 5 days ago
I think you're right. It seems GETTR changed the way they handle post IDs a few weeks or months ago, which means we can't iterate through posts sequentially anymore. Not sure what the best way around this is.
Don't see an obvious fix on our end.
When I try to use the `all` API with post IDs greater than `p7b5gh`, I start running into issues: large numbers of indices seem to be missing. For example, running the following command:
gogettr all --max 1000 --first p7b5gh
Returns a single post.
I tried a couple of much larger IDs (copied from another issue) and got a similar result. I did the same thing in module mode, still with no luck. I want to be respectful of their API and don't want to brute-force requests until I see more posts, but I am not sure how else to collect sets of posts for a given time period, or the next n posts after a specific `_id`.
Before that index, I don't seem to run into quite as many issues, though there are definitely gaps in the returned indices.
Do you have any recommendations for using `all` with larger indices, or should I switch to scraping posts for specific users rather than specific points in time? Am I missing something? Do the indices change to a different base or something? Or is this just a weird coincidence that I am reading too much into?
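To make the "different base" question concrete, here is a minimal sketch of how the gaps could be measured. It assumes the IDs are base-36 strings (digits 0-9, then a-z). That is a guess based on how they look, not something confirmed by GETTR or GoGettr. The helper names (`id_to_int`, `int_to_id`, `find_gaps`) are hypothetical and not part of GoGettr. If the leading `p` is actually a type prefix rather than a digit, the differences between IDs of the same length and prefix come out the same.

```python
import string

# Assumption: post IDs are base-36 numbers written with 0-9 followed by a-z.
ALPHABET = string.digits + string.ascii_lowercase


def id_to_int(post_id: str) -> int:
    """Interpret an ID such as 'p7b5gh' as a base-36 integer."""
    return int(post_id, 36)


def int_to_id(n: int) -> str:
    """Convert a non-negative integer back to a base-36 ID string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    digits = []
    while n:
        n, rem = divmod(n, 36)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def find_gaps(returned_ids):
    """Return (id_before, id_after, missing_count) for each hole in the IDs."""
    nums = sorted({id_to_int(i) for i in returned_ids})
    return [
        (int_to_id(a), int_to_id(b), b - a - 1)
        for a, b in zip(nums, nums[1:])
        if b - a > 1
    ]
```

Feeding the IDs returned by a run into `find_gaps` would show whether the holes before `p7b5gh` are small and scattered and whether they become much larger after it, which might help tell a change in ID scheme apart from ordinary deleted posts.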
Thank you for taking a look, this tool is super helpful!