mattpodolak / pmaw

A multithread Pushshift.io API Wrapper for reddit.com comment and submission searches.
MIT License
212 stars, 28 forks

Unable to fetch comments by ID #41

Open apoorva1225 opened 2 years ago

apoorva1225 commented 2 years ago

comment_ids is a list of strings.

comments_arr = pushshift_api.search_comments(ids=comment_ids)

Output / error:

INFO:pmaw.PushshiftAPIBase:Total:: Success Rate: 87.50% - Requests: 16 - Batches: 4 - Items Remaining: 6720
/usr/local/lib/python3.7/dist-packages/pmaw/Request.py:230: UserWarning: 6720 items were not found in Pushshift
  f'{self.limit} items were not found in Pushshift')

Question: Is it true that Pushshift has a delay in fetching data? Can we fetch data for the current date?

mattpodolak commented 2 years ago

Hi @apoorva1225, were you running this query yesterday? There was a Pushshift outage that may have impacted your results.

Also, Pushshift metadata is delayed: it takes around 24 hours for comments and scores to reach a steady state, so Pushshift doesn't update the metadata until after that window, and rarely updates it again after that.

Check out the PRAW enrichment feature, which replaces metadata from Pushshift with the latest data from Reddit.
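For anyone landing here, a minimal sketch of how PRAW enrichment might be set up, assuming the `praw` keyword argument on `PushshiftAPI` as described in the pmaw README. The helper name `build_enriched_api` and its parameters are hypothetical, and the credentials must come from your own Reddit app.

```python
def build_enriched_api(client_id: str, client_secret: str, user_agent: str):
    """Return a pmaw client that enriches Pushshift results via PRAW.

    Hypothetical helper for illustration; it is not part of pmaw.
    """
    # Imported lazily so this module can be loaded even when
    # praw and pmaw are not installed.
    import praw
    from pmaw import PushshiftAPI

    # Credentials for a Reddit "script" app (assumed to be supplied by the caller).
    reddit = praw.Reddit(
        client_id=client_id,
        client_secret=client_secret,
        user_agent=user_agent,
    )
    # Passing a praw.Reddit instance asks pmaw to replace Pushshift
    # metadata with current data from Reddit.
    return PushshiftAPI(praw=reddit)
```

Usage would then look like `api = build_enriched_api(...)` followed by `api.search_comments(ids=comment_ids)`. Note that enrichment only refreshes metadata; whether it helps with IDs that Pushshift reports as not found is not confirmed in this thread.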

apoorva1225 commented 2 years ago

Hi @mattpodolak, this query still does not work. It continues to give me the same error:

/usr/local/lib/python3.7/dist-packages/pmaw/Request.py:230: UserWarning: 45 items were not found in Pushshift
  f'{self.limit} items were not found in Pushshift')

Could you please suggest a fix for this?
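While waiting on a fix, one way to narrow the problem down is to check which of the requested IDs actually came back. The sketch below is not a fix for the warning. It assumes each returned comment is a dict with an `id` field and that the requested IDs are bare base36 strings in the same form; `find_missing_ids` is a hypothetical helper, not part of pmaw.

```python
from typing import Dict, Iterable, List


def find_missing_ids(requested_ids: Iterable[str], results: Iterable[Dict]) -> List[str]:
    """Return the requested IDs absent from the results, in request order."""
    returned = {item["id"] for item in results}
    return [cid for cid in requested_ids if cid not in returned]


# Hypothetical usage with pmaw (needs network access and a working Pushshift):
#   from pmaw import PushshiftAPI
#   api = PushshiftAPI()
#   comments = list(api.search_comments(ids=comment_ids))
#   print(find_missing_ids(comment_ids, comments))

if __name__ == "__main__":
    # Offline demonstration with made-up IDs and results.
    sample_ids = ["abc123", "def456", "ghi789"]
    sample_results = [{"id": "abc123"}, {"id": "ghi789"}]
    print(find_missing_ids(sample_ids, sample_results))
```

If the missing IDs are all very recent comments, that would be consistent with the ingestion delay mentioned above. If they are spread across older comments, the cause is more likely on the Pushshift side.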

dunovank commented 2 years ago

I'm having the same issue. Is there a status update on this?