ScriptSmith / instamancer

Scrape Instagram's API with Puppeteer
http://adamsm.com/instamancer
MIT License

Fixes getting initial photos of a user who has fewer than 10 photos #33

Closed goooseman closed 5 years ago

goooseman commented 5 years ago

I've found a bug.

Steps to reproduce

  1. Find an Instagram account with fewer than 10 photos
  2. Create a user API with this account
  3. Start to iterate the generator

Expected

  1. The generator returns all photos and returns { done: true } at the last one, so the iterator stops.
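
The expected behaviour follows the standard iterator protocol, which can be sketched with a plain generator. This is only an illustration, not instamancer code: `preloadedPosts`, `drain`, and the sample ids are hypothetical, and the real scraper is asynchronous. In the standard protocol, `{ done: true }` arrives on the `next()` call made after the last value has been yielded.

```typescript
// Hypothetical stand-in for a scraper generator; not instamancer's API.
function* preloadedPosts(posts: string[]): Generator<string, void, undefined> {
    for (const post of posts) {
        yield post;
    }
    // Falling off the end makes the following next() call return { done: true }.
}

// Consume the generator by hand so the final `done` flag is visible.
function drain(posts: string[]): { items: string[]; done: boolean } {
    const it = preloadedPosts(posts);
    const items: string[] = [];
    let result = it.next();
    while (!result.done) {
        items.push(result.value);
        result = it.next();
    }
    return { items, done: result.done === true };
}
```

A consumer loop such as the one in `drain` only terminates if the generator eventually reports `done: true`; if it never does, the loop waits forever, which matches the frozen iterator described in this report.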

Actual

  1. The generator returns { done: false } at the last photo, so the iterator hangs and never stops.

Solution

I've reproduced the error in the tests and fixed it by setting the isFinished property on the Instagram instance once all photos have been processed.

ScriptSmith commented 5 years ago

That change causes all endpoints to only gather the preloaded posts because this.finished will be set to true when shortCodes is exhausted.

I think the way to fix this would be to change the if statement here: https://github.com/ScriptSmith/instamancer/blob/0073e813783c41494875d926daa517f2e440d025/src/api/instagram.ts#L612-L624

The goal is for the this.index === 0 condition to be met only when not scraping the preloaded posts. Perhaps add another private field that is set while scrapeDefaultPosts is running and unset when the first response comes through from the API.
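
A rough sketch of that idea, under stated assumptions: the class `ScrapeState`, the field `scrapingDefaults`, and all method names below are hypothetical and do not come from instagram.ts. It only shows how an extra private flag could gate an `index === 0` check, not what the linked if statement actually does.

```typescript
// Hypothetical sketch; names are illustrative, not the library's.
class ScrapeState {
    private index = 0;
    // Set while preloaded posts are scraped, cleared on the first API response.
    private scrapingDefaults = false;

    public beginDefaultPosts(): void {
        this.scrapingDefaults = true;
    }

    public onFirstApiResponse(): void {
        this.scrapingDefaults = false;
    }

    public advance(): void {
        this.index++;
    }

    // Guarded form of the `this.index === 0` check: it cannot be met
    // while the preloaded posts are still being scraped.
    public indexZeroConditionMet(): boolean {
        return !this.scrapingDefaults && this.index === 0;
    }
}
```

With a flag like this, the condition stays false for the whole preloaded phase and behaves as before once API responses start arriving.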

ScriptSmith commented 5 years ago

This causes all endpoints to close when they've jumped failedJumps times, because it's setting finished to true.

goooseman commented 5 years ago

Thanks for the help!

I ran the tests before pushing my commit, but the test limits are 10 times lower when run locally than on the CI server, so I did not see them fail.

ScriptSmith commented 5 years ago

Ah, yep. I did that to speed things up locally but obviously it can cause issues when testing large scraping jobs. You can set TRAVIS=1 in the future if needed.
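
For illustration, an environment-dependent limit of this kind might look roughly like the sketch below. The helper `testLimit` and its parameters are assumptions made for this example; the repository's actual test setup may be written differently. Only the `TRAVIS` variable and the factor of 10 come from the discussion above.

```typescript
// Hypothetical helper; the real test configuration may differ.
function testLimit(
    ciLimit: number,
    env: Record<string, string | undefined>,
): number {
    // Treat any non-empty TRAVIS value as "running on CI".
    const onCi = env.TRAVIS !== undefined && env.TRAVIS !== "";
    // Locally, use a limit 10 times lower to keep runs fast.
    return onCi ? ciLimit : Math.ceil(ciLimit / 10);
}
```

In Node, `process.env` could be passed as the second argument, so running the tests with `TRAVIS=1` set in the shell would select the full limit locally.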

goooseman commented 5 years ago

So the tests are passing and the problem is fixed.

Any chance it will be merged and released today?

ScriptSmith commented 5 years ago

Sure

goooseman commented 5 years ago

I see a version bump commit in this repo, but there is no new version on npm.