postaddictme / instagram-php-scraper

Get account information, photos, videos, stories and comments.
https://packagist.org/packages/raiym/instagram-php-scraper
MIT License

Incorrect number of results returned for getMediasByTag(); #59

Closed boffey closed 7 years ago

boffey commented 7 years ago

I've been using your library for a while now and would first of all like to thank you for the time you've put into this.

I ran into a bit of an issue today when searching for a tag that has 18 items while passing 200 as the second parameter. I used to receive the 18 items; now I get 200 items back, with the same 18 items duplicated multiple times.

dannyweeks commented 7 years ago

This is caused by the raw response saying there is a next page, which makes the while loop run again. The end_cursor is also being used as the max ID, and I assume the end cursor and post IDs aren't the same thing.

By modifying the foreach loop to check whether a media has already been found, it is possible to return the correct amount of media without duplicates. However, this doesn't stop the while loop from repeating (and therefore making unnecessary requests), which causes a slowdown.

Changes to here: https://github.com/postaddictme/instagram-php-scraper/blob/master/src/InstagramScraper/Instagram.php#L207-L208

$media = Media::fromTagPage($mediaArray);
$index++;
if (in_array($media->id, array_keys($medias))) {
    continue;
}
$medias[$media->id] = $media;

If I can figure out a way to fix this properly rather than with the above hack, I will submit a PR.
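To illustrate the trade-off described in this comment, here is a minimal, self-contained sketch. It does not call the library: `fetchTagPage()` and `collectWithSkip()` are hypothetical stand-ins, and the stub simply assumes a tag with 18 medias whose response always claims a next page exists. Under those assumptions, skipping duplicates yields the right result but still issues every request.

```php
<?php
// Hypothetical stand-in for one tag-page request (NOT the library's API).
// It always returns the same 18 ids and always reports a next page,
// mimicking the behaviour described in this issue.
function fetchTagPage(?string $cursor): array
{
    $nodes = [];
    for ($i = 1; $i <= 18; $i++) {
        $nodes[] = ['id' => (string) $i];
    }
    return ['nodes' => $nodes, 'has_next_page' => true, 'end_cursor' => 'abc'];
}

// Collect up to $count medias, skipping ids that were already seen.
function collectWithSkip(int $count): array
{
    $medias = [];
    $index = 0;
    $requests = 0;
    $cursor = null;
    $hasNextPage = true;
    while ($index < $count && $hasNextPage) {
        $page = fetchTagPage($cursor);
        $requests++;
        foreach ($page['nodes'] as $node) {
            if ($index === $count) {
                break;
            }
            $index++;
            if (isset($medias[$node['id']])) {
                continue; // duplicate: skipped, but the outer loop keeps going
            }
            $medias[$node['id']] = $node;
        }
        $cursor = $page['end_cursor'];
        $hasNextPage = $page['has_next_page'];
    }
    return ['medias' => $medias, 'requests' => $requests];
}

$result = collectWithSkip(200);
echo count($result['medias']), " unique medias in ", $result['requests'], " requests\n";
// prints "18 unique medias in 12 requests"
```

With this stub, the 18 unique medias are all found on the first request; the remaining 11 requests only return duplicates, which is the slowdown mentioned above.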

raiym commented 7 years ago

@boffey and @dannyweeks thank you for raising a real bug. I think there is a better solution (not perfect, but it will do for now): the raw response contains the number of media available for this tag.

Update: that is not always true. Weird thing: I opened the raw link (https://www.instagram.com/explore/tags/thatbitchyoulove/?__a=1) in Safari and got count:5, and in Chrome count:10, but it returns only 3 medias.

I think @dannyweeks's approach is right, with a little editing:

$media = Media::fromTagPage($mediaArray);
if (in_array($media->id, $mediaIds)) {
    return $medias;
}
$mediaIds[] = $media->id;
$medias[] = $media;

Hope you don't mind (I just had time to fix it).
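As a rough sketch of why returning on the first duplicate also cuts down the extra requests, the example below runs that idea against a stubbed response. Nothing here is the library's actual code: `fetchTagPageStub()` and `collectUntilDuplicate()` are hypothetical names, and the stub assumes a tag with 18 medias that always reports a next page.

```php
<?php
// Hypothetical stub (NOT the library's API): counts requests and always
// returns the same 18 ids while claiming that another page exists.
function fetchTagPageStub(int &$requests): array
{
    $requests++;
    $nodes = [];
    for ($i = 1; $i <= 18; $i++) {
        $nodes[] = ['id' => (string) $i];
    }
    return ['nodes' => $nodes, 'has_next_page' => true];
}

// Collect up to $count medias and stop as soon as an id repeats.
function collectUntilDuplicate(int $count, int &$requests): array
{
    $medias = [];
    $mediaIds = [];
    $index = 0;
    $hasNextPage = true;
    while ($index < $count && $hasNextPage) {
        $page = fetchTagPageStub($requests);
        foreach ($page['nodes'] as $node) {
            if ($index === $count) {
                return $medias;
            }
            if (in_array($node['id'], $mediaIds, true)) {
                // First repeat: assume the results have wrapped around.
                return $medias;
            }
            $mediaIds[] = $node['id'];
            $medias[] = $node;
            $index++;
        }
        $hasNextPage = $page['has_next_page'];
    }
    return $medias;
}

$requests = 0;
$medias = collectUntilDuplicate(200, $requests);
echo count($medias), " medias in ", $requests, " requests\n";
// prints "18 medias in 2 requests"
```

Under these assumptions, one extra request is still needed to detect the repeat, but the loop no longer runs until the requested count is reached. The approach relies on a repeated id meaning that no new medias remain, which may not hold for every response.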

dannyweeks commented 7 years ago

👍 great stuff thanks, it is odd how the count is inconsistent!