mikf / gallery-dl

Command-line program to download image galleries and collections from several image hosting sites
GNU General Public License v2.0

[Instagram] Avoid scrape warning #4907

Open ashes-xda opened 9 months ago

ashes-xda commented 9 months ago
[instagram][error] HttpError: '400 Bad Request' for 'https://www.instagram.com/api/v1/feed/user/54655984401/'
[instagram][info] Use '-o cursor=2975840913239852315_54655984401' to continue downloading from the current position
[instagram][error] HttpError: '400 Bad Request' for 'https://www.instagram.com/api/v1/feed/reels_media/'
[instagram][error] HttpError: '400 Bad Request' for 'https://www.instagram.com/api/v1/highlights/5469402209/highlights_tray/'
[instagram][error] HttpError: '400 Bad Request' for 'https://www.instagram.com/api/v1/clips/user/'
[6/6] https://www.instagram.com/xyz/
[instagram][error] HttpError: '400 Bad Request' for 'https://www.instagram.com/api/v1/users/web_profile_info/'
[instagram][error] HttpError: '400 Bad Request' for 'https://www.instagram.com/api/v1/users/web_profile_info/'
[instagram][error] HttpError: '400 Bad Request' for 'https://www.instagram.com/api/v1/users/web_profile_info/'
[instagram][error] HttpError: '400 Bad Request' for 'https://www.instagram.com/api/v1/users/web_profile_info/'

I've been getting the scrape warning too often on IG these days, saying that they might permanently terminate my account (a secondary account just for scraping purposes). Is there any way to bypass this warning, or anything else I can do to avoid it?

chocoagua commented 9 months ago

You could try setting the sleep and sleep-request options to something like 12 seconds and 2 seconds respectively.
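
A minimal sketch of how that suggestion might look in a gallery-dl configuration file, assuming the options are placed under the `extractor` > `instagram` section so they only affect Instagram. The values are simply the ones mentioned above, not tested recommendations:

```json
{
    "extractor": {
        "instagram": {
            "sleep": 12.0,
            "sleep-request": 2.0
        }
    }
}
```

Here `sleep` is the pause before each file download and `sleep-request` is the pause between HTTP requests during data extraction. Both also accept a `[min, max]` pair for a random delay.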

ashes-xda commented 9 months ago

> You could try setting the sleep and sleep-request options to something like 12 seconds and 2 seconds respectively.

What happens if I choose not to use cookies? Will the download include all posts and reels but exclude stories or highlights? Or are there any limitations for downloading without cookies, such as potential lower resolution or incomplete downloads, even if the profile is public? @chocoagua @mikf

danrynr commented 9 months ago

> > You could try setting the sleep and sleep-request options to something like 12 seconds and 2 seconds respectively.
>
> What happens if I choose not to use cookies? Will the download include all posts and reels but exclude stories or highlights? Or are there any limitations for downloading without cookies, such as potential lower resolution or incomplete downloads, even if the profile is public? @chocoagua @mikf

Instagram gives you lower-resolution media in a session that is not logged in. This is the case whether you download media or just view it through the app/site.

mikf commented 9 months ago

> What happens if I choose not to use cookies?

IG will redirect all requests to its login page after a (very) short while.

[urllib3.connectionpool][debug] https://www.instagram.com:443 "GET /accounts/login/?next=/api/v1/media/1875629777499953996/info/ HTTP/1.1" 200 None
[instagram][error] HTTP redirect to login page (https://www.instagram.com/accounts/login/)

(As for your initial question: I don't know. Maybe consider using instaloader instead of gallery-dl)

ashes-xda commented 9 months ago

> (As for your initial question: I don't know. Maybe consider using instaloader instead of gallery-dl)

Oh, I tried instaloader initially, but it was way too slow and gave a forbidden error every few minutes, whereas gallery-dl is super fast; no comparison there. The only issue is that I've been getting a few errors these days, which I never faced before, and I've been scraping for almost 6 months. I'll try the sleep and sleep-request options and see how it goes. Thank you for such an awesome program.

dademiller360 commented 7 months ago

My accounts are getting banned frequently, even with sleep and sleep-request set and while using browser cookies and a VPN. What is your experience? Thanks.

Hrxn commented 7 months ago

A VPN does not help here; it likely makes things worse.

dademiller360 commented 7 months ago

True, but if Meta bans my (static) ISP IP, it's going to be way worse :-(( I tried to log in to my account again and it seems it wasn't banned, but gallery-dl with the Netscape cookies.txt was redirecting me to the login page (as an error) after a few downloads. I refreshed the cookies manually and it worked again. I looked through the docs but could not find any info about refreshing cookies.txt. Should I do something like this?

nothing2obvi commented 2 months ago

@ashes-xda @dademiller360

Did either of you find a solution? I'm using the following settings, but they're not helping:

"sleep-request": [30.0, 60.0],
"sleep-429": [60.0, 90.0],
"sleep": [30.0, 60.0],
"sleep-extractor": [30.0, 60.0],