allenai / objaverse-xl

🪐 Objaverse-XL is a Universe of 10M+ 3D Objects. Contains API Scripts for Downloading and Processing!
https://objaverse.allenai.org/
Apache License 2.0
Download error: shell only #30

Open anikimmel opened 6 months ago

anikimmel commented 6 months ago

Hello!

I get a download error that occurs only when I try to download from a shell (Windows, Ubuntu, etc.).

I have a fix, which is to add headers to the request that mimic the request sent out by Colab (the error is not reproducible in Colab, apparently because Colab modifies the headers itself).

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36 Edg/121.0.0.0'}
response = requests.get(url, headers=headers, stream=True)

However, I'd rather not have to edit thingiverse.py myself; it'd be nice if it worked from the pip install alone, without modifications.

Thanks!

mbanani commented 6 months ago

Hi @anikimmel, are you able to download more than a few objects before getting the errors again? It seems like there's a per-user limit on downloads.

anikimmel commented 6 months ago

I haven't tried recently, but I was able to download 100 objects after adding the headers. Without the headers, nothing would download.

SuX97 commented 5 months ago

@anikimmel Hi! I tried your header but still get a 403 from Thingiverse. Any updates? Thanks!

anikimmel commented 4 months ago

Hey, so what I do pretty much daily is open one of the URLs in my browser, then inspect it to see what headers it sends.

So I enter https://www.thingiverse.com/download:9006116 in the browser, then use Inspect Element to see the request headers (screenshot).

Then I copy all the parameters exactly and put it into my download script. It works for me, but it seems like I have to update it every day.
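
Since the copied values apparently need refreshing every day, pasting them as raw text and parsing them may be less error-prone than retyping a dict each time. A minimal sketch, assuming the headers are pasted one `Name: value` pair per line; `parse_header_block` is a hypothetical helper, not part of objaverse-xl, and the pasted values below are placeholders:

```python
def parse_header_block(raw):
    """Turn text pasted from the browser's network inspector into a dict.

    Assumes one "Name: value" pair per line. Blank lines and HTTP/2
    pseudo-headers (lines starting with ":", e.g. ":authority") are skipped.
    """
    headers = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line or line.startswith(":"):
            continue
        # Split on the first colon only, since values may contain colons.
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        headers[name.strip().lower()] = value.strip()
    return headers


# Placeholder text; paste what your own browser shows instead.
pasted = """
:authority: www.thingiverse.com
Sec-Ch-Ua-Mobile: ?0
User-Agent: Mozilla/5.0 (X11)
"""
headers = parse_header_block(pasted)
```

The resulting dict can be passed as the `headers=` argument in the `requests.get` call from the first comment.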

SuX97 commented 4 months ago

Hi @anikimmel, by 'all the parameters', do you mean only the 'User-Agent' field, or all the others too, including 'Cookie', 'Sec-Ch-Ua', 'Sec-Ch-Ua-Mobile', etc.? Thanks!

anikimmel commented 4 months ago

I put in everything from sec-ch-ua all the way down to user-agent. IDK, it seems to work. I think it tricks the site into thinking the request is coming from a web browser and not a shell script.
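
For anyone trying the same thing, here is a rough sketch of what "everything from sec-ch-ua down to user-agent" could look like in a script. It uses the standard library's `urllib.request` instead of `requests` so it runs without extra packages. The header names are the ones a Chromium-based browser typically lists in that range; every value except the user-agent string from the first comment is an illustrative placeholder and should be replaced with what your own browser sends. `build_request` and `download` are hypothetical helpers, not part of objaverse-xl, and none of this is guaranteed to get past the 403.

```python
import urllib.request

# Illustrative values only: copy the real ones from your browser's inspector.
BROWSER_HEADERS = {
    "sec-ch-ua": '"Chromium";v="121", "Not A(Brand";v="99"',
    "sec-ch-ua-mobile": "?0",
    "sec-ch-ua-platform": '"Windows"',
    "sec-fetch-dest": "document",
    "sec-fetch-mode": "navigate",
    "sec-fetch-site": "none",
    "sec-fetch-user": "?1",
    "upgrade-insecure-requests": "1",
    "user-agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36 Edg/121.0.0.0"
    ),
}


def build_request(url, headers=BROWSER_HEADERS):
    """Create a request object carrying the browser-like headers."""
    return urllib.request.Request(url, headers=dict(headers))


def download(url, dest_path, headers=BROWSER_HEADERS, chunk_size=8192):
    """Stream the response body to dest_path in chunks."""
    req = build_request(url, headers)
    with urllib.request.urlopen(req, timeout=30) as resp, open(dest_path, "wb") as out:
        while True:
            chunk = resp.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)


# Building the request does not touch the network; download() does.
req = build_request("https://www.thingiverse.com/download:9006116")
```

Keeping the headers in one dict means the daily refresh only touches that one place.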