scrazzz / redgifs

Simple Python API wrapper for the RedGIFs API
https://redgifs.rtfd.io
MIT License
91 stars 12 forks

API.download() hanging #9

Closed CelebStalker closed 1 year ago

CelebStalker commented 1 year ago

When downloading multiple files with API.download(), it seems to pause without an exception, causing the program to halt.

Maybe it's related to http.get() having a very large (or no) default timeout.

The issue doesn't seem to happen with the same link when the program is retried; instead, some other link usually gets stuck. All the links used are generated from api.get_gif(rid).urls.hd

Is there a way to skip a file when its download takes too long?

scrazzz commented 1 year ago

I need to see an example of the code you are using to help you out. Are you using this code in a synchronous or an asynchronous context?

CelebStalker commented 1 year ago

I'm not familiar with asynchronous mode, so I'm probably using it synchronously.

import pandas as pd
from redgifs.api import API

api = API()
dir_name = r"C:\....."    # Path to download folder

rdf2 = pd.read_csv("Redgifs DataFrame.csv")

for url, file_name in zip(rdf2["API URL"], rdf2["File Name"]):
    api.download(url, f"{dir_name}\\{file_name}.mp4")

The API URLs are generated from a list of ids using api.get_gif(rid).urls.hd

The issue happens rarely, but when it does, the program seems to wait forever (maybe due to the large default timeout). I don't think it's a WiFi issue: Task Manager shows no WiFi activity while the download hangs, yet other webpages load fine in the background at the same time.

I mostly use this program on Android, but I've seen the same scenario on Windows too. Is there a way to skip a download when a pre-set time limit is reached?
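(Editor's note: one common workaround for this, sketched below with standard-library tools, is to run each download in a worker thread and stop waiting after a deadline. The names `download_with_timeout` and `_pool` are illustrative, not part of the redgifs API. Caveat: a hung worker keeps running in the background; this only lets the main loop move on to the next file.)

```python
import concurrent.futures

# One shared pool; a hung download occupies a worker but does not block the loop.
_pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)

def download_with_timeout(download_fn, url, path, timeout=30.0):
    """Return True if download_fn(url, path) finished within `timeout` seconds,
    False if we gave up waiting for it."""
    future = _pool.submit(download_fn, url, path)
    try:
        future.result(timeout=timeout)
        return True
    except concurrent.futures.TimeoutError:
        # Stop waiting; the worker thread itself keeps running until the
        # underlying HTTP request finishes or errors out on its own.
        return False
```

With this, the loop above would call `download_with_timeout(api.download, url, f"{dir_name}\\{file_name}.mp4")` and simply log the files that returned False.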

scrazzz commented 1 year ago

I don't get any issues with the API.download() method. It's not slow for me.

import time
import redgifs

api = redgifs.API()
gifs = api.search('hitomi tanaka', count=15).gifs

n = 1
for gif in gifs:
    start = time.perf_counter()  # wall-clock time; process_time() would exclude time spent waiting on the network
    api.download(gif.urls.hd, f'dls/vid_{n}.mp4')
    print(f'downloaded {n} in {time.perf_counter() - start:.2f}s')
    n += 1

Screenshot

All the videos get downloaded within 3 seconds.

scrazzz commented 1 year ago

> Is there a way to skip a file from downloading when time consumed by it is too high?

I'm not too sure about this. I'd have to handle it internally within the library, and I don't think that's feasible.

CelebStalker commented 1 year ago

Downloads are pretty fast; I actually depend on this library since the website started validation. The issue occurs when there are plenty of downloads, like 500+. I download from a list of URLs accumulated over a week, all at once, and the hang usually appears after 400+ downloads, though it can be random.

I had a similar problem with my own code once, which I solved with http.get(url, timeout=10). An exception is preferable to the program pausing indefinitely, I feel. If the issue isn't reproducible, it might have something to do with my device.
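(Editor's note: the explicit-timeout approach described above can be sketched with the standard library alone; `fetch_with_timeout` is an illustrative name, not part of redgifs.)

```python
import socket
import urllib.error
import urllib.request

def fetch_with_timeout(url, timeout=10):
    """Download `url` and return the body bytes, or None on timeout/error.

    The timeout bounds the socket operations (connect and reads), so a
    stalled server raises an exception instead of hanging forever.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read()
    except (socket.timeout, urllib.error.URLError):
        return None
```

The key point is the user's: with any timeout set at all, a stalled transfer becomes a catchable exception rather than an indefinite pause.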

scrazzz commented 1 year ago

Yes, you're right. The default timeout for requests is None, which means the program will wait forever for a response from the server. Maybe I should set it to 60 seconds.
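(Editor's note: one way such a library-side default could look, assuming a requests-based implementation; this is a sketch, not the actual redgifs code.)

```python
import requests

class TimeoutSession(requests.Session):
    """requests.Session that applies a default timeout when the caller gives none."""

    def __init__(self, timeout=60):
        super().__init__()
        self._timeout = timeout

    def request(self, method, url, **kwargs):
        # Only fill in the default; an explicit timeout= from the caller wins.
        kwargs.setdefault("timeout", self._timeout)
        return super().request(method, url, **kwargs)
```

Every call made through such a session would then raise `requests.exceptions.Timeout` after 60 seconds instead of hanging.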

For aiohttp the default timeout is 5 minutes. So there won't be any issues when using asynchronous code.

I will fix this issue, thanks for reporting.