Jonathan-Adly opened 2 weeks ago
I will keep this issue open as I am not fully decided between won't-fix vs. a proxy URL vs. committing to take whatever URL the user sends. Most medical publishers have Cloudflare enabled, which still blocks requests even through a proxy (even if the study is open and free to download).
Proxies aren't that useful because of small file-size limitations, and committing to take whatever URL the user sends means that instead of focusing on AI pipelines, we would be playing gray-hat games against Cloudflare.
For now - I added better error messages so people know why a fetch didn't work.
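As a rough illustration of what those friendlier errors could look like, here is a minimal sketch; the `fetch_document` helper and the exact wording are hypothetical, not what actually shipped:

```python
import aiohttp

# Hypothetical helper: fetch a URL and translate the most common
# failure (a WAF block) into a message the user can act on.
async def fetch_document(url: str) -> bytes:
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            if response.status == 403:
                # Cloudflare and similar bot protection typically answer 403.
                raise ValueError(
                    "The publisher blocked this request (HTTP 403), likely "
                    "via Cloudflare bot protection. Try uploading the file "
                    "directly instead of passing a URL."
                )
            response.raise_for_status()
            return await response.read()
```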
I spoke with the folks at Scraper API. They will support any ongoing issues for us, so we can commit to accepting whatever URLs the users send.
@Abdullah13521 Here is the plan.
`use_proxy`
Boolean - defaults to `False`.

```python
# example proxy URL:
# proxy = "http://scraperapi.ultra_premium=true:my-api-key@proxy-server.scraperapi.com:8001"
# proxy can be None, and aiohttp won't throw any errors; it just connects directly.
async with aiohttp.ClientSession() as session:
    async with session.get(url, proxy=proxy, ssl=False) as response:
        data = await response.read()
```
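Putting the pieces together, here is a minimal sketch of how the flag could be wired up end to end. The `fetch_with_optional_proxy` name and the `SCRAPER_API_KEY` environment variable are placeholders of mine, not decided API:

```python
import os
import aiohttp

# Sketch: route the request through Scraper API only when the caller
# opts in. Helper name and env var are assumptions, not settled API.
async def fetch_with_optional_proxy(url: str, use_proxy: bool = False) -> bytes:
    proxy = None
    if use_proxy:
        api_key = os.environ["SCRAPER_API_KEY"]
        proxy = (
            f"http://scraperapi.ultra_premium=true:{api_key}"
            "@proxy-server.scraperapi.com:8001"
        )
    async with aiohttp.ClientSession() as session:
        # proxy=None means a direct connection; ssl=False skips certificate
        # verification, which some proxy setups require.
        async with session.get(url, proxy=proxy, ssl=False) as response:
            response.raise_for_status()
            return await response.read()
```

Callers that never set `use_proxy` keep today's direct-connection behavior unchanged.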
On cloud - if `use_proxy` is enabled, we will add 10 credits to usage (so, number of pages + 10).
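In code, the billing math would be something like the following (the function name is just illustrative):

```python
def credits_used(num_pages: int, use_proxy: bool) -> int:
    # Flat 10-credit surcharge when the request went through the proxy.
    return num_pages + (10 if use_proxy else 0)
```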
Let me know if you have any questions.