mjishnu / pypdl

A concurrent pure Python downloader with resume capabilities
https://pypi.org/project/pypdl/
MIT License
51 stars, 9 forks

Error when downloading from servers that don't support HEAD requests #15

Closed by schoulten 6 months ago

schoulten commented 6 months ago

I'm getting errors when trying to download some CSV files, even though there is nothing wrong with the URL (tested in a browser).

Is this something I'm doing wrong? Any tips on how to debug this?

Code:

from pypdl import Downloader
dl = Downloader()
dl.start('https://www.gov.br/anp/pt-br/centrais-de-conteudo/dados-abertos/arquivos/shpc/dsas/ca/ca-2004-01.csv')

Result: ERROR:root:(ConnectionError) [Server Returned: Forbidden(403), Invalid URL]

mjishnu commented 6 months ago

This happens because the HEAD request failed. pypdl first sends a HEAD request to fetch metadata and to check that the file exists. Apparently the server behind the link you provided either doesn't implement HEAD requests or doesn't allow them, so the HEAD request pypdl sends fails and you get this error.

It's quite an easy fix: we just need to add code that sends a GET request if the HEAD request fails. Thanks for reporting this, it was a bug I didn't anticipate.

Edit: this should be fixed in v1.3.2. Also, the server you are trying to download from seems to have issues with multi-segment downloads.
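
For anyone curious, the HEAD-then-GET fallback described above can be sketched roughly like this. It is a minimal illustration using only the standard library, not pypdl's actual code; the function name `fetch_headers` and the choice to fall back on any HTTP error status are assumptions made for the example.

```python
import urllib.error
import urllib.request


def fetch_headers(url, timeout=10.0):
    """Return (method_used, headers), trying HEAD first and GET if HEAD is rejected.

    Hypothetical helper for illustration only; pypdl's real implementation may differ.
    """
    last_error = None
    for method in ("HEAD", "GET"):
        request = urllib.request.Request(url, method=method)
        try:
            with urllib.request.urlopen(request, timeout=timeout) as response:
                # Only the headers are needed here, so the body is never read.
                # Leaving the with-block closes the connection.
                return method, dict(response.headers.items())
        except urllib.error.HTTPError as exc:
            # The server answered with an error status (e.g. 403 or 405).
            # Remember it and try the next method, if there is one.
            last_error = exc
    # Both HEAD and GET were rejected, so surface the last error to the caller.
    raise last_error
```

With a server that rejects HEAD but serves GET, `fetch_headers(url)` would return `"GET"` as the method along with headers such as `Content-Length`, which is the metadata a downloader needs before it starts.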

schoulten commented 6 months ago

I can confirm that it's working smoothly now with v1.3.2. Thanks a lot!