Open fabiocs8 opened 3 years ago
There are several issues going on here.
curl is not able to download "https://www.portaltransparencia.gov.br/download-de-dados/despesas-execucao/202001", but switching from HTTPS to HTTP solves this one, and I would rather see this as an issue of curl.
When the issue of downloading is solved by switching to HTTP with fread("http://www.portaltransparencia.gov.br/download-de-dados/despesas-execucao/202001"), another one pops up:
What you expect of fread is to automatically detect the file type without a file extension, but that's not something fread does.
Your file is a .zip, which is not supported by fread yet; see also #3834.
Thank you Ben.
When I run fread with verbose = TRUE (output in SO post link above), I understand that fread does download the file with no problem. However, the problem happens when decompressing it: because the file is binary but Windows writes it in text mode by default, '\n' bytes are changed to '\r\n' (aka 'CRLF'), which corrupts the archive; see the excellent answer provided by r2evans in SO. Using download.file( .. , mode = "wb") is enough to solve this issue, and unzip works properly.
Surprisingly, the fread code at line 87 already passes the option mode = "wb" to curl: curl::curl_download(input, tmpFile, mode = "wb", quiet = !showProgress)
So it seems that this mode option has no effect here.
Regards, Fabio.
As per my post in SO, fread cannot import and unzip the following URL:
dt <- fread("https://www.portaltransparencia.gov.br/download-de-dados/despesas-execucao/202001")
The workaround was to download the URL with mode = "wb" and then unzip the result:
download.file("https://www.portaltransparencia.gov.br/download-de-dados/despesas-execucao/202001", destfile = "test_file.zip", mode = "wb")
unzip("test_file.zip", exdir = ".")
It would be nice if fread provided an option to deal with cases like this.
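Until something like that exists, the download, unzip, and read steps can be wrapped in a small helper. The following is only a minimal sketch: fread_zip_url is a hypothetical name, not part of data.table, and it assumes the archive's first extracted file is the one to read.

```r
library(data.table)

# Hypothetical helper (NOT part of data.table): download a zip archive in
# binary mode, extract it, and read the first extracted file with fread.
# Assumption: the first file in the archive is the table of interest.
fread_zip_url <- function(url, exdir = tempfile("unzipped_"), ...) {
  zip_path <- tempfile(fileext = ".zip")
  on.exit(unlink(zip_path), add = TRUE)

  # mode = "wb" writes the bytes unchanged, so Windows does not
  # translate line endings inside the binary archive.
  download.file(url, destfile = zip_path, mode = "wb", quiet = TRUE)

  # unzip() returns the paths of the extracted files.
  extracted <- unzip(zip_path, exdir = exdir)
  if (length(extracted) == 0L) stop("archive contained no files")

  # Extra arguments (sep, encoding, ...) are passed through to fread.
  fread(extracted[1L], ...)
}

# Usage with the URL from this issue (not run here):
# dt <- fread_zip_url("https://www.portaltransparencia.gov.br/download-de-dados/despesas-execucao/202001")
```

Any fread arguments the extracted file needs, such as sep or encoding, can be supplied through the ... argument.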