Arkoniak / UrlDownload.jl

Julia url downloader with progress meter
MIT License

Excel support? #19

Open PyDataBlog opened 4 years ago

PyDataBlog commented 4 years ago

Excel is one of the most popular data formats, so it'd be great to have it supported, for example:

using UrlDownload
using DataFrames 

url = "https://archive.ics.uci.edu/ml/machine-learning-databases/00352/Online%20Retail.xlsx"

df = urldownload(url) |> DataFrame

Arkoniak commented 4 years ago

The most popular package for Excel files is XLSX.jl. Unfortunately, it doesn't support reading from raw data, so it needs a file on disk. There is an open issue about this, https://github.com/felipenoris/XLSX.jl/issues/26, but it has not been resolved yet.

PyDataBlog commented 4 years ago

> Most popular package for excel files is XLSX.jl. Unfortunately, it doesn't support raw data. There is an issue felipenoris/XLSX.jl#26, but it is not resolved currently.

For now, we can only wait for that issue to be fixed.

Arkoniak commented 4 years ago

Current workaround (with UrlDownload v0.3.0): save the raw download to a file, then read that file with XLSX.jl.

using UrlDownload
using DataFrames 
using XLSX

url = "https://archive.ics.uci.edu/ml/machine-learning-databases/00352/Online%20Retail.xlsx"
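outfile = "/tmp/online_retail.xlsx"

# parser = identity skips parsing; save_raw writes the downloaded bytes to outfile
urldownload(url, parser = identity, save_raw = outfile)
# "mysheet" is a placeholder; replace it with the name of a sheet in the workbook
df = DataFrame(XLSX.readtable(outfile, "mysheet")...)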
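The workaround above hard-codes a `/tmp` path and a sheet name. A minimal sketch of the same idea wrapped in a function might look like the following. Note that `read_remote_xlsx` and its `sheet` keyword are hypothetical names, not part of UrlDownload. The sketch assumes that `XLSX.readxlsx` and `XLSX.sheetnames` are available in the installed XLSX.jl, and that `XLSX.readtable` returns either a tuple (older releases, as in the snippet above) or a Tables.jl-compatible object (newer releases).

```julia
using UrlDownload
using DataFrames
using XLSX

# Hypothetical helper (not part of UrlDownload): download an .xlsx file into a
# temporary directory and read one sheet into a DataFrame.
function read_remote_xlsx(url::AbstractString;
                          sheet::Union{Nothing,AbstractString} = nothing)
    mktempdir() do dir
        outfile = joinpath(dir, "download.xlsx")

        # Same call as in the workaround: skip parsing, save the raw bytes to disk.
        urldownload(url, parser = identity, save_raw = outfile)

        # If no sheet name is given, fall back to the first sheet in the workbook.
        name = sheet === nothing ?
            first(XLSX.sheetnames(XLSX.readxlsx(outfile))) : sheet

        # Assumption: older XLSX.jl releases return a tuple that must be splatted,
        # newer ones return a table that DataFrame accepts directly.
        tbl = XLSX.readtable(outfile, name)
        tbl isa Tuple ? DataFrame(tbl...) : DataFrame(tbl)
    end
end
```

With this in place, `df = read_remote_xlsx(url)` would read the first sheet, and `read_remote_xlsx(url, sheet = "mysheet")` a named one. The temporary directory is removed once the DataFrame has been built, so nothing is left behind on disk.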