JuliaStats / RDatasets.jl

Julia package for loading many of the data sets available in R
GNU General Public License v3.0
160 stars 56 forks

Eliminate CSV dependency #129

Open PallHaraldsson opened 2 years ago

PallHaraldsson commented 2 years ago

I think all the CSV files in this package are relatively small.

CSV is a heavy dependency at startup. That cost pays off if you load large CSV files. I was thinking you could instead use DelimitedFiles, which loads really fast (it is a standard library shipped with Julia, not part of Base itself), and it seems to work even with compressed files:

julia> @time using DelimitedFiles
  0.004993 seconds (1.15 k allocations: 82.234 KiB, 78.42% compilation time)

https://docs.julialang.org/en/v1/stdlib/DelimitedFiles/
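To make the suggestion concrete, here is a minimal sketch of reading a small comma-separated table with `readdlm` from DelimitedFiles. The sample data and variable names are made up for illustration and are not taken from the package. The package's real files may also need handling that this sketch does not cover, such as missing values or decompression.

```julia
using DelimitedFiles  # standard library

# Hypothetical sample standing in for one of the small CSV files.
csv_text = """
name,mpg,cyl
Mazda RX4,21.0,6
Datsun 710,22.8,4
"""

# With header=true, readdlm returns a (data, header) tuple.
# Mixed text/number columns come back as a heterogeneous matrix.
data, header = readdlm(IOBuffer(csv_text), ','; header=true)

# Pull out a single column by its header name.
col_names = vec(header)
mpg = data[:, findfirst(==("mpg"), col_names)]
```

Note that `readdlm` returns a plain matrix rather than typed columns, so building a DataFrame from it would require a per-column conversion step that CSV currently takes care of.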

A possible downside: if all real-world code would load CSV anyway, then this saves nothing (though it would not be slower this way either). Another consideration: this package is often (not always) used with RCall, which is what I'm looking into now. In that case you could plausibly use R's CSV reading rather than Julia's CSV, or a Python tool where appropriate.