JuliaData / DelimitedFiles.jl

A package for reading and writing files with delimited values (Originally a Julia stdlib)
MIT License

readdlm doesn't know about comma decimal mark #11

Open Ilya87 opened 10 years ago

Ilya87 commented 10 years ago

In some countries the comma is the decimal mark, but readdlm has no option to recognize it. So when I do

x=readdlm("/home/ilya/Works/Julia/12.csv", ';')

for a file that contains numbers like 18,8205, then mean(x) fails:

ERROR: + has no method matching +(::SubString{ASCIIString}, ::SubString{ASCIIString})

Trying

x=readdlm("/home/ilya/Works/Julia/12.csv", ';',Float32)

gives

ERROR: file entry "18,8205" cannot be converted to Float32

I have to open the 12.csv file in an editor and replace all commas with points; only this solves the problem.
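The manual replacement described above can also be done in memory before parsing. The following is only a sketch, assuming a current Julia 1.x with the DelimitedFiles package (not the Julia version used when this issue was opened); the sample string `raw` is made up and stands in for the contents of 12.csv.

```julia
using DelimitedFiles

# Made-up sample standing in for 12.csv: ';' separates fields,
# ',' is the decimal mark. For a real file one could use
# raw = read(path, String) instead.
raw = "18,8205;19,5\n20,25;21,0\n"

# Swap decimal commas for points. This is only safe because ','
# is not also the field delimiter in this data.
fixed = replace(raw, ',' => '.')

# Parse the corrected text directly from memory.
x = readdlm(IOBuffer(fixed), ';', Float64)
```

This avoids editing the file by hand, but it is still a workaround: it rewrites every comma in the input, so it would corrupt data that uses commas for anything other than the decimal mark.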

tanmaykm commented 10 years ago

readdlm uses float64_isvalid, which ultimately uses the C locale.

related: JuliaLang/julia#6593

elextr commented 10 years ago

If locale capability is added to things like readdlm, it must be explicit: just because a computer is running in a ',' or '.' locale does not mean the data is in that format.

And the default should not be to use the computer's locale. It is really bad karma if a program that works in one locale mysteriously fails in another when processing the same data.

nalimilan commented 10 years ago

@elextr That's not what @tanmaykm said. On the contrary, readdlm always uses the C locale, which prevents the behavior from changing depending on the current locale. What is needed is a way to specify a custom decimal mark, as you asked.

elextr commented 10 years ago

@nalimilan indeed, I said "if locale capability is added", rather than referring to current behaviour :)

As @Ilya87 pointed out, there is data with differing locale-specific formats, and maybe it's useful to add the capability to accept such data (rather than having to edit the file to replace the commas with dots).

I was just pointing out that it needs to be explicit, depending on the locale format of the data, not the computer locale, since it's a common mistake to conflate the two.
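To illustrate what "explicit" could mean here, a minimal sketch follows. The function `parse_decimal` and its `decimal` keyword are hypothetical names invented for this example; they are not part of readdlm or of any proposal in this thread. The caller states the decimal mark used by the data, and the machine's locale is never consulted.

```julia
# Hypothetical helper, not an existing API: the decimal mark is an
# explicit argument describing the data, independent of the locale.
function parse_decimal(s::AbstractString; decimal::Char='.')
    if decimal != '.'
        # Reject input that mixes in '.', so "18.8205" is not silently
        # accepted when the data is declared to use another mark.
        '.' in s && throw(ArgumentError("unexpected '.' in \"$s\""))
        s = replace(s, decimal => '.')
    end
    return parse(Float64, s)
end

a = parse_decimal("18,8205"; decimal=',')  # data declared as comma-decimal
b = parse_decimal("18.8205")               # default: point-decimal
```

With this shape, the same call gives the same result on every machine, and a mismatch between the declared format and the data raises an error instead of depending on where the program runs.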

tknopp commented 10 years ago

This needs a fix for JuliaLang/julia#6593, which I could tackle. The question is more what the interface of these locale-specific functions should look like. Let's keep that discussion in JuliaLang/julia#6593.