mithrandie / csvq

SQL-like query language for csv
https://mithrandie.github.io/csvq
MIT License

Feature request : parameter for decimal point #91

Open kpym opened 1 year ago

kpym commented 1 year ago

csvq provides -d and -D to set the field delimiter. But the delimiter is very often set to ; precisely because the decimal point is , (so , can't be used as the field separator). It would therefore be natural to provide a parameter for the decimal separator. This would make it possible, for example, to use CSV files exported from Excel in the many countries where the decimal point is , (the green ones on this map).

(this is related to #90)

githubuser181226 commented 1 year ago

I concur. This would be great, but I suspect you can't simply make csvq accept floating-point values with commas, because, just as in regular SQL, the comma separates command/operator parameters: 3,14 would be parsed as 3 and 14, so we have to write 3.14.

However, if switching the decimal point from . to , and the field separator from , to ; also made the command parser use ; instead of , then that would be optimal.

At the very least, though, IMO, csvq should allow output to be formatted with a comma as the decimal point by default. Cheers.

mithrandie commented 1 year ago

Certainly it would be better to support this feature, but for the reasons mentioned by @githubuser181226, I cannot immediately determine whether it can be implemented.

For output, the function NUMBER_FORMAT can be used to convert floating-point values into strings that use the specified character as the decimal point.

csvq > SELECT 123456.7890123
+----------------+
| 123456.7890123 |
+----------------+
| 123456.7890123 |
+----------------+

csvq > SELECT NUMBER_FORMAT(123456.7890123, 5, ',', '.', ' ')
+-------------------------------------------------+
| NUMBER_FORMAT(123456.7890123, 5, ',', '.', ' ') |
+-------------------------------------------------+
| 123.456,789 01                                  |
+-------------------------------------------------+
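
For readers who post-process csvq output elsewhere, the same idea can be sketched in Python. This is a rough analogue, not csvq's implementation: `format_number` is a hypothetical helper, and unlike the NUMBER_FORMAT call above it does not group the digits of the decimal part.

```python
def format_number(value: float, decimals: int,
                  decimal_point: str = ",", thousands_sep: str = ".") -> str:
    """Format a number with caller-chosen decimal and thousands separators."""
    # Python's format spec always emits "," for thousands and "." for decimals.
    text = f"{value:,.{decimals}f}"
    # Translate both characters in one pass so the two swaps cannot collide.
    return text.translate(str.maketrans({",": thousands_sep, ".": decimal_point}))


if __name__ == "__main__":
    print(format_number(123456.7890123, 5))  # decimal comma, dot for thousands
```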

kpym commented 1 year ago

I have changed my mind: we don't need a new parameter. The solution could simply be to let -d and -D accept strings in the following format:

'<field separator><decimal separator><thousands separator>'

and to set the missing separators according to the ones present, as follows:

The three separators should be different.
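
A minimal sketch of how such a string could be parsed, written in Python for illustration only. The names `Separators` and `parse_separator_spec` are hypothetical, and because the rule for filling in missing separators is not spelled out here, the sketch simply leaves them as `None`.

```python
from typing import NamedTuple, Optional


class Separators(NamedTuple):
    field: str
    decimal: Optional[str]
    thousands: Optional[str]


def parse_separator_spec(spec: str) -> Separators:
    """Split '<field><decimal><thousands>' into its parts (1 to 3 characters)."""
    if not 1 <= len(spec) <= 3:
        raise ValueError("expected between one and three separator characters")
    if len(set(spec)) != len(spec):
        raise ValueError("the separators must all be different")
    # Missing positions stay None; choosing defaults is left to the caller.
    padded = list(spec) + [None] * (3 - len(spec))
    return Separators(*padded)


if __name__ == "__main__":
    print(parse_separator_spec(";,."))
    print(parse_separator_spec(";"))
```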

It would be nice if -d and -D could also accept a country, like 'fr'. For this (at least for -D), the library golang.org/x/text/message could be used to produce the result.

And, I know I'm exaggerating, but maybe -d could accept auto to choose between , and ; (the two most common delimiters). github.com/csimplestring/go-csv could be used as a starting point; I haven't checked whether this library is correct, sorry. In Python there is csv.Sniffer, which can be used to guess the separators. And if auto is implemented so that it chooses , whenever possible, it could become the default (in place of ',').
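
To make the csv.Sniffer idea concrete, here is a small Python sketch of what an auto mode might do. The `guess_delimiter` helper and the fallback to `,` are assumptions for illustration; only `csv.Sniffer().sniff` and its `delimiters` argument come from the standard library.

```python
import csv


def guess_delimiter(sample: str, candidates: str = ",;", default: str = ",") -> str:
    """Guess the field delimiter of a CSV sample, restricted to the candidates."""
    try:
        dialect = csv.Sniffer().sniff(sample, delimiters=candidates)
    except csv.Error:
        # The sniffer could not decide: fall back to the usual default.
        return default
    return dialect.delimiter


if __name__ == "__main__":
    semicolon_sample = "name;price\napple;3,14\npear;2,50\nplum;1,25\n"
    comma_sample = "name,price\napple,3.14\npear,2.50\nplum,1.25\n"
    print(guess_delimiter(semicolon_sample))
    print(guess_delimiter(comma_sample))
```

The sniffer looks for a character that occurs the same number of times on every line, which is why the decimal commas in the first sample (absent from the header line) do not win over the semicolon.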

kpym commented 1 year ago

If -d auto is accepted, it can resolve #85 automatically.