BdR76 / CSVLint

CSV Lint plug-in for Notepad++ for syntax highlighting, csv validation, automatic column and datatype detecting, fixed width datasets, change datetime format, decimal separator, sort data, count unique values, convert to xml, json, sql etc. A plugin for data cleaning and working with messy data files.
GNU General Public License v3.0

Validate Data does not recognize scientific number format #45

Open Friedi opened 1 year ago

Friedi commented 1 year ago

Hi,

it seems the classification as FLOAT does not recognize scientific number formats like 1.24E-3 (just for info: 1.24E-3 = 1.24*10^(-3) = 0.00124). Validation reports errors such as:

** error line 2746: Column 3 value "5.55462962962963E-7" not a valid decimal value

It would be nice if it was supported or at least excluded as an error during validation.

Thanks

BdR76 commented 1 year ago

Thanks for posting this issue. You're right, the plugin doesn't recognise floats in scientific notation. A similar issue still stands for currency values, like €1,234.50 or JPY 1.789.935.

The plug-in will treat both as a text/string value, so currently those kinds of values cannot be validated. I'll look into this when I have the time. I can't promise anything, but I'll see what I can do.

Btw does your data contain floats in both scientific notation and non-scientific notation in the same column? So values like 0.12, 1.3 and 3.45E-8, 5.73E-10 etc. in the same column?

Friedi commented 1 year ago

> Btw does your data contain floats in both scientific notation and non-scientific notation in the same column? So values like 0.12, 1.3 and 3.45E-8, 5.73E-10 etc. in the same column?

Thanks for your explanation. Yes, the worst-case scenario :) Both types are present in the same column (and in the same row). At least in my case the exponent is always a capital "E". There are some other data exports which use a lowercase "e". I have not come across mixed scientific representations.
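
For illustration only, here is a minimal Python sketch of a check that would accept plain decimals and scientific notation in the same column, with either "E" or "e" as the exponent marker. This is not how CSV Lint validates values; the pattern and the function name `is_valid_decimal` are assumptions made up for this example, and it assumes a "." decimal separator with no thousands separators.

```python
import re

# Hypothetical pattern, NOT CSV Lint's actual validation logic:
# optional sign, digits with an optional fraction (or a bare fraction),
# then an optional exponent written with "E" or "e".
DECIMAL_OR_SCIENTIFIC = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?")


def is_valid_decimal(value: str) -> bool:
    """Return True if the whole value is a plain or scientific decimal."""
    return DECIMAL_OR_SCIENTIFIC.fullmatch(value.strip()) is not None


if __name__ == "__main__":
    # Mixed column like the one described above, plus two invalid values.
    samples = ["0.12", "1.3", "3.45E-8", "5.73e-10",
               "5.55462962962963E-7", "1.24E", "abc"]
    for sample in samples:
        print(sample, is_valid_decimal(sample))
```

Making the exponent an optional suffix of the ordinary decimal pattern means both notations pass the same check, so a mixed column would not need a separate datatype.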