zazuko / cube-creator

A tool to create RDF cubes from CSV files
GNU Affero General Public License v3.0

Datatype error for value 2.47812E-05 #1283

Closed ortnever closed 2 years ago

ortnever commented 2 years ago

Is there a range of allowed values for the decimal type? I have a value of 2.47812E-05 in the project UBD000501 Treibhausgasemissionen nach Sektoren (CO2-Verordnung) and the transformation fails.

In my CSV the value appears in non-scientific notation, i.e. 0.000024782, but the transformation fails on this line. https://gitlab.ldbar.ch/pipelines/cube-creator/-/jobs/34003


ortnever commented 2 years ago

Regardless of the allowed value range, is scientific notation (for example, E-05) accepted by the transformation?

tpluscode commented 2 years ago

The value is the same, regardless of the notation. This should not be failing.

Rdataflow commented 2 years ago

@tpluscode might be related to datatype.

IIUC scientific notation is only available for xsd:double

cc @l00mi

tpluscode commented 2 years ago

You are right, scientific notation is reserved for floating point numbers (float and double). I spent some time looking for a possible bug, on the assumption that the pipeline was incorrectly transforming that number.

Instead, I found that row 622 does in fact contain "2.47812E-05" in the value column. It must have been changed when exporting source data to CSV. The pipeline is correct to signal a problem.
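
To illustrate why the pipeline rejects this value, here is a minimal Python sketch of a lexical check for xsd:decimal. It is not cube-creator code; the function name `is_xsd_decimal` is hypothetical, and the pattern follows the xsd:decimal lexical form from XML Schema (optional sign, digits, optional fraction, no exponent).

```python
import re

# xsd:decimal lexical form: optional sign, then digits with an optional
# fractional part, or a bare fractional part. No exponent is allowed.
XSD_DECIMAL = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")


def is_xsd_decimal(lexical: str) -> bool:
    """Return True if the string is a valid xsd:decimal lexical form."""
    return XSD_DECIMAL.fullmatch(lexical) is not None


print(is_xsd_decimal("0.0000247812"))  # plain notation
print(is_xsd_decimal("2.47812E-05"))   # scientific notation
```

The plain form passes and the scientific form does not, even though both denote the same number.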

l00mi commented 2 years ago

For context, @ortnever and @Rdataflow: we used to provide the datatypes xsd:float and xsd:double, but this was confusing for the data providers (as mentioned by @FabianCretton). Since 99% of cases are covered by xsd:decimal, we kept only this datatype. In this case I propose fixing the data source. If the need arises, we can include these datatypes again.
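
One possible way to fix the data source before upload is sketched below in Python. It assumes the values can be preprocessed outside cube-creator; the helper name `to_plain_decimal` is hypothetical. It uses the standard `decimal` module, so no precision is lost in the conversion.

```python
from decimal import Decimal


def to_plain_decimal(value: str) -> str:
    """Rewrite a number in scientific notation as a plain decimal string."""
    # The "f" format spec renders a Decimal in fixed-point notation
    # without an exponent, keeping all significant digits.
    return format(Decimal(value), "f")


print(to_plain_decimal("2.47812E-05"))  # 0.0000247812
```

Applying this to the value column of the exported CSV would give strings that are valid for xsd:decimal.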

ortnever commented 2 years ago

Thank you for your research and sorry for the confusion. I told you that in my CSV the value appeared in non-scientific notation, and that was the case. Unfortunately, the file was not actually replaced because I did not notice that one column had a different name.

It is clearly stated at upload that the column names must not be changed. However, if they are changed, the user does not receive any error message and everything appears as if the file had been replaced. This is dangerous in my opinion; the user should receive an error message. Wasn't this the case before?
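
To make the kind of check I have in mind concrete, here is a rough Python sketch. It is only an illustration, not how cube-creator works; the function name `compare_headers` and the sample data are made up. It compares the header row of the original CSV with that of the replacement and reports any differences.

```python
import csv
import io


def compare_headers(original_csv: str, replacement_csv: str) -> tuple[list[str], list[str]]:
    """Return (missing, unexpected) column names of the replacement file."""
    # Read only the first row (the header) of each CSV text.
    original = next(csv.reader(io.StringIO(original_csv)))
    replacement = next(csv.reader(io.StringIO(replacement_csv)))
    missing = [name for name in original if name not in replacement]
    unexpected = [name for name in replacement if name not in original]
    return missing, unexpected


missing, unexpected = compare_headers("year,sector,value\n", "year,sector,values\n")
if missing or unexpected:
    print(f"Replacement rejected: missing {missing}, unexpected {unexpected}")
```

With such a check, a renamed column would produce an error message instead of a silent failure.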

ortnever commented 2 years ago

@tpluscode I noticed that the issue is closed. Did you see my comment?

tpluscode commented 2 years ago

Yes, I have. Let me create a new issue to investigate the behaviour you describe.