okfn / opendataeditor

The Open Data Editor (ODE) is a no-code application to explore, validate and publish data in a simple way. Forever free and open source project powered by the Frictionless Framework.
http://opendataeditor.okfn.org
MIT License

Problems with date, datetime formats #594

Open cristinaclaverol opened 3 weeks ago

cristinaclaverol commented 3 weeks ago

ODE reads dates like 4/1/2022 as strings even though those cells are formatted as datetime in Excel. When you change the type to date in the metadata, it reports an error because it does not recognize the data as a date.

romicolman commented 3 weeks ago

Hi @cristinaclaverol can you please share an extract of the file you ingested so we can run the tests?

romicolman commented 2 weeks ago

Here is the file that @cristinaclaverol uploaded to the app: dades_test_02 - Hojas de cálculo de Google.csv

Steps that I followed:

Image

Image

I also checked the file in Numbers (Mac software for tabular data). There the column is also formatted as a date.

Any idea why ODE is treating that column as STRING?

pdelboca commented 1 week ago

@romicolman I think this is a question related to frictionless-py since we are using it to read the data and the schema. I would venture to answer that ODE is loading the column as a string because the date is not in ISO8601 format (YYYY-MM-DD).

A common data friction is that dates are written in different formats like YYYY/MM/DD or DD/MM/YY or MM/DD/YY or even YYYY-MM-DD (notice that I changed / for -). According to the Table Schema of the Data Package (https://specs.frictionlessdata.io/table-schema/#date), dates need to be in YYYY-MM-DD format by default. Since this data is DD/MM/YY, it is considered a string.
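To make the behaviour described above concrete, here is a minimal sketch of a strict ISO 8601 check. It is not frictionless-py's actual detection code; the function names `looks_like_iso_date` and `guess_column_type` are hypothetical and only illustrate why a value like 4/1/2022 ends up as a string.

```python
from datetime import datetime


def looks_like_iso_date(value: str) -> bool:
    """Return True only for a zero-padded YYYY-MM-DD calendar date."""
    try:
        parsed = datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False
    # strptime accepts unpadded parts such as "2022-1-4", so compare the round trip.
    return parsed.strftime("%Y-%m-%d") == value


def guess_column_type(values: list[str]) -> str:
    """Simplified stand-in for type detection: 'date' only if every value is ISO."""
    if values and all(looks_like_iso_date(v) for v in values):
        return "date"
    return "string"


print(guess_column_type(["2022-01-04", "2022-01-05"]))
print(guess_column_type(["4/1/2022", "5/1/2022"]))
```

The first call reports a date column and the second a string column, which matches what the issue describes.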

This is neither an error nor a problem with the tool, but rather a data friction that we need to solve/explain.

The Data Package specification defines what a date should look like, so in order to define a column as a date in Open Data Editor (which is based on Data Package), the user should change the values to follow the format of the specification (YYYY-MM-DD).
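One way to change the values before loading the file is a small conversion script. The sketch below uses only the Python standard library and assumes the source values are day-first (DD/MM/YYYY), as suggested in this thread. The column name `date` and the sample rows are made up for illustration. A value such as 4/1/2022 is ambiguous, so the pattern should be confirmed against the real data.

```python
import csv
import io
from datetime import datetime


def to_iso(value: str, pattern: str = "%d/%m/%Y") -> str:
    """Parse a date using the given strptime pattern and return YYYY-MM-DD."""
    return datetime.strptime(value, pattern).date().isoformat()


def convert_column(csv_text: str, column: str, pattern: str = "%d/%m/%Y") -> str:
    """Return a copy of the CSV text with one column rewritten as ISO 8601 dates."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[column] = to_iso(row[column], pattern)
        writer.writerow(row)
    return out.getvalue()


# Hypothetical sample data; the real file from the issue is not reproduced here.
sample = "id,date\n1,4/1/2022\n2,15/1/2022\n"
print(convert_column(sample, "date"))
```

The Table Schema specification also describes a `format` property for date fields that accepts a strptime-style pattern such as `%d/%m/%Y`. Whether ODE exposes that option is not stated in this thread.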

cristinaclaverol commented 1 week ago

OK, thanks for the explanation! The formatting of numerical data is a broad and complex topic, and it is a common source of problems when working with tables. It may be outside the tool's scope, but flagging inconsistencies in numerical formats, or showing a warning when the app encounters numerical data it does not recognize, would be valuable information. I will include this suggestion in the documentation.
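The warning suggested above could look roughly like the following sketch. It is not an existing ODE or frictionless-py feature; `DATE_LIKE`, `is_iso_date` and `date_warnings` are hypothetical names, and the regular expression is a rough heuristic for "three groups of digits separated by /, . or -".

```python
import re
from datetime import datetime

# Heuristic: digits, separator, digits, separator, digits (e.g. 4/1/2022, 04.01.22).
DATE_LIKE = re.compile(r"^\d{1,4}[/.\-]\d{1,2}[/.\-]\d{1,4}$")


def is_iso_date(value: str) -> bool:
    """Return True only for a zero-padded YYYY-MM-DD calendar date."""
    try:
        parsed = datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False
    return parsed.strftime("%Y-%m-%d") == value


def date_warnings(values: list[str]) -> list[str]:
    """List a message for each cell that looks like a date but is not ISO 8601."""
    messages = []
    for row_number, value in enumerate(values, start=1):
        text = value.strip()
        if DATE_LIKE.match(text) and not is_iso_date(text):
            messages.append(
                f"row {row_number}: '{text}' looks like a date but is not YYYY-MM-DD"
            )
    return messages


for message in date_warnings(["2022-01-04", "4/1/2022", "hello"]):
    print(message)
```

A check like this only reports the mismatch. It does not guess whether 4/1/2022 means 4 January or 1 April, which is left to the user.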


loleg commented 6 days ago

Huge opportunity to do the community a service in this area. I even suggested calling the project the "Open Date Editor". It is such a common issue that if we just get this right, we will save the day for many people.

Technically, the moment.js library is one of many excellent projects specializing in this, and we should consider "strongly" adopting it: not just using the library, but also having a conversation with its community.