Open cristinaclaverol opened 3 weeks ago
Hi @cristinaclaverol can you please share an extract of the file you ingested so we can run the tests?
Here is the file that @cristinaclaverol uploaded to the app: dades_test_02 - Hojas de cálculo de Google.csv
Steps that I followed:
I also checked the file on Numbers (Mac software for tabular data). I see that the column is also formatted as date.
Any idea of why the ODE is taking that column as STRING?
@romicolman I think this is a question related to frictionless-py, since we are using it to read the data and the schema. I would venture to answer that ODE is loading the column as a string because the date is not in ISO8601 format (YYYY-MM-DD).
A common data friction is that dates are usually written in different formats like YYYY/MM/DD or DD/MM/YY or MM/DD/YY or even YYYY-MM-DD (notice that I changed / for -). According to the Table Schema of the Data Package (https://specs.frictionlessdata.io/table-schema/#date), dates need to be in YYYY-MM-DD format. Since this data is DD/MM/YY, it is considered a string.
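The behaviour described above can be approximated with Python's standard library. This is only a minimal sketch of the idea, not frictionless-py's actual detection code, and the helper name `guess_type` is hypothetical:

```python
from datetime import date


def guess_type(values):
    """Return "date" only if every value is an ISO 8601 date, else "string".

    Hypothetical helper for illustration; not part of frictionless-py or ODE.
    """
    for value in values:
        try:
            date.fromisoformat(value)  # raises ValueError for e.g. "30/10/24"
        except ValueError:
            return "string"
    return "date"


print(guess_type(["2024-10-30", "2024-11-01"]))  # date
print(guess_type(["30/10/24", "01/11/24"]))      # string
```

A single value that does not match is enough for the whole column to fall back to string in this sketch.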
This is not an error nor a problem with the tool but rather a data friction that we need to solve/explain.
The Data Package specification defines what a date should look like, so in order to define the column as a date in Open Data Editor (which is based on Data Package), the user should change the values to respect the format of the specification (YYYY-MM-DD).
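One way to change the values before loading the file is a small conversion script. The following is a sketch using only the standard library; it assumes every value in the column really is DD/MM/YY, and the helper name `to_iso` is hypothetical:

```python
from datetime import datetime


def to_iso(value, source_format="%d/%m/%y"):
    """Convert a date string in source_format to YYYY-MM-DD.

    Hypothetical helper; the default format assumes DD/MM/YY input.
    Raises ValueError if the value does not match source_format.
    """
    return datetime.strptime(value, source_format).date().isoformat()


print(to_iso("30/10/24"))  # 2024-10-30
```

Passing a different `source_format` (for example `"%m/%d/%y"`) covers the other layouts mentioned above, as long as the layout of the column is known in advance.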
Ok, thanks for the explanation! The topic of the format of numerical data is very broad and complex, so it is a common source of problems when working with tables. Maybe it is not the objective of the tool, but marking inconsistencies in the numerical formats, or perhaps including a warning when numerical data appears that the app does not recognize, would be valuable information. I include this suggestion in the documentation.
Cristina Claverol
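For date columns, the kind of warning suggested here could look roughly like the following. It is a sketch under the assumption that the expected format is YYYY-MM-DD; `find_non_iso_dates` is a hypothetical helper, not an existing ODE or frictionless-py function:

```python
from datetime import date


def find_non_iso_dates(values):
    """Return (row_index, value) pairs that are not valid ISO 8601 dates.

    Hypothetical helper for illustration; row_index is zero-based.
    """
    problems = []
    for index, value in enumerate(values):
        try:
            date.fromisoformat(value)
        except ValueError:
            problems.append((index, value))
    return problems


column = ["2024-10-30", "30/10/24", "2024-11-01"]
for index, value in find_non_iso_dates(column):
    print(f"row {index}: {value!r} is not in YYYY-MM-DD format")
```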
On Wed, 30 Oct 2024 at 9:43, Patricio Del Boca wrote: [comment quoted above; see https://github.com/okfn/opendataeditor/issues/594#issuecomment-2446199751]
Huge opportunity to do the community a service in this area. I even suggested calling the project the "Open Date Editor". It's such a common issue that even if we just do this right, we will save the day for many people.
Technically, the moment.js library is one among many excellent technical projects specializing in this, and we should consider "strongly" adopting them - not just implementing the library, but also having a conversation with their community etc.
ODE reads dates like 4/1/2022 as strings even though those cells are formatted as datetime in Excel. When you change the field type to date in the metadata, it reports an error because it does not recognize the values as dates.
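Part of the difficulty with a value like 4/1/2022 is that, once it is handled as text, it is ambiguous. A short standard-library illustration (not ODE's or frictionless-py's code) shows the two readings:

```python
from datetime import datetime

value = "4/1/2022"

# The same text parses to two different dates depending on the assumed order.
day_first = datetime.strptime(value, "%d/%m/%Y").date()
month_first = datetime.strptime(value, "%m/%d/%Y").date()

print(day_first.isoformat())    # 2022-01-04
print(month_first.isoformat())  # 2022-04-01
```

Writing the value as YYYY-MM-DD removes the ambiguity, which is consistent with the format the specification asks for.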