frictionlessdata / tableschema-js

A JavaScript library for working with Table Schema.
http://frictionlessdata.io/
MIT License

date parsing #156

Closed jgranduel closed 5 years ago

jgranduel commented 5 years ago

Hi,

I'm just discovering the project and I'm stuck on date parsing. Here are my files: a.csv

"date"
"2019-06-04"
"2019-01-01"

a.schema.json

{
    "fields": [
        {
            "name": "date",
            "type": "date",
            "format": "default"
        }
    ],
    "missingValues": [
        ""
    ]
}

loading in node:

> let t = await Table.load('./a.csv', { schema: './a.schema.json' })
> t.read({ keyed: true })
[ { date: 2019-06-03T22:00:00.000Z },
  { date: 2018-12-31T23:00:00.000Z } ]

Where could the error come from?!

I've tried the formats "YYYY-MM-DD" and "%y-%m-%d" with no success; "default" should work anyway. How should I format dates by default (ISO 8601 seems like a good idea)? Thanks in advance.

roll commented 5 years ago

@JDziurlaj Hi, it seems there is a problem with delimiter detection. It works for me if I provide the delimiter explicitly:

const {Table} = require('../src')

async function main() {
  const table = await Table.load('tmp/issue156.csv', {schema: 'tmp/issue156.json', delimiter:','})
  const rows = await table.read({keyed: true})
  return rows
}

main()
  .then(result => console.log(result))
  .catch(error => console.log(error))

jgranduel commented 5 years ago

Hi, thanks for your answer... But it doesn't change anything on my Win10/node-12.4.0 setup. Even with the delimiter explicitly set to ",", I still get

 [
  { date: 2019-06-03T22:00:00.000Z },
  { date: 2018-12-31T23:00:00.000Z }
]

As there's only one column, the delimiter shouldn't be a concern anyway! Sorry, but what steps does the date go through when it is parsed? Thx

roll commented 5 years ago

Sorry, then I don't understand. What error? That's the expected output.

jgranduel commented 5 years ago

Well, the date is shifted: "2019-06-04" -> "2019-06-03T22:00:00.000Z" and "2019-01-01" -> "2018-12-31T23:00:00.000Z"... Oh, my! You're right: it's a UTC issue, a 1- or 2-hour shift for winter/summer time.

let d = new Date()
d.getTimezoneOffset() -> -120
d.toLocaleDateString("fr-FR")
'2019-6-11'

Is there any way to force getting the "same original date", i.e. 2019-06-04 -> 2019-06-04? Maybe I've overlooked something in the 'format' option? Thanks, and sorry for the confusion!

roll commented 5 years ago

Yeah, I saw "error" and was looking for something like an exception. Datetimes are a real problem in JavaScript.

The specs say nothing about it: https://frictionlessdata.io/specs/table-schema/#date. But ISO 8601 says that it's local time:

Time zones in ISO 8601 are represented as local time (with the location unspecified), as UTC, or as an offset from UTC.

So the lib acts correctly here. Let me check how to achieve your goal using formats.
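
In the meantime, here is a minimal sketch of what is going on, using only the built-in Date. It assumes the parsed value is a Date at local midnight, which the output above suggests but which I have not checked in the library code. The helper name `formatLocalDate` is made up for this example and is not part of tableschema.

```javascript
// Assumption: the library hands back a Date at local midnight.
// new Date(year, monthIndex, day) builds exactly that (months are 0-based).
const parsed = new Date(2019, 5, 4)

// Hypothetical helper: format a Date using its local-time fields,
// which recovers the calendar date as written in the CSV.
function formatLocalDate(date) {
  const pad = n => String(n).padStart(2, '0')
  return `${date.getFullYear()}-${pad(date.getMonth() + 1)}-${pad(date.getDate())}`
}

// The console shows a Date in UTC, so in a UTC+2 zone this looks
// like the previous day at 22:00. The exact string depends on your timezone.
console.log(parsed.toISOString())

// The local-time view is the original calendar date in any timezone.
console.log(formatLocalDate(parsed)) // 2019-06-04
```

So the value itself is not wrong, it is only displayed in UTC. Reading it back through the local getters gives the date from the file.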

roll commented 5 years ago

This one works for me:

const {Table} = require('../src')

async function main() {
  const data = [
    ['date'],
    ['2019-06-04Z'],
    ['2019-06-01Z'],
  ]
  const schema = {
    fields: [
      {name: 'date', type: 'date', format: '%Y-%m-%d%Z'},
    ]
  }
  const table = await Table.load(data, {schema, delimiter:','})
  const rows = await table.read({keyed: true})
  return rows
  // [ { date: 2019-06-04T00:00:00.000Z },
  //   { date: 2019-06-01T00:00:00.000Z } ]

}

main()
  .then(result => console.log(result))
  .catch(error => console.log(error))
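
To get the original date string back from such a value, a small sketch with the built-in Date only (this is not a tableschema API, and the helper name `toUtcDateString` is made up for the example). It relies on the values being at UTC midnight, as in the output shown in the comment above:

```javascript
// Hypothetical helper: take the UTC view of the Date and keep the
// YYYY-MM-DD part of the ISO string.
function toUtcDateString(date) {
  return date.toISOString().slice(0, 10)
}

// Stand-in for one row returned by table.read({keyed: true}) above.
const row = {date: new Date('2019-06-04T00:00:00.000Z')}

console.log(toUtcDateString(row.date)) // 2019-06-04
```

This only works for values parsed as UTC. For a Date at local midnight, the UTC view can fall on the previous or next day.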

jgranduel commented 5 years ago

Thanks a lot!