Closed by rufuspollock 6 years ago
Working delimiters:
FIXED, available from v0.7.0 of the CLI tool - try the latest version here http://datahub.io/download
Manual test with the multi-delimiter dataset (from the test-data repo) FAILED
File "/usr/local/lib/python3.6/site-packages/tableschema/schema.py", line 138, in cast_row
raise exceptions.CastError(message)
tableschema.exceptions.CastError: Row length 1 doesn't match fields count 3
Maybe the test dataset is not valid; either way, we need to investigate.
Now works; the cause was an invalid descriptor in a test dataset.
FIXED:
Users can now push datasets and files with different delimiters.
As a Publisher I want to be able to provide a CSV with semicolon delimiters or other csv variations and have the app handle this automatically so that I don't have to set this stuff by hand in a datapackage.json
See e.g. https://github.com/datahubio/qa/issues/35
Acceptance criteria
Tasks
Analysis
All cases failed except the comma-separated file. Please check https://datahub.io/Mikanebu/test-data-for-different-separators/v/2
The sample is located here: https://github.com/Mikanebu/qa-test-datasets
Other samples: https://github.com/frictionlessdata/test-data/tree/master/data-files/csv/separators
Related issue: https://github.com/datahq/datahub-qa/issues/35
Also, if there is a datapackage.json that describes all files:
Variations
We want the following delimiters to be supported:
[',', ';', ':', '|', '\t', '^', '*', '&']
We also want to guess the quote character along with the delimiter, but still default to the double quote.
As for line endings, we don't want to do anything, as the CSV parser library we're using handles them internally.
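The delimiter and quote guessing described here could be sketched roughly as follows. This is only an illustration, not the actual data.js code: `guessDialect` and `countOutsideQuotes` are hypothetical names, and the heuristic (pick the delimiter that appears the same non-zero number of times on every sampled line, ignoring quoted text) is an assumption about one reasonable approach. Double quote is tried first and wins ties, so it remains the default.

```javascript
// Candidate delimiters, taken from the list in this issue.
const DELIMITERS = [',', ';', ':', '|', '\t', '^', '*', '&'];
// Double quote comes first so that it wins ties and stays the default.
const QUOTES = ['"', "'"];

// Count occurrences of `delimiter` in `line`, skipping text inside `quote` pairs.
function countOutsideQuotes(line, delimiter, quote) {
  let count = 0;
  let inQuotes = false;
  for (const ch of line) {
    if (ch === quote) inQuotes = !inQuotes;
    else if (ch === delimiter && !inQuotes) count += 1;
  }
  return count;
}

// Hypothetical helper: guess { delimiter, quoteChar } from a text sample.
function guessDialect(sample, maxLines = 10) {
  const lines = sample
    .split(/\r?\n/)
    .filter((line) => line.length > 0)
    .slice(0, maxLines);

  // Fallback when nothing matches: plain comma-separated, double-quoted.
  let best = { delimiter: ',', quoteChar: '"' };
  let bestScore = 0;

  for (const quoteChar of QUOTES) {
    for (const delimiter of DELIMITERS) {
      const counts = lines.map((l) => countOutsideQuotes(l, delimiter, quoteChar));
      // A candidate must occur on the first line and equally often on every line.
      const consistent = counts[0] > 0 && counts.every((c) => c === counts[0]);
      // Strict comparison: earlier candidates win ties.
      if (consistent && counts[0] > bestScore) {
        best = { delimiter, quoteChar };
        bestScore = counts[0];
      }
    }
  }
  return best;
}
```

For example, `guessDialect('a;b;c\n1;2;3\n')` yields a semicolon delimiter with the default double quote. A heuristic like this can still be fooled, for instance by free text containing colons, so an explicit setting in datapackage.json would presumably still need to take precedence.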
Identify relevant place for its use
This should be implemented in the data.js library: /parser/csv.js, which is the most relevant place.