Open aborruso opened 4 years ago
Love it!
The place where this would land is in the detect package: https://github.com/qri-io/dataset/blob/d12a66b92250109b67cd1b74bca763baa0b847e4/detect/detect.go#L39-L47
We should add a FormatConfig function to detect that detects format configuration based on a data format. In the case of CSV files, it should sniff the delimiter.
It could also be used to clean up subsequent calls within the detect package itself, which uses a baseline format configuration for CSV files:
func CSVSchema(resource *dataset.Structure, data io.Reader) (schema map[string]interface{}, n int, err error) {
	tr := dsio.NewTrackedReader(data)
	r := csv.NewReader(replacecr.Reader(tr))
	r.FieldsPerRecord = -1
	r.TrimLeadingSpace = true
	r.LazyQuotes = true
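As a rough illustration of how that baseline could pick up a detected delimiter, here is a minimal sketch using only the standard library. The newCSVReader and readAllString helpers, the plain map standing in for a format config, and the "separator" key are all assumptions made for this example, not qri's actual API. The qri-specific wrappers from the excerpt above (dsio.NewTrackedReader, replacecr.Reader) are left out.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strings"
)

// newCSVReader (hypothetical helper) builds a csv.Reader with the same lenient
// baseline settings as the excerpt above. If formatConfig holds a
// single-character string under the assumed "separator" key, that character
// replaces the default comma.
func newCSVReader(data io.Reader, formatConfig map[string]interface{}) *csv.Reader {
	r := csv.NewReader(data)
	r.FieldsPerRecord = -1
	r.TrimLeadingSpace = true
	r.LazyQuotes = true
	// Indexing a nil map is safe in Go, so a missing config keeps the default.
	if sep, ok := formatConfig["separator"].(string); ok {
		if runes := []rune(sep); len(runes) == 1 {
			r.Comma = runes[0]
		}
	}
	return r
}

// readAllString (hypothetical helper) parses s using the given format config.
func readAllString(s string, formatConfig map[string]interface{}) ([][]string, error) {
	return newCSVReader(strings.NewReader(s), formatConfig).ReadAll()
}

func main() {
	cfg := map[string]interface{}{"separator": ";"}
	recs, err := readAllString("a;b\n1;2\n", cfg)
	if err != nil {
		panic(err)
	}
	fmt.Println(recs) // prints [[a b] [1 2]]
}
```

With a nil or empty config the reader keeps the comma default, so existing callers would behave as before.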
If detect.FromReader infers & returns Structure.FormatConfig, it'll bubble up into qri here and should "just work": https://github.com/qri-io/qri/blob/aed31e903d07af8e805d5290934e10f41e95ae21/base/dataset_prepare.go#L178-L188
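For the sniffing step itself, one simple heuristic is to sample the first few lines and pick the candidate character that appears the same non-zero number of times on every line, ignoring anything inside double quotes. The sketch below is only that heuristic, not an existing qri function: SniffDelimiter, SniffDelimiterString, countOutsideQuotes and the candidate list are names and choices assumed for this example. It works line by line, so quoted fields that contain newlines would confuse it.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// candidateDelimiters is the set of separators this sketch considers
// (an assumption; a real implementation might accept more).
var candidateDelimiters = []rune{',', ';', '\t', '|'}

// countOutsideQuotes counts occurrences of delim in line, skipping
// characters that sit inside double-quoted fields.
func countOutsideQuotes(line string, delim rune) int {
	n, inQuotes := 0, false
	for _, r := range line {
		switch {
		case r == '"':
			inQuotes = !inQuotes
		case r == delim && !inQuotes:
			n++
		}
	}
	return n
}

// SniffDelimiter samples up to maxLines non-blank lines from r and returns the
// candidate whose per-line count is identical on every sampled line and
// highest among the candidates. It falls back to ',' when nothing qualifies.
func SniffDelimiter(r io.Reader, maxLines int) (rune, error) {
	scanner := bufio.NewScanner(r)
	var lines []string
	for len(lines) < maxLines && scanner.Scan() {
		if line := scanner.Text(); strings.TrimSpace(line) != "" {
			lines = append(lines, line)
		}
	}
	if err := scanner.Err(); err != nil {
		return ',', err
	}
	best, bestCount := ',', 0
	for _, d := range candidateDelimiters {
		count, consistent := 0, len(lines) > 0
		for i, line := range lines {
			c := countOutsideQuotes(line, d)
			if i == 0 {
				count = c
			} else if c != count {
				consistent = false
				break
			}
		}
		if consistent && count > bestCount {
			best, bestCount = d, count
		}
	}
	return best, nil
}

// SniffDelimiterString is a convenience wrapper for in-memory samples.
func SniffDelimiterString(s string) rune {
	d, _ := SniffDelimiter(strings.NewReader(s), 20)
	return d
}

func main() {
	sample := "name;city;note\nAnna;Rome;\"a, b\"\nLuca;Milan;ok\n"
	fmt.Printf("%q\n", SniffDelimiterString(sample)) // prints ';'
}
```

The detected rune could then be stored in the format configuration, so later reads of the same body use the same separator.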
What feature or capability would you like?
A lot of CSVs in the world are not separated by a comma (,). It would be great to infer the separator so that qri can read every kind of CSV.
Do you have a proposed solution?
No, but I'm adding the Python way to do it