frictionlessdata / tableschema-js

A JavaScript library for working with Table Schema.
http://frictionlessdata.io/
MIT License

Add an `infer` mode guaranteeing that a data sample is valid against the inferred schema #111

Open anuveyatsu opened 6 years ago

anuveyatsu commented 6 years ago

By default, the infer method reads only the first 100 rows when generating a schema. However, there are situations where we need to increase that limit, e.g., when the first 100 rows contain integers but decimal numbers appear later. If I set the limit option when calling infer, it still returns a schema that is not correct. Maybe I am doing something wrong:

https://runkit.com/anuveyatsu/tableschema-infer-not-working-properly

As you can see, the field type for "Value" is "integer"; however, there are decimal numbers (e.g., in row 215).

roll commented 6 years ago

@anuveyatsu I don't think it's a good idea to call an inferred schema correct or not correct. Inference is intended to be a schema bootstrap step. Currently it uses a fast algorithm based on type/format confidence.
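
To illustrate how a confidence-based approach can report "integer" even when the sample contains a decimal, here is a toy sketch. It is not the library's actual code: the candidate list, the `inferByConfidence` name, and the 0.75 threshold are all assumptions made for the example.

```javascript
// Toy illustration only; NOT the tableschema-js implementation.
// Candidate types, ordered from most to least specific (assumed list).
const CANDIDATES = [
  { type: 'integer', test: (v) => /^-?\d+$/.test(v) },
  { type: 'number', test: (v) => /^-?\d+(\.\d+)?$/.test(v) },
  { type: 'string', test: () => true },
]

// Return the first type whose share of matching values reaches the
// threshold. The 0.75 default is an assumption for this sketch.
function inferByConfidence(values, confidence = 0.75) {
  for (const { type, test } of CANDIDATES) {
    const matches = values.filter(test).length
    if (matches / values.length >= confidence) return type
  }
  return 'string'
}

// 214 integer-looking values followed by a single decimal.
const sampleValues = Array.from({ length: 214 }, (_, i) => String(i)).concat(['3.14'])
const confidenceResult = inferByConfidence(sampleValues)
console.log(confidenceResult) // 'integer'
```

Because 214 of 215 values look like integers, the threshold is met and the lone decimal is ignored, no matter how many rows are read.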

I think it's better to reformulate this issue as: support an infer mode that guarantees the provided sample is valid against the inferred schema.
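
A minimal sketch of what such a mode could look like, assuming a hypothetical `inferStrict` helper and a simplified candidate list (neither is part of the library): instead of accepting a type once enough values match, it accepts the most specific type that every value in the sample matches.

```javascript
// Hypothetical sketch; names and candidate list are illustrative only.
// Candidates are ordered from most to least specific.
const STRICT_CANDIDATES = [
  { type: 'integer', test: (v) => /^-?\d+$/.test(v) },
  { type: 'number', test: (v) => /^-?\d+(\.\d+)?$/.test(v) },
  { type: 'string', test: () => true },
]

// Pick the most specific type that ALL sample values satisfy, so the
// sample is valid against the inferred type by construction.
function inferStrict(values) {
  // 'string' accepts everything, so a match is always found.
  return STRICT_CANDIDATES.find(({ test }) => values.every(test)).type
}

// 214 integer-looking values followed by a single decimal.
const strictSample = Array.from({ length: 214 }, (_, i) => String(i)).concat(['3.14'])
const strictResult = inferStrict(strictSample)
console.log(strictResult) // 'number'
```

The trade-off is that every value in the sample must be checked against each candidate, so this is slower than a confidence-based pass, and a single malformed cell is enough to widen the type all the way to "string".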