UBOdin / mimir

Data-ish exploration through SQL+Uncertainty
http://mimirdb.info
Apache License 2.0
27 stars 13 forks source link

Revise Lenses into special operators (like they are in Adaptive Schemas) #344

Open okennedy opened 5 years ago

okennedy commented 5 years ago

Lens/Model Roles

  1. Tag data values as uncertain/problematic.
  2. Provide a human-readable description of the reason that the value is uncertain/problematic.
  3. Produce a "best guess" value (that is problematic)
  4. Provide an identifier to allow uncertain values to be acknowledged (un-tagged).
  5. Identify a procedure by which the user can address the error / repairing the value.
  6. Optionally provide a facility for sampling alternative values.
okennedy commented 5 years ago

Limitations of the Existing Approach

  1. A single opaque blob (the model) handles everything, so it needs to be serialized and passed around.
  2. Query structure is fixed at the time of lens creation (no option-driven rewrites)
  3. Feedback has to go through the model, when often a lighter-weight fix (e.g., manually replacing null values) exists.
  4. Sampling is not always feasible (e.g., no well-defined distribution exists, or the domain is infinite)
  5. Acknowledgements are handled by the big opaque blob, adding more junk to be serialized and passed around.