VizierDB / vizier-scala

The Vizier kernel-free notebook programming environment

Merge Datasets Cell #34

Open okennedy opened 5 years ago

okennedy commented 5 years ago

It would be useful to have a more robust interface for the Schema Matching cell, specialized for the most common use case: merging datasets from different sources.

This cell would need to:

  1. Allow users to define an output schema
  2. Allow users to map columns from two or more tables to the output schema
  3. (Maybe) allow some limited transformation? Probably not, though, since this can just be added as another workflow step.
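
To make requirements 1 and 2 concrete, here is a minimal sketch in plain Python. It is not Vizier code: `merge_datasets`, the `rows`/`mapping` layout, and the sample column names are all hypothetical, chosen only to illustrate the idea of a user-defined output schema plus a per-table column mapping.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def merge_datasets(output_schema: List[str], sources: List[Dict[str, Any]]) -> List[Row]:
    """Union several tables into one table with the given output schema.

    Each source is a dict with (hypothetical) keys:
      "rows":    a list of row dicts
      "mapping": {output column name -> column name in this source}
    Output columns that a source does not map are filled with None.
    """
    merged: List[Row] = []
    for source in sources:
        mapping = source["mapping"]
        # Reject mappings that target a column missing from the output schema.
        unknown = set(mapping) - set(output_schema)
        if unknown:
            raise ValueError(f"mapping targets unknown output columns: {sorted(unknown)}")
        for row in source["rows"]:
            merged.append({
                col: row.get(mapping[col]) if col in mapping else None
                for col in output_schema
            })
    return merged


customers_a = [{"name": "Ada", "zip": "14260"}]
customers_b = [{"full_name": "Grace", "postal_code": "10001", "phone": "555-0100"}]

merged = merge_datasets(
    ["name", "postcode", "phone"],
    [
        {"rows": customers_a, "mapping": {"name": "name", "postcode": "zip"}},
        {"rows": customers_b, "mapping": {"name": "full_name", "postcode": "postal_code", "phone": "phone"}},
    ],
)
```

In this sketch the first table has no `phone` column, so its rows get `None` there. Any transformation (requirement 3) is left to a later workflow step.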

Ideally, we'd have a way to infer best guesses for 1 and 2 (e.g., based on the same tricks we're using in the Schema Matching cell right now).
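
As a rough illustration of what "best guesses" could look like, the sketch below proposes an output schema and a column mapping from column-name similarity alone. This is an assumption for illustration, not the technique the Schema Matching cell actually uses; `guess_output_schema`, `guess_mapping`, and the thresholds are hypothetical. It relies only on `difflib.SequenceMatcher` from the Python standard library.

```python
from difflib import SequenceMatcher
from typing import Dict, List


def normalize(name: str) -> str:
    """Lowercase a column name and drop non-alphanumeric characters."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two normalized column names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def guess_output_schema(tables: List[List[str]], threshold: float = 0.8) -> List[str]:
    """Guess for (1): keep each column unless it resembles one already kept."""
    schema: List[str] = []
    for columns in tables:
        for col in columns:
            if not any(similarity(col, existing) >= threshold for existing in schema):
                schema.append(col)
    return schema


def guess_mapping(output_schema: List[str], source_columns: List[str],
                  threshold: float = 0.6) -> Dict[str, str]:
    """Guess for (2): greedily pair each output column with its most similar source column."""
    guesses: Dict[str, str] = {}
    taken = set()
    for out_col in output_schema:
        candidates = [(similarity(out_col, c), c) for c in source_columns if c not in taken]
        if not candidates:
            continue
        score, best = max(candidates)
        if score >= threshold:
            guesses[out_col] = best
            taken.add(best)
    return guesses


schema = guess_output_schema([["name", "zip_code"], ["Name", "ZipCode", "phone"]])
mapping = guess_mapping(schema, ["Name", "ZipCode", "phone"])
```

Guesses like these would only pre-fill the cell; the user would still confirm or correct them.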

okennedy commented 3 years ago

Some design sketches:

[image: design sketch 1]

[image: design sketch 2]