VizierDB / vizier-scala

The Vizier kernel-free notebook programming environment

Interactive JSON shredder #225

Open okennedy opened 1 year ago

okennedy commented 1 year ago

What pain point is this feature intended to address? Please describe.

Spark presently supports only a very simplified approach to loading JSON data: it assumes one JSON record per line, with a relatively flat schema. You can shred the schema further, but doing so requires writing relatively complex queries.

Describe the solution you'd like

  1. An interactive textual environment for shredding schemas (e.g., something like jqp). Crucially, something that lets you see/browse the input, provides a textual way to specify a shred, and that reactively updates a shredded view.
  2. After 1 is implemented, it would be useful to "suggest" shreds (e.g., something like JXplain).
  3. After 1 is implemented, and with careful study, it would be useful to have some graphical widgets to automate declaring the shreds. For example, clicking on a dictionary key in the source or target data might add that path to the shred.
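
To make "a textual way to specify a shred" in item 1 concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the path syntax (`key.key`, with `[]` meaning "fan out over every list element") and the helper names `parse_path`, `extract`, and `shred` are hypothetical and are not part of Vizier, Spark, or jqp. It only shows the core idea of turning a textual spec into flat rows; a reactive view could re-run `shred` whenever the spec text changes.

```python
import json


def parse_path(text):
    """Turn "orders[].sku" into ["orders", "[]", "sku"] (hypothetical syntax)."""
    steps = []
    for part in text.split("."):
        fan_outs = 0
        while part.endswith("[]"):
            part = part[:-2]
            fan_outs += 1
        if part:
            steps.append(part)
        steps.extend(["[]"] * fan_outs)
    return steps


def extract(value, steps):
    """Return every value reachable from `value` by following `steps`.

    A step is either a dictionary key or "[]" (visit each list element).
    Missing keys and type mismatches produce no values rather than an error.
    """
    if not steps:
        return [value]
    step, rest = steps[0], steps[1:]
    if step == "[]":
        if not isinstance(value, list):
            return []
        return [found for item in value for found in extract(item, rest)]
    if isinstance(value, dict) and step in value:
        return extract(value[step], rest)
    return []


def shred(record, row_path, columns):
    """Emit one flat row per element matched by `row_path`.

    `columns` maps an output column name to a path relative to each row
    element; a column with no match is filled with None.
    """
    rows = []
    for element in extract(record, parse_path(row_path)):
        row = {}
        for name, path in columns.items():
            matches = extract(element, parse_path(path))
            row[name] = matches[0] if matches else None
        rows.append(row)
    return rows


# Made-up sample record, used only to exercise the sketch.
record = json.loads(
    '{"customer": {"name": "Ada"},'
    ' "orders": [{"sku": "A1", "qty": 2}, {"sku": "B7"}]}'
)
rows = shred(record, "orders[]", {"sku": "sku", "qty": "qty"})
for row in rows:
    print(row)
```

Separating the row path from the column paths keeps sibling fields (here `sku` and `qty`) aligned within a row, which a naive cross product of independent paths would not do.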

Describe alternatives you've considered

Presently, shredding JSON data has to be done manually through repeated SQL queries.