microsoft / SandDance

Visually explore, understand, and present your data.
https://microsoft.github.io/SandDance
MIT License

Inference on rows #233

Open jkelleyrtp opened 4 years ago

jkelleyrtp commented 4 years ago

Is there a way to force SandDance to perform inference on all rows instead of just a sample, without reinventing the wheel here?

There's getColumnsFromData, but it takes the field names from just the first row. Our rows don't all have the same fields, and the first row is pretty sparse.

/**
 * Derive column metadata from the data array.
 * @param data Array of data objects.
 */
export function getColumnsFromData(data, columnTypes) {
    const sample = data[0];
    const fields = sample ? Object.keys(sample) : [];
    const inferences = Object.assign(Object.assign({}, VegaDeckGl.base.vega.inferTypes(data, fields)), columnTypes);
    const columns = fields.map(name => {
        const column = {
            name,
            type: inferences[name]
        };
        return column;
    });
    inferAll(columns, data);
    return columns;
}

I could just run getColumnsFromData on every row, I guess, and then pass the resulting columns in, as in #158?

Just curious if there's a more elegant way of going about it.

danmarshall commented 4 years ago

I think you'll need to pass them in, since you may need custom logic to resolve conflicts when two rows produce different inferences for the same column...
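
As a rough sketch of what that custom logic could look like, the snippet below infers a type for every column across all rows and widens the type when two rows disagree. Everything here is hypothetical and independent of SandDance: the helper names (inferValueType, resolveConflict, inferColumnTypesFromAllRows), the type labels, and the widening rule are assumptions made for illustration only.

```javascript
// Hypothetical helpers, not part of SandDance or Vega.

// Classify a single value; null/undefined give no evidence about the type.
function inferValueType(value) {
    if (value === null || value === undefined) return null;
    if (typeof value === 'boolean') return 'boolean';
    if (typeof value === 'number') return Number.isInteger(value) ? 'integer' : 'number';
    if (value instanceof Date) return 'date';
    return 'string';
}

// Assumed conflict rule: integer + number widens to number,
// any other disagreement falls back to string.
function resolveConflict(a, b) {
    if (a === b) return a;
    const numeric = ['integer', 'number'];
    if (numeric.includes(a) && numeric.includes(b)) return 'number';
    return 'string';
}

// Walk every row (not just the first) and build a name -> type map.
function inferColumnTypesFromAllRows(data) {
    const types = {};
    for (const row of data) {
        for (const name of Object.keys(row)) {
            const t = inferValueType(row[name]);
            if (t === null) {
                // Remember the column exists even if this value is empty.
                if (!(name in types)) types[name] = null;
                continue;
            }
            types[name] = types[name] ? resolveConflict(types[name], t) : t;
        }
    }
    // Columns that never had a non-null value default to string (an assumption).
    for (const name of Object.keys(types)) {
        if (types[name] === null) types[name] = 'string';
    }
    return types;
}
```

The resulting map could then be turned into whatever column shape you pass in; how that shape should look for SandDance is not covered by this sketch.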

jkelleyrtp commented 4 years ago

I got some of it working with the column injection from #158, but when it goes to render the view I get "Must set a field for x axis" which I think comes from this bit of code:

    updateViewerOptions(viewerOptions) {
        this.viewerOptions = Object.assign(Object.assign({}, SandDance.VegaDeckGl.util.deepMerge(defaultViewerOptions, this.viewerOptions, viewerOptions)), { tooltipOptions: {
                exclude: columnName => this.state.tooltipExclusions.indexOf(columnName) >= 0
            }, onColorContextChange: () => this.manageColorToolbar(), onDataFilter: (dataFilter, filteredData) => {
                const selectedItemIndex = Object.assign({}, this.state.selectedItemIndex);
                selectedItemIndex[DataScopeId.FilteredData] = 0;
                this.changeInsight({ filter: dataFilter, filteredData, selectedItemIndex });
                if (this.state.sideTabId === SideTabId.Data && this.state.dataScopeId === DataScopeId.FilteredData) {
                    //make sure item is active
                    // This one
                    requestAnimationFrame(() => filteredData && this.silentActivation(filteredData[0]));
                }

The first item in the data might not have the columns I selected, and when it doesn't, the view fails to render.

An example set:

let data = [
    {"a": 0, "b": 0},           // item1
    {"a": 0, "b": 0, "c": 0}    // item2
];

And if I filter by "c", then I get the "Must set a field for x axis" error. However, this does work:

let data = [
    {"a": 0, "b": 0, "c": 0},   // item1
    {"a": 0, "b": 0, "c": 0},   // item2
    {"a": 0, "b": 0}            // item3
];

So there's some sort of null handling, just not on the first row.

Have any ideas for a workaround?

danmarshall commented 4 years ago

Perhaps the quickest workaround is to run this check on the data yourself and add any missing properties to the first element.
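
A minimal sketch of that workaround, assuming plain-object rows: collect the union of keys across all rows, then copy the first row and fill in whatever it is missing. The function name padFirstRow is hypothetical, and using null as the placeholder is an assumption; depending on how the column types get inferred, a typed default may work better.

```javascript
// Hypothetical helper, not part of SandDance.
// Returns a new array whose first row contains every key seen in any row.
function padFirstRow(data, placeholder = null) {
    if (data.length === 0) return data;

    // Union of all property names across the data set.
    const allKeys = new Set();
    for (const row of data) {
        for (const key of Object.keys(row)) allKeys.add(key);
    }

    // Copy the first row so the caller's object is not mutated.
    const first = Object.assign({}, data[0]);
    for (const key of allKeys) {
        if (!(key in first)) first[key] = placeholder; // assumed placeholder value
    }
    return [first, ...data.slice(1)];
}

// Usage with the failing example from above:
const padded = padFirstRow([
    { a: 0, b: 0 },
    { a: 0, b: 0, c: 0 }
]);
// padded[0] now has the keys a, b and c
```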