CODAIT / text-extensions-for-pandas

Natural language processing support for Pandas dataframes.
Apache License 2.0

Fully interactive visualization supporting editing of the DataFrame #204

Closed by frreiss 2 years ago

frreiss commented 3 years ago

Extend the DataFrame visualization from #203 so that users can edit the DataFrame itself, enabling data cleaning and active learning applications. Major types of edits that would be useful:

With this editing support, it should be possible to build a data cleaning application end-to-end, entirely in the notebook. Early cells in the notebook load the data, perhaps train a model on it, and generate a DataFrame describing what was found in each document. The user then edits the DataFrames in place. Later cells in the notebook consume the results of the editing session, for example by retraining the model or writing out a new, corrected data set.
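A rough sketch of that notebook flow, using plain pandas only. The column names, the example rows, and the direct `.loc` assignments standing in for the interactive editing session are all assumptions made for illustration; none of this is the library's actual API.

```python
import pandas as pd

# Early cells: hypothetical model output, one row per entity mention found.
spans = pd.DataFrame(
    {
        "doc_id": [0, 0, 1],
        "text": ["IBM", "Armonk", "Jupyter"],
        "ent_type": ["ORG", "ORG", "PER"],  # rows 1 and 2 are model mistakes
    }
)

# Keep an untouched copy so the user's edits can be identified afterwards.
original = spans.copy()

# Editing session: stand-in for edits the user would make in the
# interactive view (assumed here to modify the DataFrame in place).
spans.loc[1, "ent_type"] = "LOC"
spans.loc[2, "ent_type"] = "MISC"

# Later cells: find the rows that were changed during the session ...
changed = spans.index[(spans != original).any(axis=1)].tolist()
corrections = spans.loc[changed]

# ... and serialize the corrected data set (or feed `corrections`
# back into model retraining).
corrected_csv = spans.to_csv(index=False)
```

Keeping a copy of the pre-edit DataFrame is one simple way to recover what the user changed; an editing widget could equally record an explicit edit log.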

This kind of interactivity and interplay between the JavaScript and the backing Python objects will require a JupyterLab widget.
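To make the Python half of that interplay concrete, here is a minimal sketch of a handler that applies one edit to the backing DataFrame. The message layout (`row`, `column`, `value`) and the function name `apply_edit` are hypothetical; in a real widget the message would arrive from the JavaScript front end through the widget's communication channel rather than being passed in by hand.

```python
import pandas as pd


def apply_edit(df: pd.DataFrame, message: dict) -> None:
    """Apply one hypothetical front-end edit message to `df` in place."""
    row, column = message["row"], message["column"]
    # Reject edits that point at a cell the DataFrame does not have,
    # instead of silently creating a new row or column.
    if row not in df.index or column not in df.columns:
        raise KeyError(f"no cell at row={row!r}, column={column!r}")
    df.at[row, column] = message["value"]


# Example: the front end reports that the user relabeled row 1.
spans = pd.DataFrame({"text": ["IBM", "Armonk"], "ent_type": ["ORG", "ORG"]})
apply_edit(spans, {"row": 1, "column": "ent_type", "value": "LOC"})
```

Because the handler mutates the same object that later notebook cells reference, those cells see the edits without any extra export step.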

frreiss commented 2 years ago

#238 implements most of this functionality, so closing this issue.