CODAIT / text-extensions-for-pandas

Natural language processing support for Pandas dataframes.
Apache License 2.0

Extend HTML visualization to cover entire DataFrames. #203

Open frreiss opened 3 years ago

frreiss commented 3 years ago

Extend the widget to also display entire DataFrames that have one or more columns of span data, similar to how displacy.render() displays spans of multiple types at the same time.

The type/tag associated with each span should be configurable. It should be drawn either from the name of the column that holds the span (for example, if there are columns "arg1" and "arg2"), or from the value in a separate column of the same row (for example, "entity_type" with values like "Person", "Org", etc.).
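To make the two tagging modes concrete, here is a minimal sketch. It is not part of the library: `iter_tagged_spans`, its parameters, and the `(begin, end)` tuple representation of a span are all hypothetical, and rows are plain dicts such as those produced by `df.to_dict("records")`.

```python
from typing import Dict, Iterator, List, Optional, Tuple

# Hypothetical span representation: (begin, end) character offsets.
Span = Tuple[int, int]


def iter_tagged_spans(
    rows: List[Dict[str, object]],
    span_columns: List[str],
    tag_column: Optional[str] = None,
) -> Iterator[Tuple[Span, str]]:
    """Yield (span, tag) pairs from dict-like rows.

    If tag_column is None, the tag is the name of the column holding the span.
    Otherwise the tag is read from tag_column in the same row.
    """
    for row in rows:
        for col in span_columns:
            span = row.get(col)
            if span is None:
                continue  # no span in this cell
            tag = col if tag_column is None else row[tag_column]
            yield span, str(tag)


rows = [{"arg1": (0, 5), "arg2": (10, 15), "entity_type": "Person"}]

# Mode 1: tags come from the column names.
by_column = list(iter_tagged_spans(rows, ["arg1", "arg2"]))

# Mode 2: tags come from a separate column in the same row.
by_value = list(iter_tagged_spans(rows, ["arg1"], tag_column="entity_type"))
```

A single optional `tag_column` argument is one way to keep both modes behind the same call; a real implementation might prefer separate options.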

Since Pandas DataFrames already have a _repr_html_ method, the API for this new visualization should be a function that you pass a DataFrame through, similar to how displacy.render() takes a spaCy Doc object as input, and that returns a new object with IPython display hooks.
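As a rough illustration of the "pass-through function that returns an object with display hooks" idea, the sketch below uses only the standard library. `SpanView` and `render` are hypothetical names, not the library's API, and the input is simplified to a text plus `(begin, end, tag)` triples that would, under this assumption, be collected from the DataFrame's span columns. The only real hook relied on is `_repr_html_`, which IPython/Jupyter calls to display an object.

```python
import html
from typing import Iterable, List, Tuple

# Hypothetical input: (begin, end, tag) triples over a single target text.
TaggedSpan = Tuple[int, int, str]


class SpanView:
    """Hypothetical wrapper object that exposes the IPython _repr_html_ hook."""

    def __init__(self, text: str, spans: Iterable[TaggedSpan]) -> None:
        self.text = text
        self.spans: List[TaggedSpan] = sorted(spans)

    def _repr_html_(self) -> str:
        parts = []
        pos = 0
        for begin, end, tag in self.spans:
            if begin < pos:
                continue  # this simple sketch skips overlapping spans
            parts.append(html.escape(self.text[pos:begin]))
            parts.append(
                '<mark title="{}">{}</mark>'.format(
                    html.escape(tag), html.escape(self.text[begin:end])
                )
            )
            pos = end
        parts.append(html.escape(self.text[pos:]))
        return "".join(parts)


def render(text: str, spans: Iterable[TaggedSpan]) -> SpanView:
    """Return a displayable object instead of overriding the DataFrame's own repr."""
    return SpanView(text, spans)


view = render("Ann met Bob", [(8, 11, "Person"), (0, 3, "Person")])
markup = view._repr_html_()
```

In a notebook, leaving `view` as the last expression in a cell would trigger `_repr_html_` automatically, so the DataFrame's built-in table rendering stays untouched.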