causalens / dara

Dara is a dynamic application framework designed for creating interactive web apps with ease, all in pure Python.
https://dara.causalens.com
Apache License 2.0

Improvement: DO-3494: Automatically show index columns for Table, Support duplicate column names, Fix invisible boolean values in Table cells #353

Closed Roman-Kornev closed 2 months ago

Roman-Kornev commented 2 months ago

Motivation and Context

All of the above issues have been fixed. Any index columns are now automatically included in the table and pinned to the left. Both multi-index rows and hierarchical columns are also supported. The dtype information is propagated to the column level, automatically rendering all supported types, including DateTime.

Other small issues fixed:

Implementation Description

To avoid breaking existing APIs that rely on the orient='records' format, which omits index information, the DataFrame is pre-processed internally: a prefix __index__{i}__ or __col__{i}__ is added to every column name before the data is sent over the network.
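The pre-processing step can be illustrated with a minimal sketch. The helper name add_internal_prefixes, the fallback label "index" for unnamed index levels, and the exact numbering of {i} are assumptions made for illustration; this is not Dara's actual implementation.

```python
import pandas as pd


def add_internal_prefixes(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: turn index levels into columns and prefix every name by position."""
    data = {}
    # Index levels come first so they can later be pinned to the left
    for i in range(df.index.nlevels):
        name = df.index.names[i]
        label = "index" if name is None else str(name)  # assumed fallback label
        data[f"__index__{i}__{label}"] = df.index.get_level_values(i).to_numpy()
    # Columns are addressed by position, so duplicate names get distinct keys
    for i, col in enumerate(df.columns):
        data[f"__col__{i}__{col}"] = df.iloc[:, i].to_numpy()
    return pd.DataFrame(data)


df = pd.DataFrame({"A": [1, 2, 3], "B": [1.1, 2.2, 3.3]})
dup = pd.concat([df, pd.DataFrame({"A": ["x", "y", "z"]})], axis=1)

prefixed = add_internal_prefixes(dup)
# orient='records' now keeps both 'A' columns and the index
records = prefixed.to_dict(orient="records")
```

Because each key embeds the column position, the records format no longer collapses columns that share a name.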

When the table is rendered, the prefixes are stripped and used as col_id to map to the underlying data: __col__2__name -> name. This means that the getData() call for column resolution now needs to happen when the columns are user-provided as well, but the table still shows the provided columns immediately. One small limitation concerns the data returned by the onClickRow() handler: since it returns each row as an object, any duplicate keys are lost. This behaviour is unchanged from before, so as not to break existing APIs.
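A short sketch of the stripping step and of the duplicate-key limitation. The regular expression and the helper name display_name are assumptions for illustration; the real table component is not written in Python.

```python
import re

# Assumed shape of the internal prefix: __index__{i}__ or __col__{i}__
PREFIX_RE = re.compile(r"^__(?:index|col)__\d+__")


def display_name(col_id: str) -> str:
    """Hypothetical helper: strip the internal prefix to recover the visible name."""
    return PREFIX_RE.sub("", col_id, count=1)


# One row in the prefixed format, with two columns that are both called 'A'
row = {"__col__0__A": 1, "__col__1__B": 1.1, "__col__2__A": "x"}

# Returning the row as a plain object keyed by visible name drops one 'A'
plain_row = {display_name(col_id): value for col_id, value in row.items()}
```

Here plain_row keeps only the last value seen for 'A', which mirrors the limitation described for onClickRow().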

Multi-index rows are included as new columns on the left, whereas hierarchical columns such as [('P', 'alpha'), ('Q', 'beta')] are concatenated with _ before rendering.
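The flattening described above can be approximated in plain pandas. This is a sketch of the visible result only, using deterministic data; how Dara performs the step internally is not shown in this description.

```python
import numpy as np
import pandas as pd

index = pd.MultiIndex.from_product(
    [["A", "B"], ["X", "Y", "Z"]], names=["letter", "symbol"]
)
columns = pd.MultiIndex.from_product(
    [["P", "Q"], ["alpha", "beta"]], names=["group", "subgroup"]
)
df = pd.DataFrame(np.arange(24).reshape(6, 4), index=index, columns=columns)

flat = df.copy()
# Join each column tuple, e.g. ('P', 'alpha') -> 'P_alpha'
flat.columns = ["_".join(map(str, levels)) for levels in df.columns]
# Move the index levels into ordinary columns on the left
flat = flat.reset_index()
```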

A new endpoint /api/core/data-variable/${uid}/schema is responsible for column type information. It is set any time data is retrieved. An optional { schema: boolean } options field is added to useDataVariable to control whether the schema should be included in the response.
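To give a sense of what per-column type information could look like, here is a rough sketch that derives a type label for each column from its dtype. The function name describe_columns, the label names, and the list-of-dicts shape are all assumptions; the actual response format of the schema endpoint is not specified in this description.

```python
import pandas as pd
from pandas.api import types as ptypes


def describe_columns(df: pd.DataFrame) -> list[dict]:
    """Hypothetical helper: map each column's dtype to a coarse type label."""
    fields = []
    # Iterate positionally so duplicate column names are each reported
    for name, dtype in zip(df.columns, df.dtypes):
        if ptypes.is_bool_dtype(dtype):  # check bool before numeric
            kind = "boolean"
        elif ptypes.is_datetime64_any_dtype(dtype):
            kind = "datetime"
        elif ptypes.is_numeric_dtype(dtype):
            kind = "number"
        else:
            kind = "string"
        fields.append({"name": str(name), "type": kind})
    return fields


df = pd.DataFrame({
    "flag": [True, False],
    "when": pd.to_datetime(["2024-01-01", "2024-01-02"]),
    "value": [1.5, 2.5],
    "label": ["a", "b"],
})
schema = describe_columns(df)
```

With type labels like these available per column, a renderer can choose an appropriate cell formatter for booleans and datetimes instead of treating every value as text.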

Any new dependencies Introduced

How Has This Been Tested?

Tested locally. Fixed the existing tests and added a test for the internal format conversion.

PR Checklist:

Screenshots (if appropriate):

image

Duplicate column names:

import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [1.1, 2.2, 3.3],
})
# Concatenate column-wise so that 'A' and 'B' each appear twice
df2 = pd.concat([df, pd.DataFrame({
    'A': ['x', 'y', 'z'],
    'B': [1.1, 2.2, 3.3],
})], axis=1)
df2

BEFORE: image AFTER: image

Multi-index rows and columns

import numpy as np
import pandas as pd

# Two-level row index and two-level (hierarchical) columns
index = pd.MultiIndex.from_product([['A', 'B'], ['X', 'Y', 'Z']], names=['letter', 'symbol'])
columns = pd.MultiIndex.from_product([['P', 'Q'], ['alpha', 'beta']], names=['group', 'subgroup'])

df = pd.DataFrame(np.random.randn(6, 4), index=index, columns=columns)
df

BEFORE: image AFTER: image