Motivation and Context
Previously, any Table(data=DataVariable(df)) would not show the index column when rendered.
Furthermore, when using Table with non-unique column names, the table would crash with Error: ServiceError: DataFrame columns must be unique for orient='records'.
Additionally, the table would not show any cells with a boolean value.
All of the above issues have been fixed. Any index columns are now automatically included in the table and pinned to the left. Both multi-index rows and hierarchical columns are also supported. The dtype information is propagated to the column level, automatically rendering all supported types, including DateTime.
Other small issues fixed:
Table columns field is now properly marked as optional.
Implementation Description
To avoid breaking existing APIs that rely on the orient='records' format, which omits the index information, the DataFrame is pre-processed internally: a prefix __index__{i}__ or __col__{i}__ is added to all columns before the data is sent over the network.
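The prefixing step can be sketched roughly as follows. This is a minimal illustration that works on plain lists of names rather than on a DataFrame; the helper name add_prefixes and the zero-based numbering are assumptions, not the actual implementation.

```python
def add_prefixes(index_names, column_names):
    """Return unique wire names for index levels and columns.

    Hypothetical helper: index levels get __index__{i}__ and regular
    columns get __col__{i}__, where i is the position (assumed zero-based).
    """
    prefixed_index = [f"__index__{i}__{name}" for i, name in enumerate(index_names)]
    prefixed_cols = [f"__col__{i}__{name}" for i, name in enumerate(column_names)]
    return prefixed_index + prefixed_cols


# Two columns share the name "name"; the position in the prefix keeps them apart.
wire_names = add_prefixes(["id"], ["name", "name", "age"])
print(wire_names)
# ['__index__0__id', '__col__0__name', '__col__1__name', '__col__2__age']
```

Because the position is part of every name, the result is unique even when the original column names are not, which is what orient='records' requires.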
When the table is rendered, the prefixes are stripped and used as col_id to map to the underlying data: __col__2__name -> name. This means that the getData() call for column resolution now needs to happen when the columns are user-provided as well, but the table still shows the provided columns immediately. One small limitation is the data returned by the onClickRow() handler. Since it returns data as an object, any duplicate keys will be lost. This behaviour is preserved from before to not break any existing APIs.
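A rough sketch of the reverse mapping and of the duplicate-key limitation, written in Python for illustration only (the real rendering code runs in the frontend). The helper strip_prefix, the regular expression, and the sample row are assumptions; which duplicate value survives in the actual onClickRow() payload is not stated, and in this sketch the last one wins.

```python
import re

# Matches a leading __index__{i}__ or __col__{i}__ prefix (assumed format).
PREFIX_RE = re.compile(r"^__(?:index|col)__\d+__")


def strip_prefix(col_id):
    """Recover the display name, e.g. __col__2__name -> name."""
    return PREFIX_RE.sub("", col_id)


# Hypothetical row as it might arrive over the network.
row = {"__col__0__name": "Ada", "__col__1__name": "Lovelace", "__col__2__age": 36}

# Keyed by the prefixed id, both "name" columns stay distinguishable.
headers = {col_id: strip_prefix(col_id) for col_id in row}

# Keyed by the stripped name, as in a plain object, the duplicates collapse.
clicked = {strip_prefix(col_id): value for col_id, value in row.items()}
print(clicked)
# {'name': 'Lovelace', 'age': 36}
```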
Multi-index rows are included as new columns on the left, whereas hierarchical columns such as [('P', 'alpha'), ('Q', 'beta')] are concatenated with _ before rendering.
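For illustration, the flattening of hierarchical column labels might look like the sketch below. The helper name flatten_column is hypothetical, and the handling of non-tuple labels is an assumption.

```python
def flatten_column(label):
    """Join the levels of a hierarchical column label with '_'.

    Hypothetical helper: tuple labels such as ('P', 'alpha') become
    'P_alpha'; any other label is simply converted to a string.
    """
    if isinstance(label, tuple):
        return "_".join(str(level) for level in label)
    return str(label)


columns = [("P", "alpha"), ("Q", "beta")]
flat = [flatten_column(label) for label in columns]
print(flat)
# ['P_alpha', 'Q_beta']
```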
A new endpoint /api/core/data-variable/${uid}/schema is responsible for column type information. It is set any time data is retrieved. An optional { schema: boolean } options field has been added to useDataVariable to control whether the schema should be included in the response.
Any new dependencies Introduced
How Has This Been Tested?
Tested locally. Fixed the existing tests and added a test for the internal format conversion.
PR Checklist:
[x] I have implemented all requirements? (see JIRA, project documentation).
[x] I am not affecting someone else's work, If I am, they are included as a reviewer.
[x] I have added relevant tests (unit, integration or regression).
[x] I have added comments to all the bits that are hard to follow.
[x] I have added/updated Documentation.
[x] I have updated the appropriate changelog with a line for my changes.
Screenshots (if appropriate):
Duplicate column names:
BEFORE: AFTER:
Multi-index rows and columns
BEFORE: AFTER: