mckinsey / vizro

Vizro is a toolkit for creating modular data visualization applications.
https://vizro.readthedocs.io/en/stable/
Apache License 2.0
2.61k stars 116 forks source link

AGGrid implementation #260

Closed maxschulz-COL closed 6 months ago

maxschulz-COL commented 8 months ago

Description

Implementation of the dash_ag_grid as alternative table function to be inserted in vm.Table

This PR is two things:

TLDR discussion

The discussion evolves around the question: where do we want implementation details of callable objects inserted into Graph and Table to lie: with the callable, or as an if distinction in the source code once the callable reaches that point. This also extends to implementation details of models such as Parameter and Filter, but is a slightly different question there.

The short answer is: of course the former (ie with the callable), but the consequences are at times a little tricky.

Base Version (https://github.com/mckinsey/vizro/pull/260/commits/96b6259cecda7fa89682c3b04d42c2f67b70c13f):

Initial implementation of AGGrid with the following key decisions:

This is for example code like

if model == `graph`:
    inputs.append(
          {
              "clickData": State(component_id=triggered_model.id, component_property="clickData"),
          }
      )
elif model == `aggrid`:
    inputs.append(...)
elif model == `dashtable`:
    ....

This is for example code that explains how to carry out actions such as filtering by clicking into a table

def apply_abc_filter_interaction():
        ....
        column = ctd_active_cell["value"]["column_id"]
        derived_viewport_data_row = ctd_active_cell["value"]["row"]
        clicked_data = ctd_derived_viewport_data["value"][derived_viewport_data_row][column]
        data_frame = data_frame[data_frame[column].isin([clicked_data])]
return data_frame

We need to determine different triggers:

    @validator("actions")
    def set_actions(cls, v, values):
        table_type = _get_table_type(values["figure"])
        if table_type == "DataTable":
            return _set_actions(v, values, "active_cell")
        elif table_type == "AgGrid":
            return _set_actions(v, values, "cellClicked")
        else:
            raise ValueError(f"Table type {table_type} not supported.")

Iteration on base (https://github.com/mckinsey/vizro/pull/260/commits/7d370c2248bb044a967b8d878dcfa54c9b17a259)

Abstraction approach (https://github.com/mckinsey/vizro/pull/260/commits/2c0e0a7a1dd002bb7e8318efcdbcd2dc08cd6f0a)

Which ultimate route do we want to go?

This means hard coding the distinction between returned components in the source code. It is the easiest solution to implement. However, any new return for callables inserted into Graph and Table (and in the future maybe vm.React?) would require us to modify the source code. How often that would occur and whether this is a problem remains to be debated. If we decide that we want to get the feature out ASAP, we should choose this approach.

This means making heavy use of function attributes and would follow a light agreement we had previously on this topic. However, making such extensive use of function attributes could also be seen as generally questionable, and it opens up the question why we are not using an object oriented approach where attributes, inheritance and other things come naturally (see next approach).

On the flip side, this looks to some extent much improved to the base version. If you follow the example above, the implementation details of the dash_data_table now lie in the same file as the dash_data_table. Creating the dash_ag_grid would have been a breeze e.g.

Currently we are following this taxonomy, but we may question whether it is the correct one. Some questions that arise:

Other open tasks

Screenshot

Notice

huong-li-nguyen commented 7 months ago

I'll take a proper look tomorrow if that's fine, but I already wanted to leave a comment saying that I love the PR description and the different iterations you've done. Amazing summary - it looks like lots of great stuff and interesting questions to ponder! 🚀 ❤️

petar-qb commented 7 months ago

First of all, Max, you’ve done excellent investigation 🚀, and thanks for that.

Let me recap my thoughts, and please correct me if I misunderstand something 🙏.

Base approach is probably good enough to support only AgGrid. The most important question here (the reason why the PR description looks like “spike issue”) is how to make vm.Table flexible enough to enable custom user underlying table implementation to work.

By setting only different vm.Table private properties (string properties e.g. _input_property(”active_cell”)) we do not achieve desired flexibility. The most basic reason is that sometimes input property has to be a dict or list of dash.States. (e.g. filter interaction for dash_table.DashTable).

One approach could be object oriented approach where we serve different implementations of _get_action_inputs (and _get_action_outputs) based on is the underlying table dash_table or ag_grid. This method could be overwritten by user in case that the underlying table is custom_user_table. Okay, but we also need to expose _filter_interaction_function so it can be overwritten too (filter_interaction functions differ based on the underlying table). The problem with object oriented solution is that we don't want to force our users to inherit vm.Table class if they need only to change the figure property, and we want to enable handling custom figures utilising the @capture("table") only instead. If you can remember any other difficulties about object oriented approach, please let me know. Currently the biggest issue with the object oriented approach is that we don’t want to inject dash dependencies in any model method that is not the build method.

Very Quick Question: Why? 😄

Okay.. Another approach is serving model_action_configuration through the function attributes. This approach has similar opportunities. There is a default configuration for predefined models and users are able to overwrite it and propagate their own custom_action_input_properties, custom_filter_interaction_function and so on. Ups and downs of using function attributes are:

Advantages:

Disadvantages:

So, If we want to enable only AgGrid - it should be fine. If we want to enable modifying how the model component would behave in the actions world then we should implement the solution on the Vizro library level (for every component: Dropdown, Button, Graph, Table). If we solve this problem generally it would mean that the custom react components, mathplotlib charts, and any custom tables will be automatically enabled to deal with actions.

Person A: “Is this implementation out of the scope of this PR?” Person B: “Yes it is, but it’s worth at least thinking about it before we merge AgGrid on the main.”

Solving component action's information agnostically on the app level is a broad topic and probably we need to exchange and align our thoughts on some PS session.

AnnMarieW commented 7 months ago

Just FYI Dash AG Grid V31 was just released. It would be good to update to the latest version here too.

https://community.plotly.com/t/dash-ag-grid-31-0-0-released-more-function-support-new-quartz-theme-cell-data-types-built-in-cell-editors-and-more/82036

maxschulz-COL commented 6 months ago

We decided to go with an approach that has multiple models per callable (ie one for Table, one for Grid). See here for implementation: https://github.com/mckinsey/vizro/pull/289

maxschulz-COL commented 6 months ago

Just FYI Dash AG Grid V31 was just released. It would be good to update to the latest version here too.

https://community.plotly.com/t/dash-ag-grid-31-0-0-released-more-function-support-new-quartz-theme-cell-data-types-built-in-cell-editors-and-more/82036

In the PR raised above I have taken over what I think are the most powerful out-of-the-box features of Dash AG Grid V31