FabienArcellier / writer-framework

No-code in the front, Python in the back. An open-source framework for creating data apps.
https://streamsync.cloud
Apache License 2.0

design propositions for dataframe editor #63

Open FabienArcellier opened 3 months ago

FabienArcellier commented 3 months ago

This ticket specifies several proposals for rewriting the DataFrame component. I was inspired by Mikita's work on the AI part for these proposals.

User interface

[OK] - P1. Adding a record to a dataframe editor is done using a button that appears on hover at the bottom of the dataframe editor. The click triggers the wf-dataframe-add event.

illustration of interaction from obsidian Peek 2024-06-12 16-37

def on_record_add(state, payload):
    payload.record['sales'] = 0 # default value inside the dataframe
    state['df'].record_add(payload)

[OK] - P2. A quick action menu on the left appears on hover and allows you to delete a record / add your own actions. This menu implements delete a record as the default action. The actions would be accessible in the wf-record-action event.

illustration of interaction from monday.com Peek 2024-06-12 16-27

Here is an example of content for the "quick actions" field with 3 items in the menu.

{
    'remove': "Remove record",
    'important': "Important",
    'open': "Open record"
}

def on_record_action(state, payload):
    """
    This event corresponds to a quick action in the drop-down menu to the left of the dataframe.
    """
    if payload['action'] == 'remove':
        state['df'].record_remove(payload)
    if payload['action'] == 'important':
        state['df'].record(payload.id).update('flag', True) # update the column flag of the dataframe to true, triggers a record_update mutation
    if payload['action'] == 'open':
        state['record'] = state['df'].record(payload.id)

[OK] - P3. A user can select multiple rows of a dataframe in the dataframe editor. Selecting a record triggers the wf-dfeditor-select event.

def on_record_select(state, payload):
    state['df_selection'] = payload # [12, 13, 24, 26]

[OK] - P4. moving records is not supported. This would be a plus but I don't know how to implement it. The ideas I have would only be compatible with a specific dataframe format. This would also prevent sorting on the columns.

[LATER] - P5. implement a data repeater that works as a repeater over the records. This part is for later (to be described). This feature would allow customizing the rendering.

[DISCUSSION] - P6. the event and fields from frontend

| type | name | format | default | description |
|---|---|---|---|---|
| field | data | | | must be a state reference to an editable dataframe or a supported dataframe |
| field | show index | bool | False | show the dataframe's index |
| field | enable download | bool | False | enable a user to download the dataframe |
| field | selection mode | list | no selection | one of: no selection, single, multiple selection |
| field | actions | record | {"remove": "Remove"} | enable quick actions on a specific record |
| field | enable record add | bool | True | |
| field | enable record update | bool | True | |
| field | enable sort | bool | True | enable a user to modify the sort on the frontend |
| event | wf-dataframe-add | | - | triggered when a user adds a record |
| event | wf-dataframe-update | | - | |
| event | wf-dataframe-action | | - | |
| event | wf-dataframe-select | | - | |

# wf-dataframe-add: the event is sent when the user validates their input;
# if they press ESC, the creation is canceled
{
  "record": {"a":1 , "b": 2}
}

# wf-dataframe-update: the event is sent when the user validates a change;
# if they press ESC, the change is canceled
{
  "record_index": 12,
  "record": {"a":1 , "b": 2}
}

# wf-dataframe-action: the event is sent when the user triggers the action
{
  "record_index": 12,
  "action": "remove"
}

# wf-dataframe-select: the event is sent when the user changes the selection in the dataframe
[12, 14, 15, 24, 26]
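
To make the four payload shapes above concrete, here is a minimal sketch of a check a handler could run before using a payload. The function `is_valid_payload` is hypothetical and is not part of writer; it only mirrors the example payloads listed above, which are themselves still under discussion.

```python
from typing import Any


def is_valid_payload(event: str, payload: Any) -> bool:
    """Hypothetical helper: check a payload against the shapes proposed in P6."""
    if event == "wf-dataframe-add":
        # {"record": {...}}
        return isinstance(payload, dict) and isinstance(payload.get("record"), dict)
    if event == "wf-dataframe-update":
        # {"record_index": 12, "record": {...}}
        return (
            isinstance(payload, dict)
            and isinstance(payload.get("record_index"), int)
            and isinstance(payload.get("record"), dict)
        )
    if event == "wf-dataframe-action":
        # {"record_index": 12, "action": "remove"}
        return (
            isinstance(payload, dict)
            and isinstance(payload.get("record_index"), int)
            and isinstance(payload.get("action"), str)
        )
    if event == "wf-dataframe-select":
        # [12, 14, 15, 24, 26]
        return isinstance(payload, list) and all(isinstance(i, int) for i in payload)
    # Unknown event names are rejected.
    return False
```

For example, `is_valid_payload("wf-dataframe-select", [12, 14])` passes, while an update payload without `record_index` is rejected.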

Backend

[OK] - P7. a dataframe is re-encoded in writer. Like the AI component, this provides pre-existing operations to process the editor payloads triggered in the UI, and lets us have our own serializer. We can use the same object for the current dataframe reader (to be challenged). The frontend component must also be compatible with a simple dataframe, but in this case the exchange protocol will be heavier. We return the dataframe at each change (see the section on the protocol).

# two alternatives for the same state key
initial_state = wf.init_state({
    "df": wf.EditableDataframe(df), # helper with operation and optimized protocol
    # "df": df # raw dataframe, user has to reimplement everything (straight forward and user friendly for simple binding)
})

[OK] - P8. changes in the dataframe editor trigger events. These events are easy to process because the wf.EditableDataframe object has methods that directly process the payloads.

import writer as wf

def on_record_add(state, payload):
    payload.record['sales'] = 0 # default value inside the dataframe
    state['df'].record_add(payload)

def on_record_change(state, payload):
    state['df'].record_update(payload)

def on_record_action(state, payload):
    """
    This event corresponds to a quick action in the drop-down menu to the left of the dataframe.
    """
    if payload.action == 'remove':
        state['df'].record_remove(payload)
    if payload.action == 'important':
        state['df'].record(payload.id).update('flag', True) # update the column flag of the dataframe to true, triggers a record_update mutation
    if payload.action == 'open':
        state['record'] = state['df'].record(payload.id)

def on_record_select(state, payload):
    state['df_selected_record'] = payload['selected']

# trigger on button outside the dataframe editor
def on_save_click(state, payload):
    state['df'].df.to_csv('file.csv')

initial_state = wf.init_state({
    "df": wf.EditableDataframe(
        df
    )
})

# snippets
state['df'].df # get the real dataframe datastructure
state['df'].df = df # set a new dataframe, send the entire dataframe
state['df'].record_update(payload) # update a record using frontend payload
state['df'].record_remove(payload) # remove a record using frontend payload
state['df'].record_add(payload) # add a new record using frontend payload
state['df'].record(12) # get a record as dict
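
To illustrate the operations listed in the snippets, here is a minimal sketch of what such a wrapper could look like. It is not the proposed implementation: `EditableRecords` is a hypothetical stand-in backed by a plain list of records instead of a pandas or polars dataframe, and it assumes the payload keys `record` and `record_index` from P6.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


class EditableRecords:
    """Hypothetical stand-in for wf.EditableDataframe, backed by a list of dicts."""

    def __init__(self, records: List[Record]) -> None:
        # Copy the input so the caller's list is never mutated.
        self._records: List[Record] = [dict(r) for r in records]

    @property
    def df(self) -> List[Record]:
        """Return the underlying data structure."""
        return self._records

    def record(self, index: int) -> Record:
        """Return a copy of the record at the given position."""
        return dict(self._records[index])

    def record_add(self, payload: Dict[str, Any]) -> None:
        """Append the record carried by an add payload."""
        self._records.append(dict(payload["record"]))

    def record_update(self, payload: Dict[str, Any]) -> None:
        """Merge the payload's fields into the record at record_index."""
        self._records[payload["record_index"]].update(payload["record"])

    def record_remove(self, payload: Dict[str, Any]) -> None:
        """Delete the record at record_index."""
        del self._records[payload["record_index"]]
```

In this sketch records are addressed by position, so positions shift after a removal. A real implementation would need a stable row identifier.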

Protocol

[LATER] - P9. a double protocol at the stream level optimizes the transfer of the dataframe. This model allows you to update a dataframe efficiently during front/back exchanges by transferring mutations. This optimization improves the responsiveness of the UI. This pattern is possible thanks to the creation of wf.EditableDataframe.

// The payload of the event is quite rich and the graphic component knows how to process it.
{
    "full_df": "", // Optional, only sent when the page is loaded. When it's empty, nothing happens on the frontend
    "mutations": [ // Optional, only sent after an event: the dataframe mutations are sent back to the frontend, which applies them to update the df
        {"op": "record_remove", "id": 12345},
        {"op": "record_new", "id": 12345, "record": { ... }},
        {"op": "record_update", "id": 12345, "record": { ... }}
    ]
}
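
As an illustration of how the receiving side could apply such a `mutations` list, here is a minimal sketch. It is written in Python for consistency with the rest of this ticket, although the real consumer would be the frontend component. `apply_mutations` is a hypothetical name, and the sketch assumes rows are kept in a dict keyed by the `id` used in the mutations.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def apply_mutations(rows: Dict[int, Record], mutations: List[Dict[str, Any]]) -> Dict[int, Record]:
    """Return a new rows mapping with the mutations applied in order."""
    # Work on a copy so the previous state stays untouched.
    result = {row_id: dict(record) for row_id, record in rows.items()}
    for mutation in mutations:
        op = mutation["op"]
        if op == "record_remove":
            # Removing an unknown id is treated as a no-op.
            result.pop(mutation["id"], None)
        elif op == "record_new":
            result[mutation["id"]] = dict(mutation["record"])
        elif op == "record_update":
            # Raises KeyError if the id is unknown.
            result[mutation["id"]].update(mutation["record"])
        else:
            raise ValueError(f"unsupported mutation: {op}")
    return result
```

Only the changed records travel over the wire, which is the point of transferring mutations instead of the full dataframe.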

[KO] - P10: encode the record position into __index_level_0__: pyarrow drops support for the dataframe index by default. We need a surrogate key to manage updates. There is a way to handle the dataframe index that encodes it into __index_level_0__. I propose to use this attribute as the row identifier.

I wrote code that tests this possibility with 4 data structures (pandas, polars, list of records, list of arrays):

ramedina86 commented 2 months ago

Hey, this is really cool stuff.

1) The wrapped Dataframe makes sense. I like wf.EditableDataframe. I think it's ok to add some complexity for dataframe editing. I don't see why we'd require wrapping it for display.

2) To store, we'd keep a copy of the original dataframe and manipulate it? Support Pandas, Polars, etc, serialize to Apache Arrow?

3)

"df": df # raw dataframe, user has to reimplement everything (straight forward and user friendly for simple binding)

What do you mean here? Do you want to allow two-way binding?