miranov25 / RootInteractive


Data volume monitoring in user interface - for #355 #358

Open miranov25 opened 5 months ago

miranov25 commented 5 months ago

In RootInteractive, the data are stored in a ColumnDataSource where columns are compressed in memory using both lossy and lossless compression techniques. This typically results in a reduction factor of about 10 between the original and compressed representation, depending on the configuration used.
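
To make the lossy plus lossless idea concrete, here is a minimal standard-library sketch. It is not RootInteractive's actual pipeline: the function names `compress_column` and `expand_column`, the 16-bit quantization, and the use of zlib are all assumptions chosen for illustration. The achieved ratio depends entirely on the data and the quantization step.

```python
import math
import struct
import zlib


def compress_column(values, step):
    """Quantize floats to multiples of `step` (lossy), then deflate (lossless)."""
    # Lossy stage: each value becomes a signed 16-bit integer code.
    # Assumption: |value / step| stays below 32768 for the data at hand.
    codes = [round(v / step) for v in values]
    raw = struct.pack(f"<{len(codes)}h", *codes)
    # Lossless stage: generic byte-level compression of the integer codes.
    return zlib.compress(raw)


def expand_column(blob, step):
    """Invert compress_column up to the quantization error (about step / 2)."""
    raw = zlib.decompress(blob)
    codes = struct.unpack(f"<{len(raw) // 2}h", raw)
    return [c * step for c in codes]


values = [math.sin(i / 500.0) for i in range(10_000)]
step = 0.01
blob = compress_column(values, step)
original_bytes = 8 * len(values)  # size if stored as float64
restored = expand_column(blob, step)
max_error = max(abs(a - b) for a, b in zip(values, restored))
print(f"{original_bytes} B -> {len(blob)} B, max error {max_error:.4f}")
```

The quantization step is the knob that trades precision for size: a coarser step gives fewer distinct codes, which the lossless stage can then pack more tightly.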

Only the columns that are actively used at a given moment (in ND groupby operations such as mean, median, entries, and fits, or in custom functions) should be expanded and cached. Currently, columns used in widgets are also expanded, but this will change once #355 is finished.

For the user interface, we can implement a strategy for column caching (NCache) and monitor current memory usage via console output. This approach allows users to control the balance between memory and CPU usage.
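
One possible shape for such a cache is sketched below, assuming that NCache means the maximum number of columns kept expanded at once and that eviction is least recently used. The class `ColumnCache`, its method names, and the zlib storage are hypothetical and only illustrate the memory versus CPU trade-off; the real implementation may differ.

```python
import zlib
from array import array
from collections import OrderedDict


class ColumnCache:
    """Keep at most n_cache expanded columns; evict the least recently used."""

    def __init__(self, compressed, n_cache):
        self.compressed = compressed   # column name -> zlib-compressed bytes
        self.n_cache = n_cache         # assumed meaning of NCache
        self.expanded = OrderedDict()  # column name -> array of float64

    def get(self, name):
        if name in self.expanded:
            self.expanded.move_to_end(name)  # mark as most recently used
            return self.expanded[name]
        # Cache miss: pay the CPU cost of decompression.
        column = array("d")
        column.frombytes(zlib.decompress(self.compressed[name]))
        self.expanded[name] = column
        if len(self.expanded) > self.n_cache:
            self.expanded.popitem(last=False)  # drop least recently used
        return column

    def memory_report(self):
        packed = sum(len(b) for b in self.compressed.values())
        unpacked = sum(c.itemsize * len(c) for c in self.expanded.values())
        return (f"compressed: {packed} B, expanded cache: {unpacked} B "
                f"({len(self.expanded)}/{self.n_cache} columns)")


columns = {name: array("d", [float(i % 10) for i in range(1000)])
           for name in ("a", "b", "c")}
cache = ColumnCache({k: zlib.compress(v.tobytes()) for k, v in columns.items()},
                    n_cache=2)
for name in ("a", "b", "a", "c"):
    cache.get(name)
print(cache.memory_report())
```

A larger `n_cache` means fewer decompressions and more resident memory; a smaller one means the opposite. Printing `memory_report()` after each interaction would give the console monitoring described above.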

miranov25 commented 5 months ago

Quick prototype generated by GPT:

Description

This Bokeh application demonstrates how to create a tab that displays memory usage based on the size of expanded arrays. The memory usage is calculated and displayed in a Div widget. The text in the Div can be updated by either clicking a button or double-clicking on the tab. This setup is useful for monitoring and managing memory usage in interactive data visualizations where columns of data might be compressed and expanded dynamically.

from bokeh.io import curdoc
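from bokeh.layouts import column
from bokeh.models import Button, ColumnDataSource, CustomJS, Div, Tabs
try:
    # Bokeh >= 3.0 renamed Panel to TabPanel; keep the old name for the code below
    from bokeh.models import TabPanel as Panel
except ImportError:
    from bokeh.models import Panel  # Bokeh < 3.0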
import random

# Create a ColumnDataSource with some example data
source = ColumnDataSource(data=dict(
    array1=[random.random() for _ in range(100)],
    array2=[random.random() for _ in range(100)]
))

# Function to calculate memory usage based on expanded arrays
def calculate_memory_usage():
    # Simulate memory calculation (replace with actual logic)
    memory_usage = sum(len(source.data[col]) for col in source.data)
    return f"Memory Usage: {memory_usage * 8 / 1024:.2f} KB"  # Assuming 8 bytes per entry

# Create a Div for displaying the memory usage
memory_usage_div = Div(text="Memory Usage: 0 KB")

# Update function
def update_memory_usage():
    memory_usage_div.text = calculate_memory_usage()

# Create a button to manually update the memory usage
update_button = Button(label="Update Memory Usage")
# The handler accepts the click event so it works on Bokeh versions that require it
update_button.on_click(lambda event: update_memory_usage())

# Create a callback for double-click events
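# The source must be passed in args, otherwise `source` is undefined in the browser
double_click_callback = CustomJS(args=dict(div=memory_usage_div, source=source), code="""
    const entries = Object.keys(source.data).reduce((acc, key) => acc + source.data[key].length, 0);
    div.text = "Memory Usage: " + (entries * 8 / 1024).toFixed(2) + " KB";
""")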

# Set up the layout with tabs
layout = column(memory_usage_div, update_button)
tab = Panel(child=layout, title="Memory Usage")

tabs = Tabs(tabs=[tab])

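# Attach the double-click event to the tabs.
# Note: 'doubletap' is documented as a plot event in Bokeh, so it may not fire on a
# Tabs widget; attaching the callback to a figure is the more reliable option.
tabs.js_on_event('doubletap', double_click_callback)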

# Add the tabs to the current document
curdoc().add_root(tabs)

# Initial update of memory usage
update_memory_usage()
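
The prototype assumes 8 bytes per entry, which only holds for float64 columns. A small sketch of a per-column estimate follows, using typed arrays from the standard library so that each column carries its own element size. The helper name `estimate_bytes` and the example columns are made up for illustration, and the result counts payload bytes only, not Python or Bokeh object overhead.

```python
from array import array


def estimate_bytes(columns):
    """Sum itemsize * length over typed columns (payload only)."""
    return sum(col.itemsize * len(col) for col in columns.values())


columns = {
    "array1": array("d", [0.0] * 100),  # float64: 8 bytes per entry
    "array2": array("f", [0.0] * 100),  # float32: 4 bytes per entry
}
print(f"Memory Usage: {estimate_bytes(columns) / 1024:.2f} KB")
```

With NumPy-backed columns the same idea would presumably use each array's `nbytes` attribute instead of `itemsize * len`.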