allegroai / clearml

ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution
https://clear.ml/docs
Apache License 2.0
5.42k stars 643 forks

Hierarchical/subfolder support for organizing figures in Plots tab #1253

Open javier-coronel-snkeos opened 2 months ago

javier-coronel-snkeos commented 2 months ago

Proposal Summary

Hi, the option to simply add Plotly figures to the "Plots" tab is great! However, I'm finding it difficult to organize the figures, especially when there is a large number of them.

For some of my experiments, I save a long list of figures to subfolders according to the type of data being plotted. When I report those figures to ClearML, they all appear at the same level in the "Plots" tab, which makes them very difficult to navigate or filter. At the moment, I use the path where each figure would be saved as its plot title. The "Toggle Graphs" feature helps to some extent, but it becomes hard to use with multiple folder levels or long figure names.

Motivation

It would be nice to have some kind of hierarchy, grouping, or subfolder structure in the "Plots" tab to improve organization and navigation, reduce clutter, and make filtering easier.

Here's a code snippet that produces an example similar to my case: a list of plots that would be saved to subfolders depending on the data they show.

import random
import plotly.graph_objs as go
from clearml import Task

def create_random_plots(task, num_plots):
    # Fetch the logger once instead of on every report call.
    logger = task.get_logger()
    for i in range(num_plots):
        x_data = [random.randint(1, 10) for _ in range(10)]
        y_data = [random.randint(1, 10) for _ in range(10)]

        # Each title is the path where the figure would be saved on disk.
        scatter_plot = go.Figure(go.Scatter(x=x_data, y=y_data, mode="markers"))
        logger.report_plotly(
            title=f"folder_1/scatter_plot/my_figure_very_long_name_{i+1}",
            series="Random Scatter Plots",
            figure=scatter_plot)

        bar_plot = go.Figure(go.Bar(x=x_data, y=y_data))
        logger.report_plotly(
            title=f"folder_1/bar_plot/my_figure_long_name_{i+1}",
            series="Random Bar Plots",
            figure=bar_plot)

        z_data = [[random.randint(1, 10) for _ in range(10)] for _ in range(10)]
        heatmap_plot = go.Figure(go.Heatmap(z=z_data))
        logger.report_plotly(
            title=f"folder_2/heatmap_plot/my_figure_long_name_{i+1}",
            series="Random Heatmap Plots",
            figure=heatmap_plot)

if __name__ == "__main__":
    task = Task.init(project_name="my_project", task_name="my_experiment")
    num_plots = 10
    create_random_plots(task, num_plots)

This code produces a series of plots that look like these:

[image]

Now imagine it's not 10 plots per folder but 20, and many more folders. In that case, I would have to click each plot of interest individually in "Toggle Graphs".

As potential solutions, it would be nice to introduce a folder-like structure (similar to dataset/experiments projects) that reflects the organization of plots in subfolders. Another option would be a bulk grouping mechanism based on the figure name.
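
To illustrate the name-based grouping idea, here is a minimal sketch in plain Python. It is not an existing ClearML feature or API: `group_titles_by_folder` and its `depth` parameter are hypothetical names, and the sketch assumes titles use `/` as the separator, as in my snippet above.

```python
from collections import defaultdict


def group_titles_by_folder(titles, depth=1):
    """Group slash-separated plot titles by their first `depth` folder components.

    Hypothetical helper for illustration only; not part of ClearML.
    """
    groups = defaultdict(list)
    for title in titles:
        parts = title.split("/")
        # Everything except the last component is treated as the folder path.
        # Titles without any "/" end up under the empty key "".
        folder = "/".join(parts[:-1][:depth])
        groups[folder].append(parts[-1])
    return dict(groups)


titles = [
    "folder_1/scatter_plot/my_figure_very_long_name_1",
    "folder_1/bar_plot/my_figure_long_name_1",
    "folder_2/heatmap_plot/my_figure_long_name_1",
]

by_top_level = group_titles_by_folder(titles, depth=1)
by_subfolder = group_titles_by_folder(titles, depth=2)
```

With `depth=1` the figures are grouped under `folder_1` and `folder_2`; with `depth=2` they are grouped under `folder_1/scatter_plot`, `folder_1/bar_plot`, and `folder_2/heatmap_plot`. A toggle per group, rather than per figure, would already remove most of the clicking.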

Thank you for considering these suggestions and I'm looking forward to your comments.