baler-collaboration / baler

Repository of Baler, a machine learning based data compression tool
https://github.com/baler-collaboration/baler.github.io
Apache License 2.0

Visualize cProfile logs and dumps #331

Open sanam2405 opened 10 months ago

sanam2405 commented 10 months ago

Description of the changes made

cProfile is a built-in Python module for lightweight, deterministic profiling of Python programs. It collects statistics about function calls, including the number of calls, the total time spent, and the time per call. With cProfile, we can pinpoint bottlenecks, identify areas for optimization, and gain insight into the runtime behavior of our Python applications. It is particularly useful for understanding the execution flow of functions within a program. In this pull request, cProfile gathers the profiling data, and the results are visualized with SnakeViz and gprof2dot.
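As a rough illustration of the statistics cProfile collects, the sketch below profiles a toy function and prints the ten most expensive entries by cumulative time. The names `slow_sum` and `profile_call` are hypothetical and are not part of baler; only the standard-library `cProfile` and `pstats` modules are used.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Toy workload (hypothetical): sum of squares in a Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func(*args, **kwargs)
    finally:
        # Always stop profiling, even if func raises.
        profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time and keep the top 10 entries.
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()


result, report = profile_call(slow_sum, 100_000)
print(report)
```

The report lists, per function, the call count, total time, and cumulative time, which are the figures SnakeViz and gprof2dot later render graphically.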

SnakeViz

SnakeViz is a browser-based interactive viewer for the output of Python's built-in cProfile and profile modules.

Profiling:

Visualization Plots:

Definition:

def visualize_profiling_data(profiling_path_prof, PORT=8998):
    """
    Visualize cProfile profiling data using SnakeViz.

    Args:
        profiling_path_prof (str): Path to the cProfile profiling data in `.prof` format.
        PORT (int): Port on which the SnakeViz server runs (default 8998); any open port can be used.

    Returns:
        None. The SnakeViz server starts on `PORT` and shows the icicle and sunburst plots.
    """

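The body of `visualize_profiling_data` is not shown in this description. One possible way to start SnakeViz from Python, assuming the `snakeviz` command-line tool is installed and on the PATH, is sketched below. The helpers `build_snakeviz_command` and `serve_profile` are hypothetical names and may differ from what the pull request actually does.

```python
import subprocess


def build_snakeviz_command(profiling_path_prof, port=8998):
    """Build the argument list for the SnakeViz command-line tool."""
    return ["snakeviz", str(profiling_path_prof), "--port", str(port)]


def serve_profile(profiling_path_prof, port=8998):
    """Start SnakeViz and block until it is stopped with Ctrl+C."""
    command = build_snakeviz_command(profiling_path_prof, port)
    try:
        subprocess.run(command, check=True)
    except KeyboardInterrupt:
        # Ctrl+C stops the server; treat it as a normal exit.
        print("SnakeViz server stopped")


# Show the command that would be run for a hypothetical profile file.
print(build_snakeviz_command("profile.prof"))
```

Calling `serve_profile("profile.prof")` would block until interrupted, which matches the keyboard-interrupt exit described in this pull request.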
Exiting the SnakeViz Server:

Sample Icicle generated by SnakeViz (Trained baler for 2000 epochs)

[Image: screenshot of the SnakeViz icicle plot]

yelp-gprof2dot

Directed Graphs (Di Graphs):

Call Graphs:

Usage:

Definition:

def generate_call_graphs(func, profiling_path_pstats, output_path):
    """
    Generate call graphs and directed graphs (digraphs) for a given Python function.

    Args:
        func (callable): The Python function for which call graphs will be generated.
        profiling_path_pstats (str): Path to the profiling data in pstats format.
        output_path (str): Directory where the generated graphs will be saved.

    Returns:
        None. The call graphs are created and saved in the `output_path` directory.

    Note:
        - This function requires Graphviz to be installed and configured separately.
        - Ensure that the `dot` executable from Graphviz is on the system's PATH.
    """

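The body of `generate_call_graphs` is likewise not shown. A minimal sketch of the usual pipeline follows: dump cProfile statistics to a pstats file, convert it to DOT with `gprof2dot`, then render it with Graphviz `dot`. It assumes the `gprof2dot` and `dot` executables are on the PATH. The helper names `profile_to_pstats` and `build_graph_commands`, and the output file name `call_graph`, are hypothetical.

```python
import cProfile
import os
import pstats
import tempfile


def profile_to_pstats(func, profiling_path_pstats):
    """Run func under cProfile and dump the stats to a pstats file."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func()
    finally:
        profiler.disable()
    profiler.dump_stats(profiling_path_pstats)


def build_graph_commands(profiling_path_pstats, output_path, name="call_graph"):
    """Return the gprof2dot and dot commands as argument lists."""
    dot_file = os.path.join(output_path, name + ".dot")
    svg_file = os.path.join(output_path, name + ".svg")
    # Step 1: convert the pstats file into a DOT description.
    to_dot = ["gprof2dot", "-f", "pstats", profiling_path_pstats, "-o", dot_file]
    # Step 2: render the DOT description to SVG with Graphviz.
    to_svg = ["dot", "-Tsvg", dot_file, "-o", svg_file]
    return to_dot, to_svg


with tempfile.TemporaryDirectory() as tmp:
    stats_file = os.path.join(tmp, "example.pstats")
    profile_to_pstats(lambda: sorted(range(1000), reverse=True), stats_file)
    loaded = pstats.Stats(stats_file)  # confirms the dump is readable
    to_dot, to_svg = build_graph_commands(stats_file, tmp)
    print(to_dot)
    print(to_svg)
```

Each command list can be passed to `subprocess.run(..., check=True)` once both tools are installed. The sketch only builds the commands so that it runs without them.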
Install Graphviz from here

Sample call graph generated by yelp-gprof2dot (Trained baler for 2000 epochs)

[Image: call graph for perform_training]

Replicate/Review the changes

  1. Install Graphviz from here
  2. Train baler with the --cProfile flag:

     poetry run baler --project CFD_workspace CFD_project_animation --mode train --cProfile

     The profiled outputs and plots are stored in: workspaces/CFD_workspace/CFD_project_animation/output/profiling/

  3. Use a keyboard interrupt (Ctrl+C) to exit