Spenhouet / tensorboard-aggregator

Aggregate multiple tensorboard runs to new summary or csv files
MIT License

tensorboard-aggregator

This project contains an easy-to-use method to aggregate multiple tensorboard runs. The max, min, mean, median, standard deviation and variance of the scalars from multiple runs are saved either as a new tensorboard summary or as a .csv table.
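
As a rough illustration of what aggregating scalars across runs means, the sketch below reduces the values that several runs logged at the same step to one number per statistic. The function name `aggregate_step` and the sample values are hypothetical, and the use of the population standard deviation and variance is an assumption; the project's own implementation may differ.

```python
import statistics


def aggregate_step(values):
    """Reduce the scalar values that several runs logged at one step."""
    return {
        "max": max(values),
        "min": min(values),
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        # Assumption: population (not sample) standard deviation and variance.
        "std": statistics.pstdev(values),
        "var": statistics.pvariance(values),
    }


# Hypothetical losses: one inner list per run, one entry per step.
runs = [
    [1.0, 0.5, 0.25],
    [3.0, 1.5, 0.75],
]

# zip(*runs) groups the values of all runs step by step.
aggregated = [aggregate_step(step_values) for step_values in zip(*runs)]
```

Each entry of `aggregated` then holds the statistics for one step, which is the shape of data that could be written out once per statistic.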

There is a similar tool which uses PyTorch to output the tensorboard summary: TensorBoard Reducer

Feature Overview

Setup and run configuration

  1. Download or clone repository files to your computer
  2. Go into repository folder
  3. Install requirements: pip3 install -r requirements.txt --upgrade
  4. You can now run the aggregation with: python aggregator.py

Parameters

Parameter              Default                     Description
--path (optional)      current working directory   Path to folder containing runs
--subpaths (optional)  ['.']                       List of all subpaths
--output (optional)    summary                     Possible values: summary, csv

Recommendation

Explanation

Example folder structure:

.
├── ...
├── test_param_xy      # Folder containing the runs for aggregation
│   ├── run_1          # Folder containing tensorboard files of one run
│   │   ├── test       # Subpath containing one tensorboard file
│   │   │   └── events.out.tfevents. ...
│   │   └── train   
│   │       └── events.out.tfevents. ...
│   ├── run_2
│   ├── ...
│   └── run_X
└── ...

The folder test_param_xy is the base path (cd test_param_xy). The tensorboard summaries for the aggregation are created by calling the aggregate script, which contains: python static/path/to/aggregator.py --subpaths ['test', 'train'] --output summary

The base folder contains multiple subfolders. Each subfolder holds the tensorboard files of one run, and all runs use the same model and configuration.
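
To make the layout concrete, the sketch below walks a base folder and collects the event files of every run for one subpath. The helper `find_event_files` is hypothetical, the file name `events.out.tfevents.0` is a placeholder for the elided real names, and skipping a folder called `aggregate` is an assumption about how earlier output might be kept apart from the runs.

```python
from pathlib import Path
import tempfile


def find_event_files(base, subpath):
    """Map each run folder under `base` to the event files in `subpath`."""
    found = {}
    for run_dir in sorted(p for p in Path(base).iterdir() if p.is_dir()):
        # Assumption: a folder named "aggregate" holds earlier output, not a run.
        if run_dir.name == "aggregate":
            continue
        files = sorted((run_dir / subpath).glob("events.out.tfevents.*"))
        if files:
            found[run_dir.name] = files
    return found


# Build a throwaway copy of the example layout to try the helper on.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp) / "test_param_xy"
    for run in ("run_1", "run_2"):
        for sub in ("test", "train"):
            folder = base / run / sub
            folder.mkdir(parents=True)
            (folder / "events.out.tfevents.0").touch()  # placeholder file
    runs = find_event_files(base, "train")
    run_names = list(runs)
```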

The resulting folder structure for summary looks like this:

.
├── ...
├── test_param_xy
│   ├── ...
│   └── aggregate
│       ├── test
│       │   ├── max
│       │   │   └── test_param_xy 
│       │   │       └── events.out.tfevents. ...
│       │   ├── min
│       │   ├── mean
│       │   ├── median
│       │   └── std    
│       └── train
└── ...

Multiple aggregate summaries can be put together in one directory. Because the original base folder name is kept as a subfolder of each aggregate function folder (max, min, ...), the summaries remain distinguishable within tensorboard.

.
├── ...
├── max
│   ├── test_param_x
│   ├── test_param_y
│   ├── test_param_z
│   └── test_param_v 
├── min
├── mean
├── median
└── std   
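
The naming scheme shown in the trees above can be expressed as a small path helper. This is only a sketch of the layout: `summary_dirs` and `OPERATIONS` are hypothetical names, and the list of operations is copied from the folder listing.

```python
from pathlib import Path

# Aggregate function folders as they appear in the example listing.
OPERATIONS = ("max", "min", "mean", "median", "std")


def summary_dirs(base, subpath):
    """Return the output folder of each aggregate function for one subpath."""
    base = Path(base)
    # The base folder name is repeated as the innermost folder, so summaries
    # from different base folders stay apart when merged into one directory.
    return {
        op: base / "aggregate" / subpath / op / base.name
        for op in OPERATIONS
    }


dirs = summary_dirs("test_param_xy", "test")
```

Here `dirs["max"]` points at `test_param_xy/aggregate/test/max/test_param_xy`, matching the summary tree shown earlier.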

The .csv table files for the aggregation are created by calling the aggregate script, which contains: python static/path/to/aggregator.py --subpaths ['test', 'train'] --output csv

The resulting folder structure for csv looks like this:

.
├── ...
├── test_param_xy
│   ├── ...
│   └── aggregate
│       ├── test
│       │   ├── max_test_param_xy.csv
│       │   ├── min_test_param_xy.csv
│       │   ├── mean_test_param_xy.csv
│       │   ├── median_test_param_xy.csv
│       │   └── std_test_param_xy.csv
│       └── train
└── ...

The .csv files are primarily intended for LaTeX plots.
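
A table of this kind could be written with the standard library as sketched below. The column layout (a `step` column followed by one column per scalar tag) is an assumption, since the actual file contents are not described here; `write_csv` and `csv_name` are hypothetical helpers, and only the `<function>_<base folder>.csv` naming is taken from the listing above.

```python
import csv
import io


def csv_name(operation, base_name):
    """Build a file name such as mean_test_param_xy.csv."""
    return f"{operation}_{base_name}.csv"


def write_csv(stream, steps, columns):
    """Write one row per step and one column per scalar tag."""
    tags = sorted(columns)
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(["step", *tags])
    for i, step in enumerate(steps):
        writer.writerow([step, *(columns[tag][i] for tag in tags)])


# Write to an in-memory buffer; a real script would open a file instead.
buffer = io.StringIO()
write_csv(
    buffer,
    steps=[0, 1],
    columns={"loss": [2.0, 1.0], "accuracy": [0.5, 0.75]},
)
text = buffer.getvalue()
name = csv_name("mean", "test_param_xy")
```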

Limitations

Contributions

If you run into problems (bugs, incompatibilities with newer library versions or with an OS) or have feature requests, please create a GitHub issue.

Dependencies are managed using pip-tools. Just add new dependencies to requirements.in and generate a new requirements.txt using pip-compile in the command line.

License

MIT License