BlueBrain / neurodamus

A BBP Simulation Control application for NEURON
https://neurodamus.readthedocs.io
Apache License 2.0

CoreNeuron Rebalancer #206

Closed ferdonline closed 2 weeks ago

ferdonline commented 1 month ago

Context

CoreNeuron runs, in particular those using multi-cycle, may end up distributing cells suboptimally.

We want to mitigate that by introducing a post-processing step which distributes the CoreNeuron input files evenly across machines and their ranks, based on file size.
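
For readers unfamiliar with size-based balancing, here is a minimal sketch of one common approach: a greedy "largest first" heuristic that always hands the next-largest file to the currently lightest bucket. This is an illustration under assumptions, not necessarily the algorithm used by rebalance-corenrn-data.py; the function names and the sample file names and sizes are hypothetical.

```python
import heapq


def distribute_by_size(sizes, n_buckets):
    """Assign items to buckets, largest first, always into the lightest bucket.

    `sizes` maps an item name (e.g. a data file) to its size in bytes.
    Hypothetical helper for illustration only.
    """
    if n_buckets < 1:
        raise ValueError("n_buckets must be >= 1")
    # Min-heap of (accumulated_size, bucket_index)
    heap = [(0, i) for i in range(n_buckets)]
    heapq.heapify(heap)
    buckets = [[] for _ in range(n_buckets)]
    # Sort by decreasing size; ties broken by name for determinism
    for name, size in sorted(sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        total, idx = heapq.heappop(heap)
        buckets[idx].append(name)
        heapq.heappush(heap, (total + size, idx))
    return buckets


def bucket_totals(sizes, buckets):
    """Accumulated size of each bucket."""
    return [sum(sizes[name] for name in bucket) for bucket in buckets]


# Made-up sizes, purely for demonstration
sample = {"a.dat": 8, "b.dat": 7, "c.dat": 6, "d.dat": 5, "e.dat": 4}
buckets = distribute_by_size(sample, 2)
totals = bucket_totals(sample, buckets)
```

The heap keeps the "lightest bucket" lookup cheap even for many buckets. The heuristic does not guarantee an optimal split, but it tends to keep the gap between buckets no larger than the largest single item.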

Scope

Added rebalance-corenrn-data.py and rebalance-stats.py to neurodamus/tools.

CLI:

neurodamus/tools $ ./rebalance-corenrn-data.py -h
usage: rebalance-corenrn-data.py [-h] [--ranks_per_machine RANKS_PER_MACHINE] [--max-entries MAX_ENTRIES] [--output-file OUTPUT_FILE] [-v] [--histogram]
                                 input_file n_machines

Redistribute CoreNeuron dat files, optimizing for a given number of Machines

positional arguments:
  input_file            Path to the CoreNeuron input file, typically files.dat
  n_machines            Number of target machines

options:
  -h, --help            show this help message and exit
  --ranks_per_machine RANKS_PER_MACHINE
                        Number of target ranks
  --max-entries MAX_ENTRIES
                        Consider only the first N entries of the input file
  --output-file OUTPUT_FILE
                        The rebalanced output file path
  -v, --verbose         Enable verbose output for debugging.
  --histogram           Additionally display the histogram of the ranks accumulated sizes
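
To give an idea of what a histogram of the ranks' accumulated sizes can convey, the sketch below bins per-rank totals and renders a plain-text bar chart. It is only an assumption of how such a display might look; the helper names, bin count, and sample totals are hypothetical and not taken from the tool.

```python
def bin_counts(totals, n_bins=5):
    """Count how many ranks fall into each of `n_bins` equal-width size bins."""
    lo, hi = min(totals), max(totals)
    span = (hi - lo) or 1  # avoid division by zero when all totals are equal
    counts = [0] * n_bins
    for total in totals:
        # The maximum value would land in bin n_bins, so clamp it to the last bin
        idx = min(int((total - lo) / span * n_bins), n_bins - 1)
        counts[idx] += 1
    return counts


def render_histogram(totals, n_bins=5, width=40):
    """Return one text line per bin: lower bin edge, a bar, and the rank count."""
    lo, hi = min(totals), max(totals)
    span = (hi - lo) or 1
    counts = bin_counts(totals, n_bins)
    peak = max(counts)
    lines = []
    for i, count in enumerate(counts):
        left_edge = lo + span * i / n_bins
        bar = "#" * round(width * count / peak)
        lines.append(f"{left_edge:12.1f} | {bar} {count}")
    return lines


# Made-up accumulated sizes for five ranks
rank_totals = [10, 12, 11, 30, 10]
counts = bin_counts(rank_totals, n_bins=2)
lines = render_histogram(rank_totals, n_bins=2, width=8)
print("\n".join(lines))
```

A well-balanced distribution would show most ranks concentrated in one or two adjacent bins; a long tail towards larger sizes points at ranks that are still overloaded.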

Testing

Using /gpfs/bbp.cscs.ch/data/scratch/proj134/home/king/BBPP134-917/o_MultiCycle_Support/output/2557873/coreneuron_input/files.dat for testing

Example

leite@bbpv2 ~/dev/neurodamus/neurodamus-py/tools (leite/corenrn-rebalancer %)$ python rebalance-corenrn-data.py /gpfs/bbp.cscs.ch/data/scratch/proj134/home/king/BBPP134-917/o_MultiCycle_Support/output/2557873/coreneuron_input/files.dat 10 --max-entries=1000
INFO :: Reading from input file: /gpfs/bbp.cscs.ch/data/scratch/proj134/home/king/BBPP134-917/o_MultiCycle_Support/output/2557873/coreneuron_input/files.dat'
WARNING :: files.dat (line 2): reduced number of entries: 1000'
INFO :: Distributing files into 10 buckets...'
INFO :: Processing 1000 entries'
         0 [  0%]
        20 [  2%]
...
       980 [ 98%]
INFO :: Writing out data from 10 buckets to file: rebalanced-files.dat'
INFO :: DONE'

Review

bbpbuildbot commented 1 month ago

Logfiles from GitLab pipeline #236461 (:white_check_mark:) have been uploaded here!

bbpbuildbot commented 1 month ago

Logfiles from GitLab pipeline #236462 (:white_check_mark:) have been uploaded here!

bbpbuildbot commented 1 month ago

Logfiles from GitLab pipeline #236465 (:white_check_mark:) have been uploaded here!

bbpbuildbot commented 1 month ago

Logfiles from GitLab pipeline #236467 (:white_check_mark:) have been uploaded here!

bbpbuildbot commented 1 month ago

Logfiles from GitLab pipeline #236469 (:white_check_mark:) have been uploaded here!

bbpbuildbot commented 1 month ago

Logfiles from GitLab pipeline #236471 (:white_check_mark:) have been uploaded here!

bbpbuildbot commented 1 month ago

Logfiles from GitLab pipeline #236741 (:white_check_mark:) have been uploaded here!

bbpbuildbot commented 1 month ago

Logfiles from GitLab pipeline #236748 (:white_check_mark:) have been uploaded here!

bbpbuildbot commented 1 month ago

Logfiles from GitLab pipeline #237055 (:white_check_mark:) have been uploaded here!