When I used the add_np_data loading code to pack a group of data for training, I found it very slow: concatenating 500 samples into one group array (before storing it in *.npz) took around 58 minutes.
Analysis
The slowdown is caused by calling concatenate inside a for loop. Each call allocates a new array and copies the whole group array built so far, so every step costs more as the group array grows. Over $n$ samples the total amount of copying is $O(n^2)$.
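To make the quadratic growth concrete, here is a rough count. It assumes every sample holds the same number of elements $m$, which is an assumption for illustration only; the real samples may differ in size. The $k$-th concatenate writes a new array holding $k$ samples, so the total number of elements copied is:

```latex
\sum_{k=1}^{n} k\,m \;=\; \frac{n(n+1)}{2}\,m \;=\; O(n^2 m),
\qquad
n = 500 \;\Rightarrow\; \frac{500 \cdot 501}{2} = 125250 \text{ sample-sized copies.}
```

Copying each sample only once would need just 500 sample-sized copies for the same group.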
Instead, whenever you run into this kind of "concatenate step by step" situation, a better strategy is to store all the samples in a list first and concatenate them once after the for loop. Each sample is then copied only once, so the total work is $O(n)$.
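The two patterns can be sketched side by side as follows. This is a minimal illustration, not the code from the original script: the function names, the random stand-in data, and the choice of concatenating along axis 0 are all assumptions.

```python
import numpy as np


def concat_in_loop(samples):
    """Slow pattern: grow the group array one sample at a time."""
    group = samples[0]
    for sample in samples[1:]:
        # Each call allocates a new array and copies everything already in `group`.
        group = np.concatenate((group, sample), axis=0)
    return group


def concat_once(samples):
    """Fast pattern: collect the samples in a list, then concatenate one time."""
    collected = []
    for sample in samples:
        collected.append(sample)  # appending to a list does not copy the array data
    return np.concatenate(collected, axis=0)


# Hypothetical stand-in data: 20 small "slices" of shape (1, 8, 8).
rng = np.random.default_rng(0)
samples = [rng.random((1, 8, 8)) for _ in range(20)]

slow = concat_in_loop(samples)
fast = concat_once(samples)
```

Both functions return the same array; only the amount of intermediate copying differs.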
Modifications
After my update, the process of loading a group of data is:
1. read the npz file of a sample;
2. append the sample to a list;
3. repeat steps 1-2 until all samples have been read;
4. concatenate the list of samples in a single call.
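The steps above might look roughly like this in NumPy. It is a sketch under stated assumptions rather than the updated function itself: the name load_group, the array key "data" inside each npz file, and concatenation along axis 0 are all hypothetical.

```python
import os
import tempfile

import numpy as np


def load_group(npz_paths, key="data"):
    """Load every sample file, then build the group array with one concatenate.

    `key` is a hypothetical name for the array stored in each npz file.
    """
    samples = []
    for path in npz_paths:
        with np.load(path) as npz_file:    # read the npz file of a sample
            samples.append(npz_file[key])  # append the sample to a list
    # Single concatenate after the loop (axis 0 is an assumption).
    return np.concatenate(samples, axis=0)


# Usage example with small temporary files standing in for real samples.
with tempfile.TemporaryDirectory() as tmp_dir:
    paths = []
    for i in range(5):
        path = os.path.join(tmp_dir, f"sample_{i}.npz")
        np.savez(path, data=np.full((1, 4, 4), i, dtype=np.float32))
        paths.append(path)
    group = load_group(paths)
```

Keeping the samples in a plain Python list means the loop only stores references, and the array data is copied once when the final concatenate runs.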
Experimental Results
On my server, the time for loading a group of 500 2D slices (before storing in *.npz) decreased from 58 minutes to less than 2 minutes.
Problem Description
In the add_np_data function of load_data_for_cine_ME.py, the data is loaded by concatenating each new sample onto the group array inside a for loop.
Others
Hope it helps.