Increased dataset memory usage after new version

VicenteAlex commented 2 years ago

I was using the CIFAR10-DVS dataset with previous versions of spikingjelly. After updating to the newest one, I find that the dataset folder has grown to more than three times its previous size on disk.

I believe a major reason could be that before, when using the dataset as 'frames', only the "events" and "frames" folders were saved, whereas now an extra "events_np" folder is saved with what I believe are the events in numpy format. Is there a reason for this new behaviour, or any way around it?

I have also realised that you previously supported np.savez_compressed, while now the npz is saved using np.savez (I do not know if that makes much difference).

Thank you for your time and your code, Alex

fangwei123456 commented 2 years ago

> now an extra "events_np" folder is saved with what I believe is the events in numpy format. Is there a reason for this new behaviour?

The old SJ only supported frames data. Now SJ supports both events and frames. Events are saved in events_np in numpy format, and the frame datasets are created from the events in events_np. So, when you create a new frame dataset (e.g., you set a new frame_number), events_np will be used.

Compared with the original binary events, the numpy format has the advantage of the parallelization that numpy provides.

SJ will check whether the events_np directory exists, but will not check the files inside it. So you can delete all files in events_np (but do not delete the directory itself), as long as:

you will not use the events dataset;
you have created a frames dataset;
you will not create a new frames dataset.
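The workaround above can be sketched as a small helper. This is only an illustration, not part of SJ: the function name clear_events_np and the dataset_root argument are hypothetical, and the sketch assumes the dataset root contains a directory literally named events_np. It removes files only and leaves events_np and any sub-directories in place, since only the existence of the directory is said to be checked.

```python
from pathlib import Path


def clear_events_np(dataset_root: str) -> int:
    """Delete every file under <dataset_root>/events_np, keeping the directories.

    Hypothetical helper: returns the number of files removed.
    """
    events_np = Path(dataset_root) / "events_np"
    if not events_np.is_dir():
        raise FileNotFoundError(f"{events_np} is not a directory")

    removed = 0
    # Materialize the listing first so we do not delete while iterating.
    for path in list(events_np.rglob("*")):
        if path.is_file():
            path.unlink()
            removed += 1
    return removed
```

Only run something like this once the three conditions above hold, because the events cannot be regenerated from the emptied directory without re-processing the original download.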

> I have also realised that before you supported the usage of np.savez_compressed while now the npz is saved using np.savez (I do not know if that makes much difference)

savez_compressed compresses the files, while savez does not. I forgot this. I will update the code to use savez_compressed. Thanks for your advice.
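For reference, the difference between the two calls can be checked with a small experiment. This is a rough sketch on synthetic data, not SJ's actual saving code: the array names t, x, y, p, the value ranges, and the helper name saved_sizes are assumptions made for illustration only. Both calls write arrays that np.load reads back identically; only the size on disk differs.

```python
import os
import tempfile

import numpy as np


def saved_sizes(n_events: int = 100_000, seed: int = 0):
    """Save the same synthetic events with savez and savez_compressed.

    Returns (uncompressed_bytes, compressed_bytes, round_trip_ok).
    """
    rng = np.random.default_rng(seed)
    # Synthetic stand-in for an event stream (assumed layout, not SJ's).
    events = {
        "t": np.sort(rng.integers(0, 1_000_000, size=n_events)).astype(np.int64),
        "x": rng.integers(0, 128, size=n_events).astype(np.int64),
        "y": rng.integers(0, 128, size=n_events).astype(np.int64),
        "p": rng.integers(0, 2, size=n_events).astype(np.int64),
    }

    with tempfile.TemporaryDirectory() as tmp:
        raw_path = os.path.join(tmp, "raw.npz")
        packed_path = os.path.join(tmp, "packed.npz")

        np.savez(raw_path, **events)                # stored without compression
        np.savez_compressed(packed_path, **events)  # stored with compression

        # Loading is the same for both files.
        with np.load(packed_path) as loaded:
            round_trip_ok = all(
                np.array_equal(loaded[name], arr) for name, arr in events.items()
            )

        return os.path.getsize(raw_path), os.path.getsize(packed_path), round_trip_ok
```

How much is saved depends on the data, so the ratio on synthetic arrays says little about a real dataset. Compressed files also take extra CPU time to write and read.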

fangwei123456 commented 2 years ago

For CIFAR10-DVS:

savez:
events_np: 30.5 GB
frames_number_4_split_by_number: 9.76 GB

savez_compressed:
events_np: 9.74 GB
frames_number_4_split_by_number: 589 MB
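As a quick check of what these numbers imply, the ratios can be computed directly. The sketch below assumes 1 GB = 1024 MB, which the figures above do not specify, so the frames ratio is approximate. It gives roughly a 3x reduction for events_np and roughly a 17x reduction for the frames directory.

```python
GB_IN_MB = 1024  # assumption: the reported sizes do not say which convention is used

# Sizes reported above for CIFAR10-DVS, converted to MB.
sizes_mb = {
    "savez": {
        "events_np": 30.5 * GB_IN_MB,
        "frames_number_4_split_by_number": 9.76 * GB_IN_MB,
    },
    "savez_compressed": {
        "events_np": 9.74 * GB_IN_MB,
        "frames_number_4_split_by_number": 589.0,
    },
}


def reduction_factor(name: str) -> float:
    """How many times smaller the directory is with savez_compressed."""
    return sizes_mb["savez"][name] / sizes_mb["savez_compressed"][name]


for name in sizes_mb["savez"]:
    print(f"{name}: {reduction_factor(name):.1f}x smaller")
```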

VicenteAlex commented 2 years ago

Thank you! That is a great improvement.

The workaround with the folders will also be useful.

fangwei123456 / spikingjelly

Increased dataset memory usage after new version #149