FrancescoSaverioZuppichini opened this issue 1 year ago
We will try to use ffcv.

After reviewing the ffcv performance guide, I understood that, since we would like to work directly on `pixel_data`, we will switch to `tensordict`.
If I want to use `tensordict`, I will have to do two things: `tensordict` doesn't support loading a memmap array from file, so I should make a PR.

[EDIT] I was wrong, there is a way in the docs.
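For context, the mechanism this builds on is a plain memory-mapped file. A minimal sketch with numpy (independent of `tensordict`'s own API; file and variable names here are my own): the `.npy` format is self-describing, so the array can be re-opened later without reading it all into RAM.

```python
import os
import tempfile

import numpy as np

tmp = tempfile.mkdtemp()
path = os.path.join(tmp, "pixel_data.npy")

# Save a small batch of uint8 pixel data in the self-describing .npy format.
data = np.arange(2 * 3 * 4 * 4, dtype=np.uint8).reshape(2, 3, 4, 4)
np.save(path, data)

# mmap_mode="r" maps the file instead of reading it: pages are only
# faulted in lazily, when the corresponding elements are actually touched.
mm = np.load(path, mmap_mode="r")
print(mm.dtype, mm.shape)  # uint8 (2, 3, 4, 4)
```

Because the header stores dtype and shape, nothing besides the path needs to be remembered to reload the array.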
After a lot of experiments, resulting in this benchmark, I have concluded that the best tradeoff between mental sanity and development speed is to first implement normal dataset loading using `Dataset`, but with the augmentations as `nn.Module`s, so I can send a `uint8` image to the GPU and drastically improve throughput during training.
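The idea above can be sketched as follows (a hypothetical module of my own, not code from this repo): the augmentation is an `nn.Module`, so the cheap `uint8` tensor is moved to the device first and the expensive float conversion happens there, sending 4x less data over the bus than `float32` would.

```python
import torch
from torch import nn


class ToFloatNormalize(nn.Module):
    """Hypothetical augmentation: convert a uint8 batch to float in [0, 1].

    Runs wherever the input tensor lives, so the uint8 tensor can be moved
    to the GPU first and the conversion to float32 happens on-device.
    """

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # .float() allocates the new tensor, .div_ scales it in place
        return x.float().div_(255.0)


aug = ToFloatNormalize()
batch = torch.randint(0, 256, (8, 3, 224, 224), dtype=torch.uint8)
# batch = batch.to("cuda")  # transfer the small uint8 tensor, then augment
out = aug(batch)
```

Any further augmentations (flips, normalization, etc.) can be chained the same way inside an `nn.Sequential`.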
The next step to try would be to read them in the dataset by just using the `idx`; still not ideal, since we will not be taking a slice, but it will probably avoid page faults.
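A sketch of that per-`idx` read (class and file layout are my own assumptions): each `__getitem__` indexes the memmap directly, so only the pages backing that one sample are touched, and the memmap is opened lazily so each `DataLoader` worker gets its own mapping.

```python
import numpy as np
from torch.utils.data import Dataset


class MemmapImageDataset(Dataset):
    """Hypothetical dataset reading single samples out of a raw memmap file."""

    def __init__(self, path, shape, dtype=np.uint8):
        # shape = (num_images, C, H, W); the file is opened lazily per worker
        self.path, self.shape, self.dtype = path, shape, dtype
        self._mm = None

    def __len__(self):
        return self.shape[0]

    def __getitem__(self, idx):
        if self._mm is None:  # open once per worker process
            self._mm = np.memmap(self.path, dtype=self.dtype, mode="r", shape=self.shape)
        # single-index read: copies one image out, touching only its pages
        return np.array(self._mm[idx])


# usage: write a tiny raw file, then read one sample back by idx
import os, tempfile

tmp_path = os.path.join(tempfile.mkdtemp(), "imgs.bin")
arr = np.random.randint(0, 256, size=(4, 3, 8, 8), dtype=np.uint8)
arr.tofile(tmp_path)
ds = MemmapImageDataset(tmp_path, shape=(4, 3, 8, 8))
sample = ds[2]
```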
I need to develop a fast data pipeline so I am not bottlenecked by loading data, as usually happens in almost all models. To achieve this I need to batch everything, store the resulting vector to a file, and memmap it.
There are different issues that need to be solved: the images are `uint8` and the bboxes are `int64`.
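One way to handle the mixed dtypes (a sketch; the file layout, names, and fixed-size padding are my own assumptions) is to keep one memmap file per field, each re-opened with its own dtype and shape:

```python
import os
import tempfile

import numpy as np

tmp = tempfile.mkdtemp()
n = 4  # number of samples, hypothetical

# Images: uint8, fixed size after batching/resizing.
images = np.random.randint(0, 256, size=(n, 3, 8, 8), dtype=np.uint8)
images.tofile(os.path.join(tmp, "images.bin"))

# Bboxes: int64, padded to a fixed max number of boxes per image
# so every sample has the same on-disk footprint.
bboxes = np.zeros((n, 10, 4), dtype=np.int64)
bboxes.tofile(os.path.join(tmp, "bboxes.bin"))

# Each field is re-opened as its own memmap with the right dtype/shape.
imgs_mm = np.memmap(os.path.join(tmp, "images.bin"), dtype=np.uint8, mode="r", shape=(n, 3, 8, 8))
bbox_mm = np.memmap(os.path.join(tmp, "bboxes.bin"), dtype=np.int64, mode="r", shape=(n, 10, 4))
```

The same `idx` then indexes both memmaps consistently, so a sample is always the pair `(imgs_mm[idx], bbox_mm[idx])`.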
I could preprocess the files in Rust, by creating the correct numpy format.