libffcv / ffcv

FFCV: Fast Forward Computer Vision (and other ML workloads!)
https://ffcv.io
Apache License 2.0

run slowly in real pytorch project #179

Closed zhoupan9109 closed 2 years ago

zhoupan9109 commented 2 years ago

Following the official documentation and sample code, I generated the .beton file successfully. The loading efficiency surprised me; however, when applying the loading code to a real PyTorch project, some unexpected results were observed in experiments.

The PyTorch code for the experiments is as follows:

Experiment 1:

    loader = Loader('./ds.beton')
    for data, label in loader:
        if torch.cuda.is_available():
            data = data.cuda()

Experiment 2:

    loader = Loader('./ds.beton')
    for data, label in loader:
        if torch.cuda.is_available():
            data = data.cuda()
        out = model(data)

The image pipeline: [SimpleRGBImageDecoder(), ToTensor(), ToTorchImage()]
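For context, a minimal sketch of how such a pipeline is typically passed to the FFCV Loader; the field names ('image', 'label'), the label decoder, batch size, and worker count here are assumptions and must match how the .beton file was written:

    from ffcv.loader import Loader, OrderOption
    from ffcv.fields.decoders import SimpleRGBImageDecoder, IntDecoder
    from ffcv.transforms import ToTensor, ToTorchImage

    # Assumed field names and hyperparameters; adjust to the actual .beton layout.
    loader = Loader(
        './ds.beton',
        batch_size=64,
        num_workers=10,
        order=OrderOption.RANDOM,
        pipelines={
            'image': [SimpleRGBImageDecoder(), ToTensor(), ToTorchImage()],
            'label': [IntDecoder(), ToTensor()],
        },
    )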

In experiment 2, the statement data = data.cuda() runs about 100x slower than the same statement at the same position in experiment 1.
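One thing worth checking here (an assumption, not something verified in this thread) is that CUDA operations execute asynchronously, so the cost of the previous forward pass can be charged to the next statement that has to wait on the GPU, which in experiment 2 is data = data.cuda(). A sketch of per-statement timing with explicit synchronization, placed inside the loop of experiment 2:

    import time
    import torch

    torch.cuda.synchronize()              # finish any previously queued GPU work
    t0 = time.perf_counter()
    data = data.cuda()                    # host-to-device copy
    torch.cuda.synchronize()              # wait for the copy itself to complete
    t_copy = time.perf_counter() - t0

    t0 = time.perf_counter()
    out = model(data)                     # forward pass (kernels launch asynchronously)
    torch.cuda.synchronize()              # wait for the forward pass to finish
    t_forward = time.perf_counter() - t0
    print(f'copy: {t_copy * 1e3:.2f} ms, forward: {t_forward * 1e3:.2f} ms')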

If you have any suggestions, please contact me without hesitation. Thank you very much.

GuillaumeLeclerc commented 2 years ago

Three things:

Feel free to reopen the issue if this doesn't resolve your problem

zhoupan9109 commented 2 years ago

Thanks for your reply!

Before the experiments above, the following code had been tested:

    loader = Loader('./ds.beton')
    for data, label in loader:
        out = model(data)

Image pipeline: [SimpleRGBImageDecoder(), ToTensor(), ToTorchImage(), ToDevice(0)]

Under this condition, the statement out = model(data) costs more time than with the official PyTorch DataLoader.
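For comparison, the pipelines in the FFCV examples usually move the batch to the GPU with ToDevice before ToTorchImage and normalize on the device; a sketch of such a pipeline (the normalization statistics and the float16 output type are illustrative assumptions, not values from this thread):

    import numpy as np
    import torch
    from ffcv.fields.decoders import SimpleRGBImageDecoder
    from ffcv.transforms import ToTensor, ToDevice, ToTorchImage, NormalizeImage

    # Illustrative ImageNet-style statistics; replace with the dataset's own values.
    MEAN = np.array([0.485, 0.456, 0.406]) * 255
    STD = np.array([0.229, 0.224, 0.225]) * 255

    image_pipeline = [
        SimpleRGBImageDecoder(),
        ToTensor(),
        ToDevice(torch.device('cuda:0'), non_blocking=True),  # move to GPU early
        ToTorchImage(),
        NormalizeImage(MEAN, STD, np.float16),                 # runs on the GPU here
    ]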

To contrast the efficiency of the two loaders, a complete PyTorch project was tested using the Loader from FFCV and the DataLoader from PyTorch respectively. Given FFCV's state-of-the-art efficiency, the FFCV Loader should intuitively reduce training time, but the experiments show the opposite result:

FFCV: 19 s / 10 iter; PyTorch: 15 s / 10 iter

The above tests were run under the following conditions:

hardware: GTX 1080
image resolution: 256x144
batch size: 64
num_workers: 10

To resolve these unexpected results, the above experiments were designed to locate the problem.
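One factor that could explain part of the gap (an assumption here, not confirmed in the thread) is that FFCV JIT-compiles its pipeline the first time the loader is iterated, so a 10-iteration measurement from a cold start includes that one-time cost. A sketch of a warm-up-then-measure loop usable with either loader:

    import time
    import torch

    def time_iterations(loader, model, n_iters=10, warmup_iters=10):
        # Warm-up iterations absorb one-time costs (FFCV pipeline compilation,
        # CUDA context setup, cuDNN autotuning) before measurement starts.
        # Assumes warmup_iters + n_iters fit within one pass over the loader.
        it = iter(loader)
        for _ in range(warmup_iters):
            data, label = next(it)
            out = model(data.cuda(non_blocking=True))
        torch.cuda.synchronize()

        t0 = time.perf_counter()
        for _ in range(n_iters):
            data, label = next(it)
            out = model(data.cuda(non_blocking=True))
        torch.cuda.synchronize()
        return time.perf_counter() - t0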