vbalnt / tfeat

TFeat descriptor models for BMVC 2016 paper "Learning local feature descriptors with triplets and shallow convolutional neural networks"
MIT License
148 stars, 45 forks

improve phototour getitem speed considerably #17

Closed by bitsun 5 years ago

bitsun commented 5 years ago

The original implementation is very slow. The problem lies in the expression `self.labels==pair_ids[0]`, which compares a randomly selected label id against every element of a huge array. That is a linear search, and it runs for every data item in each batch.

My machine: i7-6800k (24 cores), 32 GB RAM, Nvidia 1070 Ti with 8 GB of memory. With the old code, the training script spends most of its time fetching data: GPU usage is nearly 0% even with 8 parallel workers in DataLoader, and throughput is only 4.5 batches/sec. After the fix, with only 1 worker in DataLoader, it processes 27 batches/sec and GPU usage stays steady at 50%.
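The diff itself is not reproduced in this thread, so the following is only a minimal sketch of one common way to avoid the repeated scan, not necessarily what the PR does. It assumes the labels are a 1-D integer array; `build_label_index` and `label_index` are hypothetical names, and the toy data stands in for the real dataset.

```python
from collections import defaultdict

import numpy as np


def build_label_index(labels):
    """Map each label id to the positions where it occurs, in a single pass."""
    buckets = defaultdict(list)
    for position, label in enumerate(labels):
        buckets[int(label)].append(position)
    return {label: np.asarray(positions) for label, positions in buckets.items()}


# Toy stand-in for the dataset's label array (hypothetical data).
labels = np.array([0, 0, 1, 1, 1, 2, 2])

# Built once, e.g. when the dataset is constructed.
label_index = build_label_index(labels)

# Slow path described above: compares against the whole array on every lookup.
slow = np.where(labels == 1)[0]

# Fast path: a dictionary lookup whose cost does not grow with the array length.
fast = label_index[1]

# Both paths return the same positions.
assert np.array_equal(slow, fast)
```

The trade-off is a one-time pass over the labels and some extra memory for the index, in exchange for constant-time lookups on each item fetch.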

vbalnt commented 5 years ago

Thanks for the improvement! I have to admit no effort was made to optimise this code, so any such improvement is very welcome.