Fangyh09 closed this issue 5 years ago
Your image resolution is much larger than ImageNet's, which is what I benchmarked my code on. In your case the bottleneck is probably the CPU.
My loader opens the database read-only and turns off locking: https://github.com/Lyken17/Efficient-PyTorch/blob/master/tools/folder2lmdb.py#L27. I don't think the LMDB library itself is slowing things down.
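For reference, a read-only open with locking disabled looks roughly like the sketch below when using the `lmdb` Python package. The helper name `open_readonly` and the `READONLY_KWARGS` dict are made up for illustration; `readonly`, `lock` and `subdir` are existing `lmdb.open` parameters, but this is not a copy of the linked loader.

```python
import os

# Keyword arguments for a read-only environment with locking disabled.
READONLY_KWARGS = {"readonly": True, "lock": False}


def open_readonly(path):
    """Open an existing LMDB database for reading only (hypothetical helper)."""
    import lmdb  # third-party package; imported lazily so this module loads without it

    # subdir=False is required when the database is a single file rather than a directory.
    return lmdb.open(path, subdir=os.path.isdir(path), **READONLY_KWARGS)
```

With `lock=False` LMDB does no locking at all, which is only safe when nothing writes to the database while it is being read.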
Can you share your disk I/O and CPU utilization during loading?
The server is shared by multiple users, so I have to wait for a quiet time to measure. I wonder whether this post-processing step slows things down:
img = Image.open(buf).convert('RGB')
Usually not for ImageNet data, but at your resolution I think it might. Have you tried this repo? https://github.com/uploadcare/pillow-simd
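One way to check whether decoding is the slow part is to time the raw read and the decode separately. A minimal sketch, assuming the read and decode steps can be passed in as plain callables; `time_stages`, `read_fn` and `decode_fn` are hypothetical names, not part of the loader discussed here.

```python
import time


def time_stages(keys, read_fn, decode_fn):
    """Return (read_seconds, decode_seconds) accumulated over all keys."""
    read_s = 0.0
    decode_s = 0.0
    for key in keys:
        t0 = time.perf_counter()
        buf = read_fn(key)      # e.g. fetch the raw bytes from the database
        t1 = time.perf_counter()
        decode_fn(buf)          # e.g. decode the bytes into an RGB image
        t2 = time.perf_counter()
        read_s += t1 - t0
        decode_s += t2 - t1
    return read_s, decode_s
```

For this thread, `read_fn` would be the LMDB lookup and `decode_fn` something like `lambda b: Image.open(io.BytesIO(b)).convert('RGB')`. If the decode time dominates, the CPU is the bottleneck and a faster decoder is worth trying.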
Closing after a week with no activity. Feel free to reopen if necessary.
Loading is a bit slower after switching to LMDB. I have 30k images of size 1000x1000, stored on an SSD. Is there some locking in LMDB that slows things down?