Taeyoung96 / Yolo-to-COCO-format-converter

Yolo to COCO annotation format converter
MIT License

Great speed-up by not loading images with opencv #12

Closed pabsan-0 closed 2 years ago

pabsan-0 commented 2 years ago

Hello!

We noticed that the conversion runs slowly when there are many images, because each one is fully loaded into memory just to read its dimensions. We replaced the cv2.imread() call followed by .shape with the much faster imagesize.get method (see the imagesize package).
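To illustrate why a header-only lookup is so much cheaper than decoding the whole image, here is a minimal sketch that reads the dimensions of a PNG from its first 24 bytes. The helper name png_size is hypothetical, it handles PNG only, and it is not how imagesize is implemented internally; it only shows the general idea of reading dimensions without decoding pixels.

```python
import struct

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(path):
    """Return (width, height) of a PNG by reading only its first 24 bytes.

    Hypothetical helper for illustration; no pixel data is decoded.
    """
    with open(path, "rb") as f:
        header = f.read(24)
    # Bytes 8-11 hold the chunk length, bytes 12-15 the chunk type (IHDR).
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError(f"not a PNG file: {path}")
    # IHDR stores width, then height, as big-endian unsigned 32-bit integers.
    width, height = struct.unpack(">II", header[16:24])
    return width, height
```

With the imagesize package the equivalent call is width, height = imagesize.get(path). Note the ordering: imagesize.get returns (width, height), whereas the .shape of an array returned by cv2.imread() is (height, width, channels).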

Small reproducible example:

$ ## Imagesize (ours)
$ git checkout imagesize
$ time python3 main.py --path tutorial/train --output test
Start!
Processing 74 ...Finished!

real    0m0.092s
user    0m0.128s
sys     0m0.220s

$ ## cv2.imread
$ git checkout master
$ time python3 main.py --path tutorial/train --output test
Start!
Processing 0 ...
Finished!

real    0m1.395s
user    0m1.361s
sys     0m0.265s

Cheers.

Taeyoung96 commented 2 years ago

Hi, thanks for your PR!
I didn't know there was such a great package for reducing the run time.

If I understand correctly, I just need to install the imagesize package, and it is used only where the image size is read.

This repository also loads images with cv2.imread() when the --debug option is set.
Did you check whether your change affects that option?

I'd appreciate it if you could check if there's any problem.

Thanks,

pabsan-0 commented 2 years ago

Hello!

Yes, pip3 install imagesize is enough :)

Imagesize is just for cheaply checking image dimensions. The --debug option needs to load images, so we left it as is (I just tested it and it works as usual).

Note that --debug works only when --path points to a text file. I used this to repair the hardcoded paths externally:

$ cd tutorial 
$ ls train | sed "s:^:$PWD/train/:" > new_train.txt
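For anyone without sed at hand, the same list can be produced from Python. This is only a rough sketch of the one-liner above: write_path_list is a hypothetical helper name, and like ls it skips hidden entries and writes every remaining entry of the directory, not just image files.

```python
import os


def write_path_list(image_dir, list_path):
    """Write the absolute path of each visible entry in image_dir, one per line.

    Hypothetical helper mirroring `ls DIR | sed "s:^:$PWD/DIR/:"`.
    Returns the number of paths written.
    """
    image_dir = os.path.abspath(image_dir)
    # Like plain `ls`, ignore dotfiles and list the rest in sorted order.
    names = sorted(n for n in os.listdir(image_dir) if not n.startswith("."))
    with open(list_path, "w") as f:
        for name in names:
            f.write(os.path.join(image_dir, name) + "\n")
    return len(names)
```

Run from inside tutorial, write_path_list("train", "new_train.txt") should give a file that can then be passed to --path.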

Cheers :D

Taeyoung96 commented 2 years ago

Thanks for the unit test!
I'll merge this PR.
Thanks, :+1: