mdbloice / Augmentor

Image augmentation library in Python for machine learning.
https://augmentor.readthedocs.io/en/stable
MIT License

which is faster? [sample(10) for 10 times] or [sample(100)]? #162

Closed: Charles-Lu closed this issue 5 years ago

Charles-Lu commented 5 years ago

Hi @mdbloice ,

I'm curious about the relationship between speed and sample size. Say we need to generate 100 images: will calling sample(10) 10 times or calling sample(100) once be faster? If the latter is faster, I assume the best strategy is to set the sample size as large as memory can handle?

Thank you very much!

mdbloice commented 5 years ago

Hi @Charles-Lu - I am actually not sure which would be faster. However, I think it would be sample(100), as there is some overhead when using Python's multithreading. Calling sample() once rather than 10 times would therefore probably be faster. Memory shouldn't be an issue when calling sample() with large numbers, though, unless you are using Augmentor's generators, I suppose. M.
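Since the answer above is a guess rather than a measurement, a small timing harness could settle it. The following is only a sketch: `time_batches` and `fake_sample` are hypothetical names that are not part of Augmentor, and `fake_sample` merely imitates a call that starts a fresh thread pool each time, which is an assumption about where the per-call overhead comes from.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def time_batches(sample_fn, total, batch_size):
    """Call sample_fn(batch_size) until `total` samples have been requested.

    Returns a tuple (elapsed_seconds, number_of_calls).
    """
    if total % batch_size != 0:
        raise ValueError("total must be a multiple of batch_size")
    calls = total // batch_size
    start = time.perf_counter()
    for _ in range(calls):
        sample_fn(batch_size)
    return time.perf_counter() - start, calls


def fake_sample(n):
    """Hypothetical stand-in for a sample(n) call (NOT Augmentor code).

    Assumption: each call starts its own thread pool, so there is a
    fixed cost per call on top of the per-image work.
    """
    with ThreadPoolExecutor() as executor:
        # Trivial per-item work, so the pool start-up cost is visible.
        return list(executor.map(lambda i: i * i, range(n)))


if __name__ == "__main__":
    # Compare 10 calls of size 10 against 1 call of size 100.
    for batch_size in (10, 100):
        seconds, calls = time_batches(fake_sample, 100, batch_size)
        print(f"{calls} call(s) of sample({batch_size}): {seconds:.4f} s")
```

To time the real library instead, one could pass a pipeline's `sample` method as `sample_fn`, for example `time_batches(p.sample, 100, 10)` with `p` an `Augmentor.Pipeline` that has at least one operation added. Note that this writes the generated images to disk, so disk speed will be part of whatever is measured.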

Charles-Lu commented 5 years ago

I see. Thank you very much!