albumentations-team / albumentations

Fast and flexible image augmentation library. Paper about the library: https://www.mdpi.com/2078-2489/11/2/125
https://albumentations.ai
MIT License

Albumentations blocks CPU by spawning multiple threads #1246

Open Levin-Kobelke opened 1 year ago

Levin-Kobelke commented 1 year ago

🐛 Bug/Unexpected behavior

First of all, I am not sure if this is considered a bug, but since it caused trouble for me and I could not find any easy help, I want to leave a hint for others that run into the same problem.

Problem: When using Albumentations transforms in a data loader, each data-loading worker spawns additional threads. When training on a multi-GPU node, these threads clog the shared CPU and take up too much of it, even when a job is using only some of the GPUs.

To Reproduce

Steps to reproduce the behavior:

  1. Build a dataloader whose dataset applies Albumentations transforms: if self.transform is not None: res = self.transform(image=image)
  2. Use multiple workers to load the data onto the GPU
  3. Observe each worker spawning multiple threads to perform the augmentations (e.g. run htop and switch to Tree view)
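For anyone who would rather log the thread count than watch htop, here is a minimal sketch using only the standard library. The helper name native_thread_count is hypothetical, and the /proc/self/task lookup assumes Linux; elsewhere it falls back to threading.active_count(), which only sees Python-level threads and would miss the ones OpenCV creates.

```python
import os
import threading


def native_thread_count() -> int:
    """Return the number of OS threads in the current process (hypothetical helper)."""
    task_dir = "/proc/self/task"  # Linux only: one entry per native thread
    if os.path.isdir(task_dir):
        return len(os.listdir(task_dir))
    # Fallback for other platforms: counts Python threads only
    return threading.active_count()


# Call this from inside a worker (e.g. in the dataset's loading code) to log it
print(f"pid={os.getpid()} threads={native_thread_count()}")
```

Calling it once before and once after the first augmentation in a worker should show whether extra threads appeared.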

Expected behavior

A parameter to control this behavior, or at least a hint that it happens.

Environment

Solution

Call cv2.setNumThreads(0) to disable OpenCV's internal multithreading, so the workers stop spawning extra threads.

MeteorsHub commented 1 year ago

I observed the same problem. It consumed most of my CPU and made training very slow. The cv2 trick worked for me.

zakajd commented 1 year ago

+1 here. Same problem, fixed with cv2.setNumThreads(0). I didn't notice this behaviour when using Albumentations 6 months ago (~June-July). Did something change since then?

Dipet commented 1 year ago

This is a known issue with OpenCV: https://github.com/albumentations-team/albumentations#comments I cannot think of a general fix.