HuiZhang0812 / DiffusionAD


Training is so slow #34

Drm-hello closed this issue 8 months ago

Drm-hello commented 8 months ago

One epoch takes close to four hours.

I'm using an RTX 4080 to train on the VisA dataset, with the batch size set to 8. Any ideas on how to speed it up?

HuiZhang0812 commented 8 months ago

Generally, an epoch takes less than one minute. You can use cProfile (https://docs.python.org/3/library/profile.html) to see program bottlenecks. By the way, what is the worker number you set?
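As a rough illustration of the cProfile suggestion, here is a minimal sketch that profiles a few iterations and prints the most expensive calls by cumulative time. The functions `load_batch`, `train_step` and `run_iterations` are hypothetical stand-ins, not code from this repository; in practice you would wrap a handful of real training iterations instead.

```python
import cProfile
import io
import pstats
import time


def load_batch():
    # Hypothetical stand-in for fetching one batch from the data loader.
    time.sleep(0.01)


def train_step():
    # Hypothetical stand-in for the forward/backward pass.
    sum(i * i for i in range(10_000))


def run_iterations(n=5):
    # A few iterations are usually enough to see where the time goes.
    for _ in range(n):
        load_batch()
        train_step()


def profile_report(func, top=10):
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


if __name__ == "__main__":
    print(profile_report(run_iterations))
```

If data loading dominates the report, the loader settings are the first thing to look at. A whole script can also be profiled without code changes via `python -m cProfile -s cumulative <your_training_script>`.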

Drm-hello commented 8 months ago

For the training data loader, it is set to 0, with only one GPU. When I set it back to the original value of 8, I got this error:

Traceback (most recent call last):
  File "E:\diffusionAD\DiffusionAD-main\data\dataset_beta_thresh.py", line 522, in __getitem__
    augmented_image, anomaly_mask, has_anomaly = self.perlin_synthetic(image, thresh, anomaly_path, cv2_image...
  File "E:\diffusionAD\DiffusionAD-main\data\dataset_beta_thresh.py", line 473, in perlin_synthetic
    anomaly_image = aug(image=cv2_image)
  File "D:\ProgramData\Anaconda3\envs\DiffusionAD\lib\site-pac...", line 2008, in __call__
    return self.augment(*args, **kwargs)
  File "D:\ProgramData\Anaconda3\envs\DiffusionAD\lib\site-packages...", line 1979, in augment
    batch_aug = self.augment_batch_(batch, hooks=hooks)
  File "D:\ProgramData\Anaconda3\envs\DiffusionAD\lib\site-package...", line 641, in augment_batch_
    batch_inaug = self._augment_batch_(
  File "D:\ProgramData\Anaconda3\envs\DiffusionAD\lib\site...", line 3124, in _augment_batch_
    batch = self[index].augment_batch_(
  File "D:\ProgramData\Anaconda3\envs\DiffusionAD\lib...", line 641, in augment_batch_
    batch_inaug = self._augment_batch_(
  File "D:\ProgramData\Anaconda3\envs\DiffusionAD\lib\sit...augmenters\color.py", line 2472, in _augment_batch_
    image_hsv = self._transform_image_cv2(
  File "D:\ProgramData\Anaconda3\envs\DiffusionAD\lib\site...augmenters\color.py", line 2503, in _transform_image_cv2
    table_saturation = cls._LUT_CACHE[1]
TypeError: 'NoneType' object is not subscriptable

HuiZhang0812 commented 8 months ago

num_workers should be set to 4 or more. The error you encountered seems to come from "line 473, in perlin_synthetic, anomaly_image = aug(image=cv2_image)". Please check whether there is a bug there.
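One possible explanation for the traceback, which is not confirmed anywhere in this thread, is that the augmenter keeps its lookup tables in a class-level cache that is only filled when an augmenter is constructed. On Windows, data loader workers are started as fresh processes and receive a pickled copy of the dataset, so an augmenter built in the main process may arrive in a worker where that cache is still `None`. The sketch below reproduces this failure mode with a toy class and shows one workaround, building the augmenter lazily inside whichever process uses it. `HueSaturationLike` and `LazyAugmentingDataset` are hypothetical names for illustration, not imgaug or DiffusionAD code.

```python
class HueSaturationLike:
    """Toy stand-in for an augmenter with a class-level lookup-table cache."""

    _LUT_CACHE = None  # shared by all instances, filled on first construction

    def __init__(self):
        if HueSaturationLike._LUT_CACHE is None:
            HueSaturationLike._LUT_CACHE = self._build_tables()

    @staticmethod
    def _build_tables():
        hue = list(range(256))
        saturation = [min(255, v * 2) for v in range(256)]
        return hue, saturation

    def transform(self, value):
        # Fails with "'NoneType' object is not subscriptable" if the cache
        # was never filled in the current process.
        return type(self)._LUT_CACHE[1][value]


class LazyAugmentingDataset:
    """Builds the augmenter on first use, in the process that calls it."""

    def __init__(self):
        self._aug = None  # nothing process-specific is created up front

    def _augmenter(self):
        if self._aug is None:
            self._aug = HueSaturationLike()  # constructor fills the cache here
        return self._aug

    def __getitem__(self, index):
        return self._augmenter().transform(index % 256)


if __name__ == "__main__":
    eager = HueSaturationLike()
    # Simulate a fresh worker process: the instance exists, the cache does not.
    HueSaturationLike._LUT_CACHE = None
    try:
        eager.transform(3)
    except TypeError as exc:
        print("eager augmenter failed:", exc)

    print("lazy dataset item:", LazyAugmentingDataset()[3])
```

If this is indeed the cause, constructing the augmenters inside `__getitem__` (or on first access, as above) rather than in the dataset's `__init__` should let `num_workers` be raised above 0 on Windows.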