Different results with use_multiprocessing=True and use_multiprocessing=False

matterport / Mask_RCNN

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

Different results with use_multiprocessing=True and use_multiprocessing=False #2352

Open Nelschka opened 4 years ago

Nelschka commented 4 years ago

Hi, I am training on my own dataset and found that my loss values change depending on whether multiprocessing is used. My training results are much better if I set use_multiprocessing=False.

I am confused and don't understand why.

I am training on a company's platform solution, not Amazon/Google Cloud.

Could someone suggest why this is happening?

I would be very glad to hear from you :) Thanks in advance :)

AndreyStille commented 3 years ago

Hi. Have you found the answer?

maxw1489 commented 1 year ago

It took me ages to figure out why this happens. In the end, I solved the problem by refactoring the whole repo and the generator so that it runs in TF 2 without referring to tf.compat.v1. In a nutshell, the problem is caused by incorrect serialization of the generator when use_multiprocessing is turned on. However, there is more to it than that. You can try out my implementation and check the differences in the generator yourself if you like.
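
For anyone trying to picture the generator problem, here is a minimal sketch in plain Python. It is not the Mask_RCNN data generator and not the refactored implementation mentioned above. It is a toy simulation under stated assumptions: `copy.deepcopy` stands in for the copy of the data source that each worker process ends up with, and the class and function names (`StatefulSource`, `IndexedSource`, `run_epoch_stateful`, `run_epoch_indexed`) are hypothetical. It illustrates one known effect, which Keras itself warns about: a generator-style source that tracks its own position can hand the same batches to several workers, whereas an index-based source in the style of keras.utils.Sequence cannot. Whether this accounts for the whole difference reported in this thread is not established here.

```python
import copy
import random


class StatefulSource:
    """Generator-style source: the next batch depends on internal position."""

    def __init__(self, n_samples, batch_size, seed=0):
        self.order = list(range(n_samples))
        random.Random(seed).shuffle(self.order)
        self.batch_size = batch_size
        self.pos = 0

    def next_batch(self):
        batch = self.order[self.pos:self.pos + self.batch_size]
        self.pos += self.batch_size
        return batch


class IndexedSource:
    """Sequence-style source: batch i depends only on the index i."""

    def __init__(self, n_samples, batch_size, seed=0):
        self.order = list(range(n_samples))
        random.Random(seed).shuffle(self.order)
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches per epoch (ceiling division).
        return (len(self.order) + self.batch_size - 1) // self.batch_size

    def __getitem__(self, i):
        start = i * self.batch_size
        return self.order[start:start + self.batch_size]


def run_epoch_stateful(source, n_workers, steps):
    # Assumption: each worker holds its own copy of the source, so every
    # copy starts from the same position and nothing is shared afterwards.
    workers = [copy.deepcopy(source) for _ in range(n_workers)]
    seen = []
    for step in range(steps):
        seen.extend(workers[step % n_workers].next_batch())
    return seen


def run_epoch_indexed(source, n_workers):
    # The dispatcher hands out distinct batch indices, so copies are harmless.
    workers = [copy.deepcopy(source) for _ in range(n_workers)]
    seen = []
    for i in range(len(source)):
        seen.extend(workers[i % n_workers][i])
    return seen


stateful_seen = run_epoch_stateful(StatefulSource(8, 2), n_workers=2, steps=4)
indexed_seen = run_epoch_indexed(IndexedSource(8, 2), n_workers=2)

# Unique samples seen in one epoch of 4 steps with 2 workers.
print(len(set(stateful_seen)), len(set(indexed_seen)))  # prints "4 8"
```

With two simulated workers, the stateful source covers only half of the samples in an epoch and shows each of them twice, while the indexed source covers every sample exactly once. That kind of skew in what the network sees per epoch is one plausible way loss values could differ between use_multiprocessing=True and use_multiprocessing=False.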