rasbt / machine-learning-notes

Collection of useful machine learning codes and snippets (originally intended for my personal use)
BSD 3-Clause "New" or "Revised" License

Error: command buffer exited with error status. #20

Open Caiguu opened 2 years ago

Caiguu commented 2 years ago

First, thanks a lot for the easy-to-use benchmark! It worked and helped me a lot. However, I ran into a problem while using Metal:

Error: command buffer exited with error status.
The Metal Performance Shaders operations encoded on it may not have completed.
Error: (null) Internal Error (0000000e:Internal Error)
<AGXG13XFamilyCommandBuffer: 0x3c48bf500>
    label = <none>
    device = <AGXG13XDevice: 0x14517ba00>
        name = Apple M1 Pro
    commandQueue = <AGXG13XFamilyCommandQueue: 0x145181800>
        label = <none>
        device = <AGXG13XDevice: 0x14517ba00>
            name = Apple M1 Pro
    retainedReferences = 1

The error didn't crash the program; it kept running normally.


After a 46-minute wait, the training completed. I just wonder whether this is a bug on Apple's side or in Metal, and whether anyone has run into it before. I searched, and I believe this is not a single case. Thanks again!

rasbt commented 2 years ago

Huh, that's interesting. I haven't encountered that one. Actually, I see that it's been a month since you posted; I'm hoping that recent versions have fixed it.