deepfakes / faceswap

Deepfakes Software For All
https://www.faceswap.dev
GNU General Public License v3.0

Exception after a few hours of training with Villain #1382

eassa closed 3 months ago

eassa commented 3 months ago

I am using an ASUS TUF Gaming Radeon™ RX 7900 XT OC Edition 20GB GDDR6 on Windows 10. I followed these instructions: https://forum.faceswap.dev/app.php/faqpage?sid=47859b5acaac6c66cf49a85c70d6b1bd#f1r1 https://forum.faceswap.dev/viewtopic.php?t=20

DirectML installation

While training the Villain model with a batch size of 20, I get this error after a few hours of training. It has already happened multiple times:

2024-04-08 03:15:23.769735: F tensorflow/c/logging.cc:43] HRESULT failed with 0x887a0005: chunk->resource->Map(0, nullptr, &upload_heap_data)
2024-04-08 03:15:23.769986: F tensorflow/c/logging.cc:43] HRESULT failed with 0x887

eassa commented 3 months ago

I always get the exception noted in the issue, but this time, after 10 hours of training, I got this exception as well:

2024-04-08 13:32:47.839589: F tensorflow/c/logging.cc:43] HRESULT failed with 0x887a0001: dmldevice->GetDeviceRemovedReason()

torzdf commented 3 months ago

Unfortunately this issue is upstream from us and comes from a timeout within DirectML. See below for more information and potential mitigation steps:

https://forum.faceswap.dev/viewtopic.php?t=2567
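
For anyone triaging similar logs, the HRESULT values quoted in this issue can be mapped to their symbolic DXGI names. The following is only a minimal sketch and not part of faceswap or DirectML: the helper name `decode_hresults` and the lookup table are hypothetical, and the table covers just a few codes as defined in the Windows SDK header winerror.h. Under that mapping, 0x887a0005 is DXGI_ERROR_DEVICE_REMOVED, which fits the explanation that the GPU device was lost, and 0x887a0001 is DXGI_ERROR_INVALID_CALL. Truncated codes such as the trailing "0x887" in the first log are deliberately ignored rather than guessed.

```python
import re

# Small, illustrative subset of DXGI error codes (values from winerror.h).
# This table is an assumption of the sketch, not something faceswap ships.
DXGI_ERRORS = {
    0x887A0001: "DXGI_ERROR_INVALID_CALL",
    0x887A0005: "DXGI_ERROR_DEVICE_REMOVED",
    0x887A0006: "DXGI_ERROR_DEVICE_HUNG",
    0x887A0007: "DXGI_ERROR_DEVICE_RESET",
}

# Match only complete 32-bit codes (8 hex digits); truncated ones are skipped.
HRESULT_RE = re.compile(r"HRESULT failed with (0x[0-9a-fA-F]{8})\b")


def decode_hresults(log_text):
    """Return (hex code, symbolic name) for each complete HRESULT in the log.

    Hypothetical helper: codes missing from DXGI_ERRORS are reported as "unknown".
    """
    results = []
    for match in HRESULT_RE.finditer(log_text):
        code = int(match.group(1), 16)
        results.append((match.group(1), DXGI_ERRORS.get(code, "unknown")))
    return results


if __name__ == "__main__":
    sample = (
        "F tensorflow/c/logging.cc:43] HRESULT failed with 0x887a0005: "
        "chunk->resource->Map(0, nullptr, &upload_heap_data)"
    )
    for code, name in decode_hresults(sample):
        print(code, name)
```

Decoding the code does not fix the crash; it only helps confirm that a new log shows the same device-loss failure before trying the mitigation steps in the linked forum thread.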