Closed Taiilor closed 9 months ago
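Please check whether any files exist in stable-diffusion-webui/outputs/easyphoto-user-id-infos/<corresponding name>/user_weights/best_outputs. If the directory is empty, then I ran into the same problem. In a small-scale LoRA training test I observed this log line: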
INFO - __main__ - Running validation error, skip it.Error info: CUDA out of memory. Tried to allocate 96.00 MiB (GPU 0; 9.77 GiB total capacity; 5.47 GiB already allocated; 50.69 MiB free; 5.52 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF.
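The figures in the log line above can be pulled apart to see where the memory went. The sketch below is only an illustration: `summarize_oom` and its field names are hypothetical helpers, not part of EasyPhoto or PyTorch, and the regular expression assumes the exact message format quoted above.

```python
import re

# One "<number> <unit>" quantity as it appears in the quoted OOM message.
_SIZE = r"([\d.]+) (MiB|GiB)"

# Assumption: the message follows the format of the log line quoted above.
OOM_PATTERN = re.compile(
    rf"Tried to allocate {_SIZE} .*?{_SIZE} total capacity; "
    rf"{_SIZE} already allocated; {_SIZE} free; {_SIZE} reserved"
)


def to_mib(value: str, unit: str) -> float:
    """Convert a quantity from the message to MiB."""
    return float(value) * (1024.0 if unit == "GiB" else 1.0)


def summarize_oom(message: str) -> dict:
    """Hypothetical helper: extract the sizes from a CUDA OOM message, in MiB."""
    match = OOM_PATTERN.search(message)
    if match is None:
        raise ValueError("message does not look like the expected OOM format")
    groups = match.groups()
    names = ("requested", "total", "allocated", "free", "reserved")
    sizes = {
        name: to_mib(groups[2 * i], groups[2 * i + 1])
        for i, name in enumerate(names)
    }
    # Memory PyTorch has reserved but is not using (possible fragmentation slack).
    sizes["reserved_unused"] = sizes["reserved"] - sizes["allocated"]
    # Memory that is neither free nor reserved by this PyTorch process.
    sizes["outside_this_process"] = (
        sizes["total"] - sizes["reserved"] - sizes["free"]
    )
    return sizes


if __name__ == "__main__":
    log = (
        "CUDA out of memory. Tried to allocate 96.00 MiB (GPU 0; 9.77 GiB total "
        "capacity; 5.47 GiB already allocated; 50.69 MiB free; 5.52 GiB reserved "
        "in total by PyTorch)"
    )
    for name, mib in summarize_oom(log).items():
        print(f"{name}: {mib:.2f} MiB")
```

For this message the reserved-but-unused slack is only about 51 MiB, while roughly 4.2 GiB is neither free nor reserved by the failing process. That suggests the card was simply full rather than fragmented, so the `max_split_size_mb` hint in the message seems unlikely to help much here.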
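The cause is insufficient VRAM during LoRA training, which makes the validation step fail. One workaround is to generate images separately, after training has finished, for the safetensors file saved at each save step.
Another mitigation is to disable Validation during training.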
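The directory check suggested at the start of this comment can be scripted. This is a minimal sketch: `list_best_outputs` is a hypothetical helper, and the webui root and user ID passed to it are placeholders you would replace with your own values. Only the path layout comes from the comment above.

```python
from pathlib import Path


def list_best_outputs(webui_root: str, user_id: str) -> list:
    """Hypothetical helper: list the files in a user's best_outputs directory.

    Returns an empty list when the directory is missing or contains no files,
    which matches the failure case described in this thread.
    """
    best_dir = (
        Path(webui_root)
        / "outputs"
        / "easyphoto-user-id-infos"
        / user_id
        / "user_weights"
        / "best_outputs"
    )
    if not best_dir.is_dir():
        return []
    return sorted(p for p in best_dir.iterdir() if p.is_file())


if __name__ == "__main__":
    # Placeholder values: adjust both to your own installation and user ID.
    files = list_best_outputs("stable-diffusion-webui", "my_user_id")
    if files:
        for path in files:
            print(path)
    else:
        print("best_outputs is missing or empty; validation probably failed.")
```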
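This looks like an insufficient-VRAM problem. Please turn off the validation option and try again. If it still fails, you are welcome to try a cloud platform that offers a free trial plan, where you can run the whole workflow on a machine with more VRAM and a preconfigured environment.
It runs normally after turning it off.
Thanks, it works after turning off Validation.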
Is there an existing issue for this?
Is EasyPhoto the latest version?
What happened?
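Same problem: an error is reported at the end, when the LoRA is generated. I looked through earlier reports but could not find the cause.
Failed to obtain Lora after training, please check the training process.
I only excerpted part of the diagnostic output above. If it is not complete enough, I will add more.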
Steps to reproduce the problem
What should have happened?
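Same problem: an error is reported at the end, when the LoRA is generated. I looked through earlier reports but could not find the cause.
Failed to obtain Lora after training, please check the training process.
I only excerpted part of the diagnostic output above. If it is not complete enough, I will add more.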
Commit where the problem happens
webui: 秋叶's 1.7.0 package; EasyPhoto: newest
System Information: OS: Microsoft Windows NT 10.0.22621.0 CPU: 16 cores Memory Size: 32768 MB Page File Size: 4136 MB
NVIDIA Management Library: NVIDIA Driver Version: 551.23 NVIDIA Management Library Version: 12.551.23
CUDA Driver: Version: 12040 Devices: 00000000:01:00.0 0: NVIDIA GeForce RTX 3060 Ti [86] 8 GB
NvApi: Version: 55123 r551_06
DirectML Driver: Devices: 9353 0: NVIDIA GeForce RTX 3060 Ti 7 GB
Intel Level Zero Driver: Not Available
What browsers do you use to access the UI ?
Mozilla Firefox
Command Line Arguments
List of enabled extensions
No
Console logs
Additional information
No