Closed martindellavecchia closed 3 months ago
That error for tensorflow looks familiar. What CPU are you using, and do you know if it supports AVX?
I am using a former mining rig I had. It's a Pentium Gold G5420 CPU with 20 GB of RAM. I googled it, and it does not support AVX.
The weird thing is, A1111 works just fine.
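For anyone who wants to confirm AVX support on the machine itself rather than looking up the CPU model, here is a minimal Python sketch. It is an illustration under assumptions, not an official check: the helper names `flags_have_avx` and `cpu_supports_avx` are made up for this example, the Windows branch assumes the installed Windows build knows feature code 39 (`PF_AVX_INSTRUCTIONS_AVAILABLE`), and the Linux branch simply reads `/proc/cpuinfo`.

```python
import ctypes
import platform
from typing import Optional

# Feature code passed to the Windows API IsProcessorFeaturePresent.
# Assumption: the running Windows build recognises this code; builds that
# do not will report 0, which this sketch cannot tell apart from "no AVX".
PF_AVX_INSTRUCTIONS_AVAILABLE = 39


def flags_have_avx(cpuinfo_text: str) -> bool:
    """Return True if a /proc/cpuinfo-style dump lists the 'avx' flag."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Compare whole tokens so that 'avx2' alone does not count as 'avx'.
        if key.strip() == "flags" and "avx" in value.split():
            return True
    return False


def cpu_supports_avx() -> Optional[bool]:
    """Best-effort AVX check; None means 'could not determine'."""
    system = platform.system()
    if system == "Windows":
        kernel32 = ctypes.windll.kernel32
        return bool(kernel32.IsProcessorFeaturePresent(PF_AVX_INSTRUCTIONS_AVAILABLE))
    if system == "Linux":
        try:
            with open("/proc/cpuinfo", encoding="utf-8") as f:
                return flags_have_avx(f.read())
        except OSError:
            return None
    return None


if __name__ == "__main__":
    print("AVX supported:", cpu_supports_avx())
```

A `False` result on Windows is worth cross-checking against the CPU's spec sheet, given the assumption noted in the comment.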
Now I am running into a different issue:
`22:11:42-066006 INFO Command executed.
Traceback (most recent call last):
File "C:\Users\Martin\Desktop\kohya_ss\venv\lib\site-packages\transformers\utils\import_utils.py", line 1390, in _get_module
return importlib.import_module("." + module_name, self.name)
File "C:\Program Files\Python310\lib\importlib__init.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Users\Martin\Desktop\kohya_ss\venv\lib\site-packages\diffusers\utils\import_utils.py", line 710, in _get_module
return importlib.import_module("." + module_name, self.name)
File "C:\Program Files\Python310\lib\importlib__init.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Users\Martin\Desktop\kohya_ss\sd-scripts\train_network.py", line 21, in
I built a rig for multi-GPU LoRA training, but it never worked: kohya crashed before starting the training. I wiped my entire system, as I thought it was an OS or Python issue, and reinstalled just the OS with kohya and the latest NVIDIA drivers. I also disabled the rest of the GPUs to make sure they are not the cause of the issue, leaving just the 2080 Ti, but I keep receiving this error:
`19:27:16-339951 INFO Command executed.
[2024-06-09 19:27:21,852] torch.distributed.elastic.multiprocessing.redirects: [WARNING] NOTE: Redirects are currently not supported in Windows or MacOs.
[W socket.cpp:663] [c10d] The client socket has failed to connect to [DESKTOP-413GD2B]:12345 (system error: 10049 - The requested address is not valid in its context.).
Traceback (most recent call last):
  File "C:\Users\Martin\Desktop\kohya_ss\venv\lib\site-packages\transformers\utils\import_utils.py", line 1390, in _get_module
    return importlib.import_module("." + module_name, self.name)
  File "C:\Program Files\Python310\lib\importlib__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "", line 1050, in _gcd_import
  File "", line 1027, in _find_and_load
  File "", line 1006, in _find_and_load_unlocked
  File "", line 688, in _load_unlocked
  File "", line 883, in exec_module
  File "", line 241, in _call_with_frames_removed
  File "C:\Users\Martin\Desktop\kohya_ss\venv\lib\site-packages\transformers\models\clip\image_processing_clip.py", line 21, in
    from ...image_processing_utils import BaseImageProcessor, BatchFeature, get_size_dict
  File "C:\Users\Martin\Desktop\kohya_ss\venv\lib\site-packages\transformers\image_processing_utils.py", line 28, in
    from .image_transforms import center_crop, normalize, rescale
  File "C:\Users\Martin\Desktop\kohya_ss\venv\lib\site-packages\transformers\image_transforms.py", line 47, in
    import tensorflow as tf
  File "C:\Users\Martin\Desktop\kohya_ss\venv\lib\site-packages\tensorflow__init__.py", line 42, in
    from tensorflow.python import tf2 as _tf2
  File "C:\Users\Martin\Desktop\kohya_ss\venv\lib\site-packages\tensorflow\python\tf2.py", line 21, in
    from tensorflow.python.platform import _pywrap_tf2
ImportError: DLL load failed while importing _pywrap_tf2: A dynamic link library (DLL) initialization routine failed.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "C:\Users\Martin\Desktop\kohya_ss\venv\lib\site-packages\diffusers\utils\import_utils.py", line 710, in _get_module
    return importlib.import_module("." + module_name, self.name)
  File "C:\Program Files\Python310\lib\importlib__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "", line 1050, in _gcd_import
  File "", line 1027, in _find_and_load
  File "", line 1006, in _find_and_load_unlocked
  File "", line 688, in _load_unlocked
  File "", line 883, in exec_module
  File "", line 241, in _call_with_frames_removed
  File "C:\Users\Martin\Desktop\kohya_ss\venv\lib\site-packages\diffusers\pipelines\stable_diffusion\pipeline_stable_diffusion.py", line 20, in
    from transformers import CLIPImageProcessor, CLIPTextModel, CLIPTokenizer, CLIPVisionModelWithProjection
  File "", line 1075, in _handle_fromlist
  File "C:\Users\Martin\Desktop\kohya_ss\venv\lib\site-packages\transformers\utils\import_utils.py", line 1381, in getattr
    value = getattr(module, name)
  File "C:\Users\Martin\Desktop\kohya_ss\venv\lib\site-packages\transformers\utils\import_utils.py", line 1380, in getattr
    module = self._get_module(self._class_to_module[name])
  File "C:\Users\Martin\Desktop\kohya_ss\venv\lib\site-packages\transformers\utils\import_utils.py", line 1392, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.models.clip.image_processing_clip because of the following error (look up to see its traceback):
DLL load failed while importing _pywrap_tf2: A dynamic link library (DLL) initialization routine failed.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "C:\Users\Martin\Desktop\kohya_ss\sd-scripts\train_network.py", line 21, in
    from library import deepspeed_utils, model_util
  File "C:\Users\Martin\Desktop\kohya_ss\sd-scripts\library\model_util.py", line 13, in
    from diffusers import AutoencoderKL, DDIMScheduler, StableDiffusionPipeline # , UNet2DConditionModel
  File "", line 1075, in _handle_fromlist
  File "C:\Users\Martin\Desktop\kohya_ss\venv\lib\site-packages\diffusers\utils\import_utils.py", line 701, in getattr
    value = getattr(module, name)
  File "C:\Users\Martin\Desktop\kohya_ss\venv\lib\site-packages\diffusers\utils\import_utils.py", line 701, in getattr
    value = getattr(module, name)
  File "C:\Users\Martin\Desktop\kohya_ss\venv\lib\site-packages\diffusers\utils\import_utils.py", line 700, in getattr
    module = self._get_module(self._class_to_module[name])
  File "C:\Users\Martin\Desktop\kohya_ss\venv\lib\site-packages\diffusers\utils\import_utils.py", line 712, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion because of the following error (look up to see its traceback):
Failed to import transformers.models.clip.image_processing_clip because of the following error (look up to see its traceback):
DLL load failed while importing _pywrap_tf2: A dynamic link library (DLL) initialization routine failed.
[2024-06-09 19:27:43,961] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 6468) of binary: C:\Users\Martin\Desktop\kohya_ss\venv\Scripts\python.exe
Traceback (most recent call last):
  File "C:\Program Files\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\Martin\Desktop\kohya_ss\venv\Scripts\accelerate.EXE__main__.py", line 7, in
  File "C:\Users\Martin\Desktop\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 47, in main
    args.func(args)
  File "C:\Users\Martin\Desktop\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1008, in launch_command
    multi_gpu_launcher(args)
  File "C:\Users\Martin\Desktop\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 666, in multi_gpu_launcher
    distrib_run.run(args)
  File "C:\Users\Martin\Desktop\kohya_ss\venv\lib\site-packages\torch\distributed\run.py", line 797, in run
    elastic_launch(
  File "C:\Users\Martin\Desktop\kohya_ss\venv\lib\site-packages\torch\distributed\launcher\api.py", line 134, in call
    return launch_agent(self._config, self._entrypoint, list(args))
  File "C:\Users\Martin\Desktop\kohya_ss\venv\lib\site-packages\torch\distributed\launcher\api.py", line 264, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
C:/Users/Martin/Desktop/kohya_ss/sd-scripts/train_network.py FAILED
Failures:
[1]:
  time : 2024-06-09_19:27:43
  host : DESKTOP-413GD2B
  rank : 1 (local_rank: 1)
  exitcode : 1 (pid: 6220)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
Root Cause (first observed failure):
[0]:
  time : 2024-06-09_19:27:43
  host : DESKTOP-413GD2B
  rank : 0 (local_rank: 0)
  exitcode : 1 (pid: 6468)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
19:27:44-977402 INFO Training has ended.`
Any help will be greatly appreciated.
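Since the chain of errors above bottoms out in `import tensorflow`, one way to narrow this down is to test that import in isolation, using the same venv's Python. Below is a rough sketch, not a confirmed fix. `probe_import` is a hypothetical helper written for this example. It runs the import in a fresh interpreter so that a failing DLL cannot take the calling script down with it.

```python
import os
import subprocess
import sys
from typing import Dict, Optional


def probe_import(module: str, extra_env: Optional[Dict[str, str]] = None) -> bool:
    """Try `import module` in a fresh interpreter; True if it exits cleanly."""
    env = dict(os.environ)
    env.update(extra_env or {})  # optional per-probe environment overrides
    result = subprocess.run(
        [sys.executable, "-c", f"import {module}"],
        env=env,
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # The last stderr line is usually the exception message.
        lines = result.stderr.strip().splitlines()
        print(f"{module}: FAILED ({lines[-1] if lines else 'no stderr output'})")
    return result.returncode == 0
```

Calling `probe_import("tensorflow")` shows whether TensorFlow loads at all on this CPU. Calling `probe_import("transformers.models.clip.image_processing_clip", {"USE_TF": "0"})` shows whether the CLIP image processor imports once TensorFlow is skipped. This assumes the installed transformers version honors the `USE_TF` environment variable. If the second call succeeds while the first fails, setting `USE_TF=0` before launching, or removing tensorflow from the venv, may be worth trying.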