module 'torch' has no attribute 'compiler'

zachysaur commented 1 month ago

(venv) D:\talkingface\V-Express>python inference.py --reference_image_path "./test_samples/short_case/10/ref.jpg" --audio_path "./test_samples/short_case/10/aud.mp3" --kps_path "./test_samples/short_case/10/kps.pth" --output_path "./output/short_case/talk_10_no_retarget.mp4" --retarget_strategy "no_retarget" --num_inference_steps 25
D:\talkingface\V-Express\venv\lib\site-packages\torchaudio\backend\utils.py:74: UserWarning: No audio backend is available.
  warnings.warn("No audio backend is available.")
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
    PyTorch 2.3.0+cu121 with CUDA 1201 (you have 2.0.1+cpu)
    Python  3.10.11 (you have 3.10.9)
  Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
  Memory-efficient attention, SwiGLU, sparse and more won't be available.
  Set XFORMERS_MORE_DETAILS=1 for more details
Traceback (most recent call last):
  File "D:\talkingface\V-Express\venv\lib\site-packages\diffusers\utils\import_utils.py", line 710, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
  File "D:\talkingface\V-Express\python\lib\importlib\__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "D:\talkingface\V-Express\venv\lib\site-packages\diffusers\models\autoencoder_kl.py", line 22, in <module>
    from .attention_processor import (
  File "D:\talkingface\V-Express\venv\lib\site-packages\diffusers\models\attention_processor.py", line 31, in <module>
    import xformers
  File "D:\talkingface\V-Express\venv\lib\site-packages\xformers\__init__.py", line 12, in <module>
    from .checkpoint import (  # noqa: E402, F401
  File "D:\talkingface\V-Express\venv\lib\site-packages\xformers\checkpoint.py", line 464, in <module>
    class SelectiveCheckpointWrapper(ActivationWrapper):
  File "D:\talkingface\V-Express\venv\lib\site-packages\xformers\checkpoint.py", line 481, in SelectiveCheckpointWrapper
    @torch.compiler.disable
AttributeError: module 'torch' has no attribute 'compiler'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "D:\talkingface\V-Express\inference.py", line 10, in <module>
    from diffusers import AutoencoderKL, DDIMScheduler
  File "<frozen importlib._bootstrap>", line 1075, in _handle_fromlist
  File "D:\talkingface\V-Express\venv\lib\site-packages\diffusers\utils\import_utils.py", line 701, in __getattr__
    value = getattr(module, name)
  File "D:\talkingface\V-Express\venv\lib\site-packages\diffusers\utils\import_utils.py", line 700, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "D:\talkingface\V-Express\venv\lib\site-packages\diffusers\utils\import_utils.py", line 712, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import diffusers.models.autoencoder_kl because of the following error (look up to see its traceback):
module 'torch' has no attribute 'compiler'

tiankuan93 commented 1 month ago

It looks like an incompatibility between your xformers and torch versions is causing the problem. You can try running without xformers by commenting out a few lines of inference.py.
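
For anyone reading along, the xFormers warning in the log already names both versions, so the mismatch can be read straight out of it. Below is a minimal sketch of that check. The helper names `parse_version` and `versions_from_warning` are made up for illustration, and the cutoff used for `torch.compiler` (first available in PyTorch 2.1) is my understanding rather than something stated in this thread.

```python
import re

# Warning text copied from the log in this issue.
WARNING = (
    "xFormers was built for: PyTorch 2.3.0+cu121 with CUDA 1201 "
    "(you have 2.0.1+cpu)"
)


def parse_version(text):
    """Split a string like '2.3.0+cu121' into ((2, 3, 0), 'cu121')."""
    release, _, build = text.partition("+")
    return tuple(int(part) for part in release.split(".")), build


def versions_from_warning(warning):
    """Return (built_for, installed) version strings from the warning text."""
    match = re.search(r"PyTorch (\S+) with CUDA \d+ \(you have (\S+?)\)", warning)
    if match is None:
        raise ValueError("unrecognised warning format")
    return match.group(1), match.group(2)


built_for, installed = versions_from_warning(WARNING)
mismatch = parse_version(built_for) != parse_version(installed)
# Assumption: the torch.compiler namespace first shipped in PyTorch 2.1.
has_compiler = parse_version(installed)[0] >= (2, 1)
print(built_for, installed, mismatch, has_compiler)
```

With the values from this log, xformers was built against 2.3.0+cu121 while 2.0.1+cpu is installed, which would explain why `@torch.compiler.disable` fails at import time.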

zachysaur commented 1 month ago

(venv) D:\talkingface\V-Express>python inference.py --reference_image_path "./test_samples/short_case/tys/ref.jpg" --audio_path "./test_samples/short_case/tys/aud.mp3" --output_path "./output/short_case/talk_tys_fix_face.mp4" --retarget_strategy "fix_face" --num_inference_steps 25
D:\talkingface\V-Express\venv\lib\site-packages\torchaudio\backend\utils.py:74: UserWarning: No audio backend is available.
  warnings.warn("No audio backend is available.")
Traceback (most recent call last):
  File "D:\talkingface\V-Express\inference.py", line 275, in <module>
    main()
  File "D:\talkingface\V-Express\inference.py", line 141, in main
    vae = AutoencoderKL.from_pretrained(vae_path).to(dtype=dtype, device=device)
  File "D:\talkingface\V-Express\venv\lib\site-packages\torch\nn\modules\module.py", line 1145, in to
    return self._apply(convert)
  File "D:\talkingface\V-Express\venv\lib\site-packages\torch\nn\modules\module.py", line 797, in _apply
    module._apply(fn)
  File "D:\talkingface\V-Express\venv\lib\site-packages\torch\nn\modules\module.py", line 797, in _apply
    module._apply(fn)
  File "D:\talkingface\V-Express\venv\lib\site-packages\torch\nn\modules\module.py", line 820, in _apply
    param_applied = fn(param)
  File "D:\talkingface\V-Express\venv\lib\site-packages\torch\nn\modules\module.py", line 1143, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
  File "D:\talkingface\V-Express\venv\lib\site-packages\torch\cuda\__init__.py", line 239, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled

(venv) D:\talkingface\V-Express>

tiankuan93 commented 1 month ago

I see a line in your previous log that says (you have 2.0.1+cpu). I think you don't have the GPU version of torch installed. You can try setting the device to cpu, as follows.

python inference.py \
    --reference_image_path "./test_samples/short_case/AOC/ref.jpg" \
    --audio_path "./test_samples/short_case/AOC/chattts.mp3" \
    --output_path "./output/short_case/talk_AOC_chattts_fix_face.mp4" \
    --retarget_strategy "fix_face" \
    --num_inference_steps 25 \
    --device "cpu"
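
The same fallback can be decided in code before the model is moved to a device. This is only a sketch of the idea, not what inference.py does: `cuda_usable` and `pick_device` are hypothetical helpers, and the only torch call used is `torch.cuda.is_available()`.

```python
import importlib.util


def cuda_usable():
    """True only if torch is installed and reports a usable CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch

    return torch.cuda.is_available()


def pick_device(requested, cuda_ok):
    """Fall back to 'cpu' when a CUDA device is requested but unavailable."""
    if requested.startswith("cuda") and not cuda_ok:
        return "cpu"
    return requested


print(pick_device("cuda", cuda_usable()))
```

On a CPU-only build such as 2.0.1+cpu, `torch.cuda.is_available()` returns False, so this picks "cpu" instead of hitting the "Torch not compiled with CUDA enabled" assertion.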

zachysaur commented 1 month ago

I followed your instructions for installing the packages with pip. How can I install it for CUDA?

zachysaur commented 1 month ago

(venv) D:\talkingface\V-Express>python inference.py --reference_image_path "./test_samples/short_case/tys/ref.jpg" --audio_path "./test_samples/short_case/tys/aud.mp3" --output_path "./output/short_case/talk_tys_fix_face.mp4" --retarget_strategy "fix_face" --num_inference_steps 25 --device "cpu"
Some weights of the model checkpoint at ./model_ckpts/wav2vec2-base-960h/ were not used when initializing Wav2Vec2Model: ['lm_head.bias', 'lm_head.weight']
- This IS expected if you are initializing Wav2Vec2Model from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing Wav2Vec2Model from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of Wav2Vec2Model were not initialized from the model checkpoint at ./model_ckpts/wav2vec2-base-960h/ and are newly initialized: ['wav2vec2.masked_spec_embed']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
D:\talkingface\V-Express\venv\lib\site-packages\diffusers\configuration_utils.py:240: FutureWarning: It is deprecated to pass a pretrained model name or path to from_config.If you were trying to load a model, please use <class 'modules.unet_2d_condition.UNet2DConditionModel'>.load_config(...) followed by <class 'modules.unet_2d_condition.UNet2DConditionModel'>.from_config(...) instead. Otherwise, please make sure to pass a configuration dictionary instead. This functionality will be removed in v1.0.0.
  deprecate("config-passed-as-path", "1.0.0", deprecation_message, standard_warn=False)
Loaded weights of Reference Net from ./model_ckpts/v-express/reference_net.pth.
Loaded weights of Denoising U-Net from ./model_ckpts/v-express/denoising_unet.pth.
Loaded weights of Denoising U-Net Motion Module from ./model_ckpts/v-express/motion_module.pth.
Loaded weights of V-Kps Guider from ./model_ckpts/v-express/v_kps_guider.pth.
Loaded weights of Audio Projection from ./model_ckpts/v-express/audio_projection.pth.
Pipelines loaded with dtype=torch.float16 cannot run with cpu device. It is not recommended to move them to cpu as running them will fail. Please make sure to use an accelerator to run the pipeline in inference, due to the lack of support forfloat16 operations on this device in PyTorch. Please, remove the torch_dtype=torch.float16 argument, or use another device for inference.
Pipelines loaded with dtype=torch.float16 cannot run with cpu device. It is not recommended to move them to cpu as running them will fail. Please make sure to use an accelerator to run the pipeline in inference, due to the lack of support forfloat16 operations on this device in PyTorch. Please, remove the torch_dtype=torch.float16 argument, or use another device for inference.
Pipelines loaded with dtype=torch.float16 cannot run with cpu device. It is not recommended to move them to cpu as running them will fail. Please make sure to use an accelerator to run the pipeline in inference, due to the lack of support forfloat16 operations on this device in PyTorch. Please, remove the torch_dtype=torch.float16 argument, or use another device for inference.
Pipelines loaded with dtype=torch.float16 cannot run with cpu device. It is not recommended to move them to cpu as running them will fail. Please make sure to use an accelerator to run the pipeline in inference, due to the lack of support forfloat16 operations on this device in PyTorch. Please, remove the torch_dtype=torch.float16 argument, or use another device for inference.
Pipelines loaded with dtype=torch.float16 cannot run with cpu device. It is not recommended to move them to cpu as running them will fail. Please make sure to use an accelerator to run the pipeline in inference, due to the lack of support forfloat16 operations on this device in PyTorch. Please, remove the torch_dtype=torch.float16 argument, or use another device for inference.
Pipelines loaded with dtype=torch.float16 cannot run with cpu device. It is not recommended to move them to cpu as running them will fail. Please make sure to use an accelerator to run the pipeline in inference, due to the lack of support forfloat16 operations on this device in PyTorch. Please, remove the torch_dtype=torch.float16 argument, or use another device for inference.
EP Error 'providers' and 'provider_options' should be the same length if both are given.
 when using ['CPUExecutionProvider']
Falling back to ['CPUExecutionProvider'] and retrying.
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ./model_ckpts/insightface_models/models\buffalo_l\1k3d68.onnx landmark_3d_68 ['None', 3, 192, 192] 0.0 1.0
EP Error 'providers' and 'provider_options' should be the same length if both are given.
 when using ['CPUExecutionProvider']
Falling back to ['CPUExecutionProvider'] and retrying.
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ./model_ckpts/insightface_models/models\buffalo_l\2d106det.onnx landmark_2d_106 ['None', 3, 192, 192] 0.0 1.0
EP Error 'providers' and 'provider_options' should be the same length if both are given.
 when using ['CPUExecutionProvider']
Falling back to ['CPUExecutionProvider'] and retrying.
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ./model_ckpts/insightface_models/models\buffalo_l\det_10g.onnx detection [1, 3, '?', '?'] 127.5 128.0
EP Error 'providers' and 'provider_options' should be the same length if both are given.
 when using ['CPUExecutionProvider']
Falling back to ['CPUExecutionProvider'] and retrying.
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ./model_ckpts/insightface_models/models\buffalo_l\genderage.onnx genderage ['None', 3, 96, 96] 0.0 1.0
EP Error 'providers' and 'provider_options' should be the same length if both are given.
 when using ['CPUExecutionProvider']
Falling back to ['CPUExecutionProvider'] and retrying.
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ./model_ckpts/insightface_models/models\buffalo_l\w600k_r50.onnx recognition ['None', 3, 112, 112] 127.5 127.5
set det-size: (512, 512)
D:\talkingface\V-Express\venv\lib\site-packages\insightface\utils\transform.py:68: FutureWarning: rcond parameter will change to the default of machine precision times max(M, N) where M and N are the input matrix dimensions. To use the future default and silence this warning we advise to pass rcond=None, to keep using the old, explicitly pass rcond=-1.
  P = np.linalg.lstsq(X_homo, Y)[0].T # Affine matrix. 3 x 4
Length of audio is 64512 with the sampling rate of 16000.
The corresponding video length is 120.
D:\talkingface\V-Express\inference.py:216: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at ..\torch\csrc\utils\tensor_new.cpp:248.)
  kps_sequence = torch.tensor(torch.load(args.kps_path)) # [len, 3, 2]
The original length of kps sequence is 137.
The interpolated length of kps sequence is 120.
D:\talkingface\V-Express\pipelines\v_express_pipeline.py:516: FutureWarning: Accessing config attribute in_channels directly via 'UNet3DConditionModel' object attribute is deprecated. Please access 'in_channels' over 'UNet3DConditionModel's config object instead, e.g. 'unet.config.in_channels'.
  num_channels_latents = self.denoising_unet.in_channels
Traceback (most recent call last):
  File "D:\talkingface\V-Express\inference.py", line 275, in <module>
    main()
  File "D:\talkingface\V-Express\inference.py", line 247, in main
    video_latents = pipeline(
  File "D:\talkingface\V-Express\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\talkingface\V-Express\pipelines\v_express_pipeline.py", line 533, in __call__
    reference_image_latents = self.prepare_reference_latent(reference_image, height, width)
  File "D:\talkingface\V-Express\pipelines\v_express_pipeline.py", line 400, in prepare_reference_latent
    reference_image_latents = self.vae.encode(reference_image_tensor).latent_dist.mean
  File "D:\talkingface\V-Express\venv\lib\site-packages\diffusers\utils\accelerate_utils.py", line 46, in wrapper
    return method(self, *args, **kwargs)
  File "D:\talkingface\V-Express\venv\lib\site-packages\diffusers\models\autoencoder_kl.py", line 259, in encode
    h = self.encoder(x)
  File "D:\talkingface\V-Express\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\talkingface\V-Express\venv\lib\site-packages\diffusers\models\vae.py", line 141, in forward
    sample = self.conv_in(sample)
  File "D:\talkingface\V-Express\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\talkingface\V-Express\venv\lib\site-packages\torch\nn\modules\conv.py", line 463, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "D:\talkingface\V-Express\venv\lib\site-packages\torch\nn\modules\conv.py", line 459, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: "slow_conv2d_cpu" not implemented for 'Half'

(venv) D:\talkingface\V-Express>
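
A side note on the "Length of audio is 64512 ... The corresponding video length is 120" lines in the log above: the numbers are consistent with converting audio samples to frames at 30 frames per second and rounding down. The sketch below only reproduces that arithmetic. The frame rate of 30 is inferred from the numbers, not read from inference.py, and `video_frames` is a hypothetical helper.

```python
def video_frames(num_samples, sample_rate, fps=30):
    """Number of whole video frames covered by the audio.

    fps=30 is an assumption inferred from the log, not taken from the repo.
    """
    return num_samples * fps // sample_rate


# 64512 samples at 16000 Hz is about 4.03 s of audio.
print(video_frames(64512, 16000))
```

This also matches the following log line, where the 137-frame kps sequence is interpolated to the same length as the video.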

tiankuan93 commented 1 month ago

I'm not sure whether your machine has a GPU. If it does, you need to install the GPU version of torch.
On CPU, you also have to set the dtype to fp32 by adding --dtype fp32. Mind you, it will run very, very slowly on the CPU.
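
The `"slow_conv2d_cpu" not implemented for 'Half'` error is what a half-precision convolution on this CPU build looks like, which is why fp32 is needed there. A guard along these lines could catch the combination up front. This is a sketch only: `resolve_dtype` is a hypothetical helper and is not part of inference.py, and the "fp16"/"fp32" strings simply mirror the --dtype values used in this thread.

```python
def resolve_dtype(device, dtype):
    """Return a dtype name the device can run; CPU is forced to fp32."""
    if device == "cpu" and dtype == "fp16":
        # Half-precision kernels such as conv2d may be missing on CPU builds.
        return "fp32"
    return dtype


print(resolve_dtype("cpu", "fp16"))
print(resolve_dtype("cuda", "fp16"))
```

Passing --device "cpu" --dtype fp32 on the command line has the same effect without touching the code.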

zachysaur commented 1 month ago

At least it will work. Then I will know the only thing I need is pytorch, not torch.

zachysaur commented 1 month ago

What is the GPU version of torch?

zachysaur commented 1 month ago

(venv) D:\Talking Pictures\V-Express>python inference.py --reference_image_path "./test_samples/short_case/10/ref.jpg" --audio_path "./test_samples/short_case/10/aud.mp3" --output_path "./output/short_case/talk_AOC_chattts_fix_face.mp4" --retarget_strategy "fix_face" --num_inference_steps 25 --device "cpu" --dtype fp32
D:\Talking Pictures\V-Express\venv\lib\site-packages\torchaudio\backend\utils.py:74: UserWarning: No audio backend is available.
  warnings.warn("No audio backend is available.")
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
    PyTorch 2.0.1+cu118 with CUDA 1108 (you have 2.0.1+cpu)
    Python  3.10.11 (you have 3.10.11)
  Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
  Memory-efficient attention, SwiGLU, sparse and more won't be available.
  Set XFORMERS_MORE_DETAILS=1 for more details
Some weights of the model checkpoint at ./model_ckpts/wav2vec2-base-960h/ were not used when initializing Wav2Vec2Model: ['lm_head.bias', 'lm_head.weight']
- This IS expected if you are initializing Wav2Vec2Model from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing Wav2Vec2Model from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of Wav2Vec2Model were not initialized from the model checkpoint at ./model_ckpts/wav2vec2-base-960h/ and are newly initialized: ['wav2vec2.masked_spec_embed']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
D:\Talking Pictures\V-Express\venv\lib\site-packages\diffusers\configuration_utils.py:240: FutureWarning: It is deprecated to pass a pretrained model name or path to from_config.If you were trying to load a model, please use <class 'modules.unet_2d_condition.UNet2DConditionModel'>.load_config(...) followed by <class 'modules.unet_2d_condition.UNet2DConditionModel'>.from_config(...) instead. Otherwise, please make sure to pass a configuration dictionary instead. This functionality will be removed in v1.0.0.
  deprecate("config-passed-as-path", "1.0.0", deprecation_message, standard_warn=False)
Loaded weights of Reference Net from ./model_ckpts/v-express/reference_net.pth.
Loaded weights of Denoising U-Net from ./model_ckpts/v-express/denoising_unet.pth.
Loaded weights of Denoising U-Net Motion Module from ./model_ckpts/v-express/motion_module.pth.
Loaded weights of V-Kps Guider from ./model_ckpts/v-express/v_kps_guider.pth.
Loaded weights of Audio Projection from ./model_ckpts/v-express/audio_projection.pth.
EP Error 'providers' and 'provider_options' should be the same length if both are given.
 when using ['CPUExecutionProvider']
Falling back to ['CPUExecutionProvider'] and retrying.
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ./model_ckpts/insightface_models/models\buffalo_l\1k3d68.onnx landmark_3d_68 ['None', 3, 192, 192] 0.0 1.0
EP Error 'providers' and 'provider_options' should be the same length if both are given.
 when using ['CPUExecutionProvider']
Falling back to ['CPUExecutionProvider'] and retrying.
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ./model_ckpts/insightface_models/models\buffalo_l\2d106det.onnx landmark_2d_106 ['None', 3, 192, 192] 0.0 1.0
EP Error 'providers' and 'provider_options' should be the same length if both are given.
 when using ['CPUExecutionProvider']
Falling back to ['CPUExecutionProvider'] and retrying.
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ./model_ckpts/insightface_models/models\buffalo_l\det_10g.onnx detection [1, 3, '?', '?'] 127.5 128.0
EP Error 'providers' and 'provider_options' should be the same length if both are given.
 when using ['CPUExecutionProvider']
Falling back to ['CPUExecutionProvider'] and retrying.
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ./model_ckpts/insightface_models/models\buffalo_l\genderage.onnx genderage ['None', 3, 96, 96] 0.0 1.0
EP Error 'providers' and 'provider_options' should be the same length if both are given.
 when using ['CPUExecutionProvider']
Falling back to ['CPUExecutionProvider'] and retrying.
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: ./model_ckpts/insightface_models/models\buffalo_l\w600k_r50.onnx recognition ['None', 3, 112, 112] 127.5 127.5
set det-size: (512, 512)
D:\Talking Pictures\V-Express\venv\lib\site-packages\insightface\utils\transform.py:68: FutureWarning: rcond parameter will change to the default of machine precision times max(M, N) where M and N are the input matrix dimensions. To use the future default and silence this warning we advise to pass rcond=None, to keep using the old, explicitly pass rcond=-1.
  P = np.linalg.lstsq(X_homo, Y)[0].T # Affine matrix. 3 x 4
Length of audio is 70656 with the sampling rate of 16000.
The corresponding video length is 132.
D:\Talking Pictures\V-Express\inference.py:216: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at ..\torch\csrc\utils\tensor_new.cpp:248.)
  kps_sequence = torch.tensor(torch.load(args.kps_path)) # [len, 3, 2]
The original length of kps sequence is 137.
The interpolated length of kps sequence is 132.
D:\Talking Pictures\V-Express\pipelines\v_express_pipeline.py:516: FutureWarning: Accessing config attribute in_channels directly via 'UNet3DConditionModel' object attribute is deprecated. Please access 'in_channels' over 'UNet3DConditionModel's config object instead, e.g. 'unet.config.in_channels'.
  num_channels_latents = self.denoising_unet.in_channels
  0%|          | 0/25 [00:11<?, ?it/s]
Traceback (most recent call last):
  File "D:\Talking Pictures\V-Express\inference.py", line 275, in <module>
    main()
  File "D:\Talking Pictures\V-Express\inference.py", line 247, in main
    video_latents = pipeline(
  File "D:\Talking Pictures\V-Express\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\Talking Pictures\V-Express\pipelines\v_express_pipeline.py", line 605, in __call__
    pred = self.denoising_unet(
  File "D:\Talking Pictures\V-Express\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Talking Pictures\V-Express\modules\unet_3d.py", line 496, in forward
    sample, res_samples = downsample_block(
  File "D:\Talking Pictures\V-Express\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Talking Pictures\V-Express\modules\unet_3d_blocks.py", line 442, in forward
    hidden_states = attn(
  File "D:\Talking Pictures\V-Express\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Talking Pictures\V-Express\modules\transformer_3d.py", line 140, in forward
    hidden_states = block(
  File "D:\Talking Pictures\V-Express\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Talking Pictures\V-Express\modules\mutual_self_attention.py", line 175, in hacked_basic_transformer_inner_forward
    self.attn1(
  File "D:\Talking Pictures\V-Express\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\Talking Pictures\V-Express\venv\lib\site-packages\diffusers\models\attention_processor.py", line 522, in forward
    return self.processor(
  File "D:\Talking Pictures\V-Express\venv\lib\site-packages\diffusers\models\attention_processor.py", line 1231, in __call__
    hidden_states = F.scaled_dot_product_attention(
RuntimeError: [enforce fail at ..\c10\core\impl\alloc_cpu.cpp:72] data. DefaultCPUAllocator: not enough memory: you tried to allocate 12884901888 bytes.

(venv) D:\Talking Pictures\V-Express>
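
To put the last error in perspective, the failed allocation in the traceback above can be converted into more readable units. The snippet below is plain arithmetic on the number from the log; `to_gib` is a hypothetical helper.

```python
def to_gib(num_bytes):
    """Convert a byte count to gibibytes (1 GiB = 2**30 bytes)."""
    return num_bytes / 2**30


requested = 12884901888  # value reported by DefaultCPUAllocator in the log
print(f"{to_gib(requested):.1f} GiB")  # → 12.0 GiB
```

So a single tensor inside `F.scaled_dot_product_attention` asked for 12 GiB of RAM on top of what the models already use, which suggests the CPU path needs a machine with considerably more free memory than was available here.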

FurkanGozukara commented 1 month ago

I made a full tutorial, in case you still couldn't make it work.

It works with Python 3.10, CUDA 11.8, and venv.

https://github.com/tencent-ailab/V-Express/issues/27

tiankuan93 commented 1 month ago

> What is the GPU version of torch?

You can find information about it here.
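
As a quick way to tell the two builds apart, the version string itself usually carries a tag: the log in this thread shows 2.0.1+cpu for the installed build and 2.0.1+cu118 for the one xformers expected. A small sketch of reading that tag follows. `describe_build` is a hypothetical helper, and the assumption that every wheel carries such a tag does not always hold, so treat an untagged version as "unknown" rather than as proof either way.

```python
def describe_build(version):
    """Classify a torch version string by its local build tag."""
    _, _, tag = version.partition("+")
    if tag.startswith("cu"):
        return "CUDA build (" + tag + ")"
    if tag == "cpu":
        return "CPU-only build"
    return "unknown build (no recognised tag)"


print(describe_build("2.0.1+cpu"))
print(describe_build("2.0.1+cu118"))
```

On your own machine you would pass `torch.__version__` to it, and `torch.cuda.is_available()` gives the definitive answer on whether a GPU can actually be used.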

nitinmukesh commented 4 weeks ago

@zachysaur

Here is a completely free tutorial for Windows: https://youtu.be/OFt6a2rR8GY

Let me know if you are still facing issues.

tencent-ailab / V-Express

module 'torch' has no attribute 'compiler' #24