OpenTalker / video-retalking

[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
https://opentalker.github.io/video-retalking/
Apache License 2.0

Traceback error #4

Open Tobe2d opened 1 year ago

Tobe2d commented 1 year ago

I am running into an error when I try:

python inference.py \
  --face examples/face/1.mp4 \
  --audio examples/audio/1.wav \
  --outfile results/1_1.mp4

Traceback (most recent call last):
  File "inference.py", line 4, in <module>
    from PIL import Image
  File "C:\Users\username4\anaconda3\envs\video_retalking\lib\site-packages\PIL\Image.py", line 103, in <module>
    from . import _imaging as core
ImportError: DLL load failed while importing _imaging: The specified module could not be found.

kunncheng commented 1 year ago

That seems to be a problem with the Pillow version. I found a picture via Google that you can refer to. [image]

Tobe2d commented 1 year ago

Thanks @kunncheng, that helped a lot. I had to install Pillow-8.4.0. However, now I am running into another error:

OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.

kunncheng commented 1 year ago

Sorry, we haven't tested the code on Windows. You can try this: https://stackoverflow.com/questions/20554074/sklearn-omp-error-15-initializing-libiomp5md-dll-but-found-mk2iomp5md-dll-a Or you can try our Colab notebook: https://colab.research.google.com/github/vinthony/video-retalking/blob/main/quick_demo.ipynb
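
The OMP hint quoted above names an environment variable, KMP_DUPLICATE_LIB_OK. A minimal sketch of setting it from Python follows, assuming it runs before any library that loads OpenMP (for example torch or numpy) is imported. The helper name allow_duplicate_openmp is hypothetical, and the OMP message itself calls this workaround unsafe and unsupported, so it is a stopgap rather than a fix.

```python
import os


def allow_duplicate_openmp() -> str:
    """Set KMP_DUPLICATE_LIB_OK=TRUE unless a value is already present."""
    # setdefault leaves an explicit user setting untouched.
    return os.environ.setdefault("KMP_DUPLICATE_LIB_OK", "TRUE")


# Call this at the very top of the script, before importing torch, numpy, etc.
value = allow_duplicate_openmp()
print(f"KMP_DUPLICATE_LIB_OK={value}")
```

Setting the variable in the shell before launching Python has the same effect; the cleaner long-term route is an environment in which only one OpenMP runtime is installed.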

Tobe2d commented 1 year ago

Thanks for sharing this link. It helped me get one step further. However, now I am stuck at another step:

Traceback (most recent call last):
  File "inference.py", line 11, in <module>
    from third_part.face3d.extract_kp_videos import KeypointExtractor
  File "E:\video-retalking\third_part\face3d\extract_kp_videos.py", line 6, in <module>
    import face_alignment
ModuleNotFoundError: No module named 'face_alignment'

kunncheng commented 1 year ago

Please run the environment installation commands first.

Make sure that all packages in requirements.txt are installed successfully.
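
One way to confirm that the installation succeeded is to ask Python which distributions it can see. This is a sketch using only the standard library; the three package names are an illustrative subset (assumed to be the PyPI distribution names), so the real list should come from requirements.txt.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    report = {}
    for name in names:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


# Hypothetical subset of packages; read the real list from requirements.txt.
for name, version in installed_versions(["face-alignment", "Pillow", "numpy"]).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Run it inside the activated video_retalking environment; anything reported as NOT INSTALLED will fail at import time, as face_alignment did above.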

Tobe2d commented 1 year ago

Thanks a lot @kunncheng for your patience

After running pip install -r requirements.txt, I now get this:

usage: inference.py [-h] [--DNet_path DNET_PATH] [--LNet_path LNET_PATH] [--ENet_path ENET_PATH] [--face3d_net_path FACE3D_NET_PATH] --face FACE --audio AUDIO [--exp_img EXP_IMG] [--outfile OUTFILE] [--fps FPS] [--pads PADS [PADS ...]] [--face_det_batch_size FACE_DET_BATCH_SIZE]
                    [--LNet_batch_size LNET_BATCH_SIZE] [--img_size IMG_SIZE] [--crop CROP [CROP ...]] [--box BOX [BOX ...]] [--nosmooth] [--static] [--up_face UP_FACE] [--one_shot] [--without_rl1] [--tmp_dir TMP_DIR] [--re_preprocess]
inference.py: error: unrecognized arguments: \ \ \

kunncheng commented 1 year ago

Try this: python3 inference.py --face examples/face/1.mp4 --audio examples/audio/1.wav --outfile results/1_1.mp4

\ is only used for line breaks (line continuation in a Unix shell), so leave it out when the command is written on one line.
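
The "unrecognized arguments: \ \ \" message can be reproduced with a small argparse sketch. It assumes the Windows command prompt passed each backslash through as its own argument, and it rebuilds only three of the flags from the usage message; build_parser is a hypothetical stand-in, not the parser in inference.py.

```python
import argparse


def build_parser():
    # Only three of the flags from the usage message are reproduced here.
    parser = argparse.ArgumentParser(prog="inference.py")
    parser.add_argument("--face", required=True)
    parser.add_argument("--audio", required=True)
    parser.add_argument("--outfile")
    return parser


# What the program receives when each "\" arrives as a separate argument.
argv = ["\\", "--face", "examples/face/1.mp4", "\\",
        "--audio", "examples/audio/1.wav", "\\",
        "--outfile", "results/1_1.mp4"]

# parse_known_args collects what parse_args would reject with an error.
args, unknown = build_parser().parse_known_args(argv)
print(unknown)    # the stray backslashes
print(args.face)  # the real options are still parsed correctly
```

The options themselves parse fine; only the leftover backslashes trigger the error, which is why the single-line command works.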

Tobe2d commented 1 year ago

When I try python inference.py --face examples/face/1.mp4 --audio examples/audio/1.wav --outfile results/1_1.mp4

After that it starts downloading a few things, and when it starts processing it stops at landmark detection, as you can see:

(video_retalking) E:\video-retalking>python inference.py --face examples/face/1.mp4 --audio examples/audio/1.wav --outfile results/1_1.mp4
[Info] Using cuda for inference.
[Step 0] Number of frames available for inference: 135
[Step 1] Landmarks Extraction in Video.
landmark Det::   1%|█▊                                                                                                                                                                                                                                             | 1/135 [00:05<12:31,  5.61s/it]nvrtc: error: invalid value for --gpu-architecture (-arch)

nvrtc compilation failed:

#define NAN __int_as_float(0x7fffffff)
#define POS_INFINITY __int_as_float(0x7f800000)
#define NEG_INFINITY __int_as_float(0xff800000)

template<typename T>
__device__ T maximum(T a, T b) {
  return isnan(a) ? a : (a > b ? a : b);
}

template<typename T>
__device__ T minimum(T a, T b) {
  return isnan(a) ? a : (a < b ? a : b);
}

extern "C" __global__
void fused_cat_cat(float* tinput0_42, float* tinput0_46, float* tout3_67, float* tinput0_60, float* tinput0_52, float* tout3_71, float* aten_cat, float* aten_cat_1) {
{
  if (blockIdx.x<512 ? 1 : 0) {
    aten_cat_1[512 * blockIdx.x + threadIdx.x] = ((((512 * blockIdx.x + threadIdx.x) / 1024) % 256<192 ? 1 : 0) ? ((((512 * blockIdx.x + threadIdx.x) / 1024) % 256<128 ? 1 : 0) ? __ldg(tinput0_60 + (512 * blockIdx.x + threadIdx.x) % 262144) : __ldg(tinput0_52 + (512 * blockIdx.x + threadIdx.x) % 262144 - 131072)) : __ldg(tout3_71 + (512 * blockIdx.x + threadIdx.x) % 262144 - 196608));
  }
  aten_cat[512 * blockIdx.x + threadIdx.x] = ((((512 * blockIdx.x + threadIdx.x) / 4096) % 256<192 ? 1 : 0) ? ((((512 * blockIdx.x + threadIdx.x) / 4096) % 256<128 ? 1 : 0) ? __ldg(tinput0_42 + (512 * blockIdx.x + threadIdx.x) % 1048576) : __ldg(tinput0_46 + (512 * blockIdx.x + threadIdx.x) % 1048576 - 524288)) : __ldg(tout3_67 + (512 * blockIdx.x + threadIdx.x) % 1048576 - 786432));
}
}

landmark Det::   1%|█▊                                                                                                                                                                                                                                             | 1/135 [00:05<13:01,  5.83s/it]
Traceback (most recent call last):
  File "inference.py", line 342, in <module>
    main()
  File "inference.py", line 79, in main
    lm = kp_extractor.extract_keypoint(frames_pil, './temp/'+base_name+'_landmarks.txt')
  File "E:\video-retalking\third_part\face3d\extract_kp_videos.py", line 27, in extract_keypoint
    current_kp = self.extract_keypoint(image)
  File "E:\video-retalking\third_part\face3d\extract_kp_videos.py", line 55, in extract_keypoint
    return keypoints
UnboundLocalError: local variable 'keypoints' referenced before assignment
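
The final UnboundLocalError is probably a side effect of the nvrtc failure rather than a separate bug. The sketch below shows the general pattern, assuming the detector call sits in a try block whose handler prints the error and carries on. detect, extract_unsafe, and extract_safe are hypothetical names, not the code in extract_kp_videos.py.

```python
def detect(image):
    # Stand-in for the landmark detector; here it always fails like the nvrtc error.
    raise RuntimeError("nvrtc: error: invalid value for --gpu-architecture (-arch)")


def extract_unsafe(image):
    try:
        keypoints = detect(image)
    except RuntimeError as err:
        print(err)  # the error is printed, then execution continues
    return keypoints  # never assigned when detect() raised


def extract_safe(image):
    try:
        return detect(image)
    except RuntimeError as err:
        # Re-raise so the original failure is the one that gets reported.
        raise RuntimeError(f"landmark detection failed: {err}") from err


try:
    extract_unsafe(None)
except UnboundLocalError as err:
    print(type(err).__name__)
```

Under that assumption, fixing the GPU error above the traceback should make the UnboundLocalError disappear as well.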

Gltton commented 1 year ago

same issue

kunncheng commented 1 year ago

I found a similar issue. This seems to be the result of a mismatch between the CUDA version and the PyTorch version.

Maybe you should check the versions of your GPU, PyTorch, CUDA, and cuDNN. Do they match? The PyTorch installation command I provided works for CUDA 11.1.

You can try reinstalling PyTorch and the corresponding cudatoolkit by following the official PyTorch website.
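
The versions in question can be printed from inside the environment. This is a sketch that assumes PyTorch may or may not be importable; cuda_report is a hypothetical helper, while the torch attributes it reads (torch.version.cuda, torch.cuda.get_device_capability, torch.cuda.get_arch_list) are part of PyTorch's public API.

```python
def cuda_report():
    """Collect the version facts that need to agree; values stay None when unavailable."""
    report = {"torch": None, "built_for_cuda": None, "cuda_available": False,
              "gpu": None, "compute_capability": None, "arch_list": None}
    try:
        import torch
    except ImportError:
        return report  # PyTorch is not installed in this environment
    report["torch"] = torch.__version__
    report["built_for_cuda"] = torch.version.cuda  # None for CPU-only builds
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        report["gpu"] = torch.cuda.get_device_name(0)
        report["compute_capability"] = torch.cuda.get_device_capability(0)
        report["arch_list"] = torch.cuda.get_arch_list()
    return report


for key, value in cuda_report().items():
    print(f"{key}: {value}")
```

If the GPU's compute capability is newer than every entry in arch_list, the installed build likely predates the card, and a PyTorch build for a newer CUDA release is worth trying.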

AIhasArrived commented 10 months ago

Hello, I cannot comment on Stack Overflow because my account is new. Could you tell me what they meant by the following?

I solved this in a different manner to all the ones here. My script had a matplotlib.pyplot show at the beginning of the script. Moving it to AFTER training removed the error. I'm not an expert on these libraries, but I'm guessing the plot counts as a copy of OpenMP runtime, hence the " multiple copies" error.

1) I don't know where the matplotlib.pyplot line is. I searched inference.py, predict.py, and webui.py. What exactly do they mean, and where can I modify that? Thanks.

2) Second question: the link you gave just above points to older versions of PyTorch. Does this mean that if I reinstall PyTorch to solve the issue, I should absolutely not install the latest version (2.+)? Thanks @kunncheng

AIhasArrived commented 10 months ago

And after reinstalling it, do I need to re-run the requirements installation?

amortegui84 commented 10 months ago

Hey, did you find a solution? I am in the same situation.

amortegui84 commented 10 months ago

[Step 0] Number of frames available for inference: 135
[Step 1] Landmarks Extraction in Video.
landmark Det:: 1%|▍ | 1/135 [00:11<25:41, 11.50s/it]nvrtc: error: invalid value for --gpu-architecture (-arch)

nvrtc compilation failed:

#define NAN __int_as_float(0x7fffffff)
#define POS_INFINITY __int_as_float(0x7f800000)
#define NEG_INFINITY __int_as_float(0xff800000)

amortegui84 commented 10 months ago

OK, I found the solution. I have a 4090, and I reinstalled PyTorch with this:

pip install torch==2.0.0+cu118 torchvision==0.15.1+cu118 torchaudio==2.0.1 --index-url https://download.pytorch.org/whl/cu118

I also had to create a folder named checkpoint and put all the models there.

edmondyang81 commented 10 months ago

I'm having this problem. Is it the same? :O

Traceback (most recent call last):
  File "inference.py", line 345, in <module>
    main()
  File "inference.py", line 82, in main
    lm = kp_extractor.extract_keypoint(frames_pil, './temp/'+base_name+'_landmarks.txt')
  File "C:\Users\Edmond\Documents\Edmond\ai\video-retalking\third_part\face3d\extract_kp_videos.py", line 36, in extract_keypoint
    np.savetxt(os.path.splitext(name)[0]+'.txt', keypoints.reshape(-1))
  File "<__array_function__ internals>", line 180, in savetxt
  File "C:\Users\Edmond\anaconda3\envs\video_retalking\lib\site-packages\numpy\lib\npyio.py", line 1503, in savetxt
    open(fname, 'wt').close()
FileNotFoundError: [Errno 2] No such file or directory: './temp/examples\narve.mp4_landmarks.txt'
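
The failing path joins './temp/' with 'examples\narve.mp4', which suggests the video's file name was derived by splitting on "/" only, so a Windows-style path kept its folder part. That is an assumption, not something confirmed in this thread; a missing temp folder would produce the same error. The sketch below uses ntpath to mimic Windows path rules on any OS, and landmarks_path is a hypothetical helper.

```python
import ntpath
import os
import tempfile


def landmarks_path(face_arg, tmp_dir):
    """Build '<tmp_dir>/<video file name>_landmarks.txt', creating tmp_dir if needed."""
    # ntpath splits on both "\" and "/", so it follows Windows path rules on any OS.
    base_name = ntpath.basename(face_arg)
    os.makedirs(tmp_dir, exist_ok=True)
    return os.path.join(tmp_dir, base_name + "_landmarks.txt")


face_arg = "examples\\narve.mp4"
print(face_arg.split("/")[-1])    # the folder part survives: no "/" to split on
print(ntpath.basename(face_arg))  # only the file name remains

with tempfile.TemporaryDirectory() as root:
    path = landmarks_path(face_arg, os.path.join(root, "temp"))
    print(os.path.basename(path))
```

Without touching the code, passing the video path with forward slashes (examples/narve.mp4) and making sure the temp folder exists may be enough to get past this step.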