Inference freezes in beginning

alter-sachin commented 3 years ago

I am running on a K80 Tesla GPU and get the following error.

python3.6 batch_inference.py --checkpoint_path logs/lipgan_residual_mel.h5 --face 1.mp4 --fps 24 --audio dhanjal.wav --results_dir results/

/home/sachinpec354/LipGAN/l/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/sachinpec354/LipGAN/l/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/sachinpec354/LipGAN/l/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:521: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/sachinpec354/LipGAN/l/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:522: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/sachinpec354/LipGAN/l/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:523: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/sachinpec354/LipGAN/l/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:528: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
Using TensorFlow backend.
Number of frames available for inference: 164
(80, 246)
Length of mel chunks: 66
0%| | 0/2 [00:00<?, ?it/sLLVM ERROR: out of memory | 0/3 [00:00<?, ?it/s]
Aborted (core dumped)

I reduced the batch size to 64.

prajwalkr commented 3 years ago

The code crashed with "LLVM ERROR: out of memory". We have not faced this issue ourselves. Maybe a Google search can point out why this error occurs?

alter-sachin commented 3 years ago

I increased my CPU RAM and the LLVM error goes away, but the inference does not begin. It is still stuck at 0%.

prajwalkr commented 3 years ago

Can you add print statements in different places and figure out where it is freezing?
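One way to do this without guessing where to put every print is sketched below. It is only a sketch: `watch_for_hang`, `stop_watching`, and `checkpoint` are hypothetical helper names, not part of LipGAN. It relies on the standard-library `faulthandler` module, which can periodically dump the stack of every thread, so a frozen process reports the line it is sitting on.

```python
import faulthandler
import sys
import time

# Reference point for the elapsed times printed by checkpoint().
_START = time.perf_counter()


def watch_for_hang(timeout_s=30.0, stream=sys.stderr):
    """Dump all thread stacks to `stream` every `timeout_s` seconds until cancelled."""
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=stream)


def stop_watching():
    """Cancel the periodic stack dump started by watch_for_hang()."""
    faulthandler.cancel_dump_traceback_later()


def checkpoint(label):
    """Print a flushed progress marker so it is not lost in a buffer if the process dies."""
    elapsed = time.perf_counter() - _START
    message = f"[{elapsed:8.2f}s] reached: {label}"
    print(message, flush=True)
    return message
```

Assuming these helpers are pasted near the top of the script, calling `watch_for_hang(30)` once at startup and `checkpoint("before create model")` at points of interest should show both the last marker reached and, every 30 seconds, the exact line where execution is stuck.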

alter-sachin commented 3 years ago

It does not go ahead from this line

https://github.com/Rudrabha/LipGAN/blob/65e528449a80c048a36214d356c39e215e5a41f7/batch_inference.py#L187

prajwalkr commented 3 years ago

It is stuck inside the data generator then. Can you see which line inside that?

alter-sachin commented 3 years ago

gen = datagen(full_frames.copy(), mel_chunks)
print("after datagen")

for i, (img_batch, mel_batch, frames, coords) in enumerate(tqdm(gen, 
                                        total=int(np.ceil(float(len(mel_chunks))/batch_size)))):
    if i == 0:
        print("before create model")
        model = create_model(args, mel_step_size)
        print ("Model Created")

        model.load_weights(args.checkpoint_path)
        print ("Model loaded")

        frame_h, frame_w = full_frames[0].shape[:-1]
        print("before video writer")
        out = cv2.VideoWriter(path.join(args.results_dir, 'result.avi'), 
                                cv2.VideoWriter_fourcc(*'DIVX'), fps, (frame_w, frame_h))

The "after datagen" print statement is output, so I think it is stuck at the beginning of the loop? I might be wrong...
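A detail that may explain this observation: calling a Python generator function runs none of its body. The body only starts on the first `next()`, which is what the `for` loop header triggers. So "after datagen" printing and a hang inside the data generator are compatible. The toy example below illustrates the ordering; `toy_datagen` is a hypothetical stand-in, not the real `datagen`.

```python
events = []


def toy_datagen(items):
    # Nothing in this body runs until the first next() call on the generator.
    events.append("generator body started")
    for item in items:
        yield item


gen = toy_datagen([1, 2, 3])      # only creates the generator object
events.append("after datagen")    # recorded before any generator code has run

first = next(gen)                 # the generator body starts executing here
events.append("first batch received")

print(events)
```

By the same logic, calling `next(gen)` once by hand before the loop, with a print on either side, would be one way to confirm whether the first batch is ever produced.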

alter-sachin commented 3 years ago

I receive an out of memory error again. Is a K80 GPU enough for running this? It seems to have enough GPU RAM (25 GB, I think).
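Since adding CPU RAM made the error go away earlier, host memory rather than GPU memory may be the limit here. A rough way to check is sketched below, assuming a Unix-like system (the standard-library `resource` module is not available on Windows). `peak_rss_mib` and `frames_mib` are hypothetical helper names, and the frame resolution passed to `frames_mib` would have to come from the actual video.

```python
import resource
import sys


def peak_rss_mib():
    """Peak resident memory of this process in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    if sys.platform == "darwin":
        return peak / (1024 * 1024)
    return peak / 1024


def frames_mib(num_frames, height, width, channels=3):
    """Size in MiB of `num_frames` uint8 frames held in memory at once."""
    return num_frames * height * width * channels / (1024 * 1024)


if __name__ == "__main__":
    # 164 frames matches the log above; 720x1280 is an assumed resolution.
    print(f"decoded frames: {frames_mib(164, 720, 1280):.1f} MiB")
    print(f"peak RSS so far: {peak_rss_mib():.1f} MiB")
```

Printing `peak_rss_mib()` after the frames are read and again after face detection would indicate which stage drives host memory up.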

alter-sachin commented 3 years ago

Side note: the single image example runs quite well for me.

prajwalkr commented 3 years ago

> I receive an out of memory error again. Is a K80 GPU enough for running this? It seems to have enough GPU RAM (25 GB, I think).

Can you paste the error message?

alter-sachin commented 3 years ago

LLVM ERROR: out of memory

prajwalkr commented 3 years ago

Hello, sorry that we are unable to help you, as we do not have sufficient information and have not faced such an error ourselves. You can try out Wav2Lip, which is a significant improvement over this work.

Rudrabha / LipGAN

Inference freezes in beginning #36