andrewowens / multisensory

Code for the paper: Audio-Visual Scene Analysis with Self-Supervised Multisensory Features
http://andrewowens.com/multisensory/
Apache License 2.0
220 stars 60 forks

RuntimeError: Command failed! ffmpeg -i "/tmp/ao_wmjz0ezg.wav" -r 29.970000 -loglevel warning -safe 0 -f concat -i "/tmp/ao_i2pwi0b8.txt" -pix_fmt yuv420p -vcodec h264 -strict -2 -y -acodec aac "results/fg_translator.mp4" #24

Closed: ghost closed this issue 5 years ago

ghost commented 5 years ago

python sep_video.py data/translator.mp4 --model unet_pit --duration_mult 4 --out results/
Start time: 0.0
GPU = 0
Spectrogram samples: 512 (8.298, 8.288)
100.0% complete, total time: 0:00:00. 0:00:00 per iteration. (11:29 AM Tue)
Struct(alg=sourcesep, augment_audio=False, augment_ims=True, augment_rms=False, base_lr=0.0001, batch_size=24, bn_last=True, bn_scale=True, both_videos_in_batch=False, cam=False, check_iters=1000, crop_im_dim=224, dilate=False, do_shift=False, dset_seed=None, fix_frame=False, fps=29.97, frame_length_ms=64, frame_sample_delta=74.5, frame_step_ms=16, freq_len=1024, full_im_dim=256, full_model=False, full_samples_len=105000, gamma=0.1, gan_weight=0.0, grad_clip=10.0, im_split=False, im_type=jpeg, init_path=None, init_type=shift, input_rms=0.14142135623730953, l1_weight=1.0, log_spec=True, loss_types=['pit'], model_path=results/nets/sep/unet-pit/net.tf-160000, mono=False, multi_shift=False, net_style=no-im, normalize_rms=True, num_dbs=None, num_samples=173774, opt_method=adam, pad_stft=False, phase_type=pred, phase_weight=0.01, pit_weight=1.0, predict_bg=True, print_iters=10, profile_iters=None, resdir=/home/study/PycharmProjects/results/nets/sep/unet-pit, samp_sr=21000.0, sample_len=None, sampled_frames=248, samples_per_frame=700.7007007007007, show_iters=None, show_videos=False, slow_check_iters=10000, spec_len=512, spec_max=80.0, spec_min=-100.0, step_size=120000, subsample_frames=None, summary_iters=10, test_batch=10, test_list=../data/celeb-tf-v6-full/test/tf, total_frames=149, train_iters=160000, train_list=../data/celeb-tf-v6-full/train/tf, use_3d=True, use_sound=True, use_wav_gan=False, val_list=../data/celeb-tf-v6-full/val/tf, variable_frame_count=False, vid_dur=8.288, weightdecay=1e-05)
ffmpeg -loglevel error -ss 0.0 -i "data/translator.mp4" -safe 0 -t 8.338000000000001 -r 29.97 -vf scale=256:256 "/tmp/tmpw4889ppn/small%04d.png"
ffmpeg -loglevel error -ss 0.0 -i "data/translator.mp4" -safe 0 -t 8.338000000000001 -r 29.97 -vf "scale=-2:'min(600,ih)'" "/tmp/tmpw4889ppn/full_%04d.png"
ffmpeg -loglevel error -ss 0.0 -i "data/translator.mp4" -safe 0 -t 8.338000000000001 -ar 21000.0 -ac 2 "/tmp/tmpw4889ppn/sound.wav"
Running on:
2019-05-14 11:29:30.212532: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2019-05-14 11:29:30.329825: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-05-14 11:29:30.330229: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties: name: GeForce RTX 2070 major: 7 minor: 5 memoryClockRate(GHz): 1.62 pciBusID: 0000:01:00.0 totalMemory: 7.77GiB freeMemory: 7.19GiB
2019-05-14 11:29:30.330244: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2019-05-14 11:29:30.547596: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-05-14 11:29:30.547627: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929] 0
2019-05-14 11:29:30.547632: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0: N
2019-05-14 11:29:30.547797: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6920 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2070, pci bus id: 0000:01:00.0, compute capability: 7.5)
Raw spec length: [1, 514, 1025]
Truncated spec length: [1, 512, 1025]
('gen/conv1', [1, 512, 1024, 2], '->', [1, 512, 512, 64])
('gen/conv2', [1, 512, 512, 64], '->', [1, 512, 256, 128])
('gen/conv3', [1, 512, 256, 128], '->', [1, 256, 128, 256])
('gen/conv4', [1, 256, 128, 256], '->', [1, 128, 64, 512])
('gen/conv5', [1, 128, 64, 512], '->', [1, 64, 32, 512])
('gen/conv6', [1, 64, 32, 512], '->', [1, 32, 16, 512])
('gen/conv7', [1, 32, 16, 512], '->', [1, 16, 8, 512])
('gen/conv8', [1, 16, 8, 512], '->', [1, 8, 4, 512])
('gen/conv9', [1, 8, 4, 512], '->', [1, 4, 2, 512])
('gen/deconv1', [1, 4, 2, 512], '->', [1, 8, 4, 512])
('gen/deconv2', [1, 8, 4, 1024], '->', [1, 16, 8, 512])
('gen/deconv3', [1, 16, 8, 1024], '->', [1, 32, 16, 512])
('gen/deconv4', [1, 32, 16, 1024], '->', [1, 64, 32, 512])
('gen/deconv5', [1, 64, 32, 1024], '->', [1, 128, 64, 512])
('gen/deconv6', [1, 128, 64, 1024], '->', [1, 256, 128, 256])
('gen/deconv7', [1, 256, 128, 512], '->', [1, 512, 256, 128])
('gen/deconv8', [1, 512, 256, 256], '->', [1, 512, 512, 64])
('gen/fg', [1, 512, 512, 128], '->', [1, 512, 1024, 2])
('gen/bg', [1, 512, 512, 128], '->', [1, 512, 1024, 2])
Restoring from: results/nets/sep/unet-pit/net.tf-160000
predict samples shape: (1, 173774, 2)
samples pred shape: (1, 173774, 2)
(512, 1025)
Writing to: results/
ffmpeg -i "/tmp/ao_wmjz0ezg.wav" -r 29.970000 -loglevel warning -safe 0 -f concat -i "/tmp/ao_i2pwi0b8.txt" -pix_fmt yuv420p -vcodec h264 -strict -2 -y -acodec aac "results/fg_translator.mp4"
[wav @ 0x558b3f868b40] Estimating duration from bitrate, this may be inaccurate
[wav @ 0x558b3f868b40] Could not find codec parameters for stream 0 (Audio: none, 1065353216 Hz, 16256 channels, 9481256 kb/s): unknown codec
Consider increasing the value for the 'analyzeduration' and 'probesize' options
Unknown encoder 'h264'
Traceback (most recent call last):
  File "sep_video.py", line 455, in
    ut.make_video(full_ims, pr.fps, pj(arg.out, 'fg%s.mp4' % name), snd(full_samples_fg))
  File "/home/study/PycharmProjects/untitled/util.py", line 3176, in make_video
    % (sound_flags_in, fps, input_file, sound_flags_out, flags, out_fname))
  File "/home/study/PycharmProjects/untitled/util.py", line 917, in sys_check
    fail('Command failed! %s' % cmd)
  File "/home/study/PycharmProjects/untitled/util.py", line 14, in fail
    def fail(s = ''): raise RuntimeError(s)
RuntimeError: Command failed! ffmpeg -i "/tmp/ao_wmjz0ezg.wav" -r 29.970000 -loglevel warning -safe 0 -f concat -i "/tmp/ao_i2pwi0b8.txt" -pix_fmt yuv420p -vcodec h264 -strict -2 -y -acodec aac "results/fg_translator.mp4"

I ran the code exactly as described, on Python 3, and hit the error above. What is going wrong here? Thank you for your prompt reply!

ghost commented 5 years ago

I just want to separate a mixed audio file (WAV format), but I get many errors when I use a mix.wav in place of translator.mp4.

andrewowens commented 5 years ago

The version of ffmpeg that you have cannot write an h264 video, since it is missing the codec. You could reinstall it, or you could try a precompiled version: e.g. https://johnvansickle.com/ffmpeg/.
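A minimal sketch of how to check for the encoder and swap in the static build (the download URL, archive name, and install path below are assumptions based on the page linked above and the file the reporter mentions; the extracted directory name varies by release):

# List H.264 encoders; no output means this ffmpeg build cannot encode h264
ffmpeg -encoders | grep -i 264

# Fetch and unpack the static build, then put its ffmpeg binary on the PATH
wget https://johnvansickle.com/ffmpeg/releases/ffmpeg-release-amd64-static.tar.xz
tar -xf ffmpeg-release-amd64-static.tar.xz
sudo cp ffmpeg-*-amd64-static/ffmpeg /usr/local/bin/

After that, ffmpeg -encoders should list libx264, and the -vcodec h264 command issued by sep_video.py should resolve to it instead of failing with "Unknown encoder 'h264'".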

ghost commented 5 years ago

Thank you for the quick answer. Also, can I use a mix.wav in place of translator.mp4?

ghost commented 5 years ago

Excuse me, did you see my question? Could you give me a little more help? I don't know how to proceed after downloading "ffmpeg-release-amd64-static.tar.xz". What should I do next?