EvelynFan / FaceFormer

[CVPR 2022] FaceFormer: Speech-Driven 3D Facial Animation with Transformers
MIT License

Bugs when running the demo #3

Closed · cavalleria closed this 2 years ago

cavalleria commented 2 years ago

The cmd is python demo.py --model_name vocaset --wav_path "demo/wav/test.wav" --dataset vocaset --vertice_dim 15069 --feature_dim 64 --period 30 --fps 30 --train_subjects "FaceTalk_170728_03272_TA FaceTalk_170904_00128_TA FaceTalk_170725_00137_TA FaceTalk_170915_00223_TA FaceTalk_170811_03274_TA FaceTalk_170913_03279_TA FaceTalk_170904_03276_TA FaceTalk_170912_03278_TA" --test_subjects "FaceTalk_170809_00138_TA FaceTalk_170731_00024_TA" --condition FaceTalk_170913_03279_TA --subject FaceTalk_170809_00138_TA

output:

Some weights of the model checkpoint at facebook/wav2vec2-base-960h were not used when initializing Wav2Vec2Model: ['lm_head.bias', 'lm_head.weight']

cavalleria commented 2 years ago

virtual environment packages:

Package Version


absl-py 1.0.0 antlr4-python3-runtime 4.8 appdirs 1.4.4 argon2-cffi 20.1.0 async-generator 1.10 attrs 21.4.0 audiolazy 0.6 audioread 2.1.9 backcall 0.2.0 bleach 4.1.0 boto3 1.21.1 boto3-stubs 1.21.1 botocore 1.24.1 botocore-stubs 1.24.1 brotlipy 0.7.0 cachetools 4.2.4 certifi 2021.10.8 cffi 1.15.0 charset-normalizer 2.0.12 click 8.0.4 cryptography 35.0.0 cycler 0.11.0 debugpy 1.5.1 decorator 5.1.1 defusedxml 0.7.1 dill 0.3.4 docker-pycreds 0.4.0 entrypoints 0.3 filelock 3.6.0 flake8 3.8.2 fonttools 4.30.0 freetype-py 2.2.0 gitdb 4.0.9 GitPython 3.1.27 google-api-core 2.5.0 google-auth 2.0.2 google-auth-oauthlib 0.4.6 google-cloud-core 2.2.2 google-cloud-storage 1.42.3 google-crc32c 1.3.0 google-resumable-media 2.2.1 googleapis-common-protos 1.54.0 grpcio 1.43.0 hub 2.2.3 huggingface-hub 0.4.0 humbug 0.2.7 idna 3.3 imageio 2.16.1 importlib-metadata 4.11.0 iniconfig 1.1.1 ipykernel 6.4.1 ipython 7.31.1 ipython-genutils 0.2.0 ipywidgets 7.6.5 jedi 0.18.1 Jinja2 3.0.2 jmespath 0.10.0 joblib 1.1.0 jsonschema 3.2.0 jupyter 1.0.0 jupyter-client 7.1.2 jupyter-console 6.4.0 jupyter-contrib-core 0.3.3 jupyter-contrib-nbextensions 0.5.1 jupyter-core 4.9.1 jupyter-highlight-selected-word 0.2.0 jupyter-latex-envs 1.4.6 jupyter-nbextensions-configurator 0.4.1 jupyterlab-pygments 0.1.2 jupyterlab-widgets 1.0.0 kiwisolver 1.3.2 librosa 0.9.1 llvmlite 0.37.0 lxml 4.8.0 lz4 4.0.0 Markdown 3.3.6 MarkupSafe 2.0.1 matplotlib 3.5.1 matplotlib-inline 0.1.2 mccabe 0.6.1 miniaudio 1.46 mistune 0.8.4 mkl-fft 1.3.1 mkl-random 1.2.2 mkl-service 2.4.0 multiprocess 0.70.12.2 mypy-boto3-cloudformation 1.21.0 mypy-boto3-dynamodb 1.21.0 mypy-boto3-ec2 1.21.1 mypy-boto3-lambda 1.21.0 mypy-boto3-rds 1.21.0 mypy-boto3-s3 1.21.0 mypy-boto3-sqs 1.21.0 nbclient 0.5.3 nbconvert 6.3.0 nbformat 5.1.3 nest-asyncio 1.5.1 networkx 2.7.1 notebook 6.4.8 numba 0.54.1 numcodecs 0.9.1 numpy 1.22.3 oauthlib 3.2.0 olefile 0.46 omegaconf 2.1.1 opencv-python 4.5.5.64 packaging 21.3 pandas 1.4.1 pandocfilters 1.5.0 
parso 0.8.3 pathos 0.2.8 pathtools 0.1.2 pexpect 4.8.0 pickleshare 0.7.5 Pillow 9.0.1 pip 21.2.4 pluggy 1.0.0 pooch 1.6.0 pox 0.3.0 ppft 1.6.6.4 prometheus-client 0.13.1 promise 2.3 prompt-toolkit 3.0.20 protobuf 3.19.4 psbody-mesh 0.4 psutil 5.9.0 ptyprocess 0.7.0 py 1.11.0 pyasn1 0.4.8 pyasn1-modules 0.2.8 pycodestyle 2.6.0 pycparser 2.21 pyflakes 2.2.0 pyglet 1.5.22 Pygments 2.11.2 PyOpenGL 3.1.6 pyOpenSSL 22.0.0 pyparsing 3.0.7 pypinyin 0.46.0 pyrender 0.1.45 pyrsistent 0.18.0 PySocks 1.7.1 pytest 7.0.1 python-dateutil 2.8.2 python-speech-features 0.6 pytz 2021.3 PyYAML 6.0 pyzmq 22.3.0 qtconsole 5.2.2 QtPy 1.11.2 regex 2022.3.15 requests 2.27.1 requests-oauthlib 1.3.1 resampy 0.2.2 rsa 4.8 s3transfer 0.5.1 sacremoses 0.0.49 scikit-learn 1.0.2 scipy 1.8.0 Send2Trash 1.8.0 sentencepiece 0.1.96 sentry-sdk 1.5.6 setproctitle 1.2.2 setuptools 60.10.0 shortuuid 1.0.8 sip 4.19.13 six 1.16.0 smmap 5.0.0 SoundFile 0.10.3.post1 tensorboard 2.8.0 tensorboard-data-server 0.6.1 tensorboard-plugin-wit 1.8.1 tensorboardX 2.4.1 termcolor 1.1.0 terminado 0.13.1 testpath 0.5.0 TextGrid 1.5 threadpoolctl 3.1.0 tokenizers 0.11.6 tomli 2.0.1 torch 1.10.0 torchaudio 0.10.0 torchsummaryX 1.3.0 torchvision 0.11.1 tornado 6.1 tqdm 4.62.3 traitlets 5.1.1 transformers 4.17.0 trimesh 3.10.3 typeguard 2.13.3 types-click 7.1.8 types-requests 2.27.10 types-urllib3 1.26.9 typing_extensions 4.0.1 urllib3 1.26.8 wandb 0.12.11 wcwidth 0.2.5 webencodings 0.5.1 Werkzeug 2.0.3 wheel 0.37.1 widgetsnbextension 3.5.2 yapf 0.32.0 yaspin 2.1.0 zipp 3.7.0

EvelynFan commented 2 years ago

The cmd is python demo.py --model_name vocaset --wav_path "demo/wav/test.wav" --dataset vocaset --vertice_dim 15069 --feature_dim 64 --period 30 --fps 30 --train_subjects "FaceTalk_170728_03272_TA FaceTalk_170904_00128_TA FaceTalk_170725_00137_TA FaceTalk_170915_00223_TA FaceTalk_170811_03274_TA FaceTalk_170913_03279_TA FaceTalk_170904_03276_TA FaceTalk_170912_03278_TA" --test_subjects "FaceTalk_170809_00138_TA FaceTalk_170731_00024_TA" --condition FaceTalk_170913_03279_TA --subject FaceTalk_170809_00138_TA

output:

Some weights of the model checkpoint at facebook/wav2vec2-base-960h were not used when initializing Wav2Vec2Model: ['lm_head.bias', 'lm_head.weight']

  • This IS expected if you are initializing Wav2Vec2Model from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
  • This IS NOT expected if you are initializing Wav2Vec2Model from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).

Some weights of Wav2Vec2Model were not initialized from the model checkpoint at facebook/wav2vec2-base-960h and are newly initialized: ['wav2vec2.masked_spec_embed']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

Traceback (most recent call last):
  File "demo.py", line 204, in <module>
    main()
  File "demo.py", line 200, in main
    test_model(args)
  File "demo.py", line 57, in test_model
    prediction = model.predict(audio_feature, template, one_hot)
  File "/evo_860/yaobin.li/workspace/FaceFormer/faceformer.py", line 140, in predict
    hidden_states = self.audio_encoder(audio, self.dataset).last_hidden_state
  File "/home/yaobin.li/soft/miniconda3/envs/wenet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/evo_860/yaobin.li/workspace/FaceFormer/wav2vec.py", line 135, in forward
    encoder_outputs = self.encoder(
  File "/home/yaobin.li/soft/miniconda3/envs/wenet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/yaobin.li/soft/miniconda3/envs/wenet/lib/python3.8/site-packages/transformers/models/wav2vec2/modeling_wav2vec2.py", line 812, in forward
    position_embeddings = self.pos_conv_embed(hidden_states)
  File "/home/yaobin.li/soft/miniconda3/envs/wenet/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/yaobin.li/soft/miniconda3/envs/wenet/lib/python3.8/site-packages/transformers/models/wav2vec2/modeling_wav2vec2.py", line 446, in forward
    hidden_states = hidden_states.transpose(1, 2)
AttributeError: 'tuple' object has no attribute 'transpose'
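For context, the final AttributeError is generic Python behaviour: at that point the code receives a tuple where it expects a tensor, presumably because the repo's patched wav2vec.py and the newer transformers release disagree about whether a bare tensor or a wrapped result is passed to `pos_conv_embed`. A minimal sketch of the failure mode, using a hypothetical `FakeTensor` standing in for `torch.Tensor` so no torch install is needed:

```python
class FakeTensor:
    """Hypothetical stand-in for torch.Tensor, for illustration only."""
    def transpose(self, dim0, dim1):
        return self

# A tuple wrapping the tensor, as the traceback shows pos_conv_embed received
hidden_states = (FakeTensor(),)

try:
    hidden_states.transpose(1, 2)  # the call at modeling_wav2vec2.py line 446
except AttributeError as exc:
    print(exc)  # tuples have no tensor methods

# Unwrapping the first element gives the object the code actually expects
unwrapped = hidden_states[0].transpose(1, 2)
```

This is only an illustration of why the exception is raised, not a patch; the actual fix is matching the transformers version, as noted below.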

Hi, the expected version of transformers is 4.6.1; the environment listed above has 4.17.0, which leads to this error. Thanks for pointing it out. I will add the version info.
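One way to catch this mismatch early is to check the installed version before running the demo. A minimal sketch, assuming only that the package is installed under the name `transformers` (the 4.6.1 pin comes from the reply above):

```python
from importlib.metadata import PackageNotFoundError, version

EXPECTED = "4.6.1"  # version stated in the maintainer's reply

try:
    installed = version("transformers")
except PackageNotFoundError:
    installed = None

if installed != EXPECTED:
    # e.g. installed == "4.17.0" reproduces the tuple/transpose crash above
    print(f"transformers=={EXPECTED} expected, found {installed}; "
          f"try: pip install transformers=={EXPECTED}")
```

Exact string equality is a deliberately strict check here; a range check would also work if nearby patch releases turn out to be compatible.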