UttaranB127 / speech2affective_gestures

This is the official implementation of the paper "Speech2AffectiveGestures: Synthesizing Co-Speech Gestures with Generative Adversarial Affective Expression Learning".
https://gamma.umd.edu/s2ag/

data_loader self.num_total_samples = 0 #5

Closed · zf223669 closed this issue 2 years ago

zf223669 commented 2 years ago

Hi, in processor_v2.py I found that self.num_total_samples = 0. I tried to debug and found that the variable n_samples in self.data_loader['train_data_s2ag'] / ['eval_data_s2ag'] / ['test_data_s2ag'] is zero. What is the problem? Thanks!

zf223669 commented 2 years ago

Second, I intend to train the model; however, it showed: FileNotFoundError: [Errno 2] No such file or directory: 'outputs/trimodal_gen.pth.tar'

UttaranB127 commented 2 years ago

Is your data loaded correctly into the code, or are there any warnings there? Also, does your "outputs" folder exist? Otherwise, the network may not be able to create the "trimodal_gen.pth.tar" file.
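If the folder is missing, creating it before training avoids the failure. A minimal sketch; the 'outputs' path is taken from the error message above:

```python
import os

# Ensure the checkpoint directory exists before the code tries to read or
# write 'outputs/trimodal_gen.pth.tar' (path from the error above).
os.makedirs('outputs', exist_ok=True)
```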

UttaranB127 commented 2 years ago

Following up on the "trimodal_gen.pth.tar" not-found error: we do have the trained weights available. However, we do not own the code for generating these weights, and the results should be verified with the original source.

Amir3022 commented 2 years ago

Greetings, I still have the same problem of self.num_total_samples = 0, and the train, eval, and test samples are all zero. The only warning I get when trying to run the model so far is "Warning : load_model does not return WordVectorModel or SupervisedModel any more, but a FastText object which is very similar.", which, as far as I know, is a deprecation warning and shouldn't cause any problems. I have the ted_db data and the fasttext 'crawl_300d_2M_subword.bin' in the right locations, although I downloaded the fasttext bin file and the 'NRC_VAD_Lexicon' from sources not provided in this repo. Any help with this issue?

UttaranB127 commented 2 years ago

Are the folders lmdb_train_s2ag_v2_cache_mfcc_14, lmdb_val_s2ag_v2_cache_mfcc_14, and lmdb_test_s2ag_v2_cache_mfcc_14 generated? Each of these folders should contain two files, data.mdb and lock.mdb. The size of each lock.mdb is 8 KB and the data.mdb files have sizes of a few GBs (the one for 'train' is a few hundred GBs). If these files are present but the sizes do not match, please delete them and rerun the code to re-generate them.
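A quick sketch for checking the caches before training; the folder names and expected sizes come from the comment above, and the ted_db location is an assumption:

```python
import os

TED_DB = '../data/ted_db'  # assumed dataset location; adjust to your setup

for split in ('train', 'val', 'test'):
    cache = os.path.join(TED_DB, f'lmdb_{split}_s2ag_v2_cache_mfcc_14')
    for name in ('data.mdb', 'lock.mdb'):
        path = os.path.join(cache, name)
        if not os.path.isfile(path):
            print(f'MISSING: {path}')
        else:
            # lock.mdb should be ~8 KB; data.mdb should be GBs
            # (a few hundred GBs for the train split).
            print(f'{path}: {os.path.getsize(path) / 1e9:.3f} GB')
```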

zf223669 commented 2 years ago

Hello: I did follow your suggestion; however, I got another error:

```
/home/zf223669/Mount/anaconda3/envs/s2ag/bin/python3.7 /home/zf223669/Mount/s2ag/s2ag/main_v2.py -c /home/zf223669/Mount/s2ag/s2ag/config/multimodal_context_v2.yml ../data/NRC-VAD-Lexicon-Aug2018Release/NRC-VAD-Lexicon.txt ../data
Reading data '../data/ted_db/lmdb_train'...
Found the cache ../data/ted_db/lmdb_train_s2ag_v2_cache_mfcc_14
Reading data '../data/ted_db/lmdb_val'...
Found the cache ../data/ted_db/lmdb_val_s2ag_v2_cache_mfcc_14
Reading data '../data/ted_db/lmdb_test'...
Found the cache ../data/ted_db/lmdb_test_s2ag_v2_cache_mfcc_14
building a language model...
loaded from ../data/ted_db/vocab_models_s2ag/vocab_cache.pkl
++++++++++++++0
+++0
+++0
Training s2ag with batch size: 512
Loading train cache took 1 seconds.
Loading eval cache took 1 seconds.
Traceback (most recent call last):
  File "/home/zf223669/Mount/s2ag/s2ag/main_v2.py", line 132, in <module>
    pr.train()
  File "/home/zf223669/Mount/s2ag/s2ag/processor_v2.py", line 979, in train
    self.trimodal_generator.load_state_dict(trimodal_checkpoint['trimodal_gen_dict'])
  File "/home/zf223669/Mount/anaconda3/envs/s2ag/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1483, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for PoseGeneratorTriModal:
    size mismatch for text_encoder.embedding.weight: copying a param with shape torch.Size([29460, 300]) from checkpoint, the shape in current model is torch.Size([26619, 300]).
```
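The mismatch means the checkpoint was trained with a 29460-word vocabulary while the locally built model has a 26619-word one. A minimal diagnostic sketch (not the repo's confirmed fix) for comparing the two shapes before calling load_state_dict; trimodal_generator stands for the PoseGeneratorTriModal instance built in processor_v2.py:

```python
import torch

# Load the checkpoint on CPU and read off the embedding shape; the key
# 'trimodal_gen_dict' comes from the traceback above.
ckpt = torch.load('outputs/trimodal_gen.pth.tar', map_location='cpu')
ckpt_vocab = ckpt['trimodal_gen_dict']['text_encoder.embedding.weight'].shape[0]
local_vocab = trimodal_generator.state_dict()['text_encoder.embedding.weight'].shape[0]
print(f'checkpoint vocab size: {ckpt_vocab}, local vocab size: {local_vocab}')
# If the sizes differ, the local vocab_cache.pkl was likely built from a
# different copy of the TED data than the one used to train the checkpoint;
# rebuilding the vocab cache from the matching data should make them agree.
```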

Amir3022 commented 2 years ago

Those files are generated in the "ted_db" folder, but each cache is only around 16 KB (the whole folder, including data.mdb and lock.mdb). Granted, I only have 50 GB of free space on the hard drive; on the first run it complained when I had only 40 GB, so I increased that to 50 GB and the code ran properly until I hit this problem. So how much free space does the model need to run properly?

I was hoping to use the pre-trained model, since I don't need to retrain it and would like to run inference directly. However, there is no --train argument in main_v2.py. There is a similar one, --train-s2ag, but setting that to False still requires the data to be present, and I end up with the same error as described before. So is it possible to use the model without retraining, using the provided pre-trained weights?

UttaranB127 commented 2 years ago

I have debugged some command-line argument issues to make sure the code does not require the full training data to be loaded if you just want to test the network. However, you still need the lmdb_train, lmdb_val, and lmdb_test folders with their data.mdb and lock.mdb files, as they contain relevant metadata. You do not need the additional cache or npz folders.
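For anyone verifying this setup, a short sketch using the Python lmdb package to confirm the three databases open and to report their entry counts (paths assumed from the logs in this thread):

```python
import lmdb

# Open each database read-only and report how many entries it holds.
for split in ('train', 'val', 'test'):
    env = lmdb.open(f'../data/ted_db/lmdb_{split}', readonly=True, lock=False)
    with env.begin() as txn:
        print(f"lmdb_{split}: {txn.stat()['entries']} entries")
    env.close()
```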

UttaranB127 commented 2 years ago

I am actually unable to replicate this error as my trimodal_generator matches all keys successfully. I have re-uploaded the file trimodal_gen.pth.tar from a different local address. Maybe try downloading it again?

Amir3022 commented 2 years ago

Okay, I will try the model again tomorrow and see if I can run inference without retraining. I will keep you updated with my results and any remaining issues.

zf223669 commented 2 years ago

May I suggest that you describe the installation, configuration, and running process in more detail, such as which directory the downloaded dataset should be placed in, how to configure the running parameters, etc.? Thank you!

UttaranB127 commented 2 years ago

The running parameters are straightforward and easily available from the command-line argument descriptions as well as the main paper. The readme already contains details on where each download should be placed; I will add more details based on the issues that are being raised and resolved. Meanwhile, let me know if your current issue is resolved.

zf223669 commented 2 years ago

Hi, I have downloaded the Trinity Gesture Dataset; however, it contains many files. Which one should I load?

zf223669 commented 2 years ago

Hello, I re-cloned the whole project, modified the base path, created the data folder, and put the fasttext, GENEA_Challenge_2000_data_release, NRC_VAD_Lexicon_Aug2018Release, and ted_db datasets in it, ready to run main_v2.py. I typed the command python3 main_v2.py -c /home/zf223669/Mount/speech2affective_gestures/config/multimodal_context_v2.yml and ran it. However, it showed an error about the mfcc, shown below:

```
/home/zf223669/Mount/anaconda3/envs/s2ag/bin/python3.7 /home/zf223669/Mount/speech2affective_gestures/main_v2.py -c /home/zf223669/Mount/speech2affective_gestures/config/multimodal_context_v2.yml
Reading data '/home/zf223669/Mount/speech2affective_gestures/data/ted_db/lmdb_train'...
Creating the dataset cache...
Traceback (most recent call last):
  File "/home/zf223669/Mount/speech2affective_gestures/main_v2.py", line 121, in <module>
    train_data_ted, val_data_ted, test_data_ted = loader.load_ted_db_data(data_path, s2ag_config_args, args.train_s2ag)
  File "/home/zf223669/Mount/speech2affective_gestures/loader_v2.py", line 580, in load_ted_db_data
    remove_word_timing=(config_args.input_context == 'text')
  File "/home/zf223669/Mount/speech2affective_gestures/loader_v2.py", line 482, in __init__
    data_sampler.run()
  File "/home/zf223669/Mount/speech2affective_gestures/utils/data_preprocessor.py", line 56, in run
    filtered_result = self._sample_from_clip(vid, clip)
  File "/home/zf223669/Mount/speech2affective_gestures/utils/data_preprocessor.py", line 140, in _sample_from_clip
    sample_mfcc_combined = get_mfcc_features(sample_audio, sr=16000, num_mfcc=self.num_mfcc)
  File "/home/zf223669/Mount/speech2affective_gestures/utils/common.py", line 342, in get_mfcc_features
    mfcc_features = mfcc(audio, sr=sr, n_mfcc=num_mfcc) / 1000.
TypeError: mfcc() takes 0 positional arguments but 1 positional argument (and 2 keyword-only arguments) were given
```

What is the problem? :(
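For what it's worth, this error pattern matches newer librosa releases (0.10 and later), where librosa.feature.mfcc only accepts keyword arguments, so the positional call in utils/common.py is rejected. If that is the cause, passing the audio as the y= keyword should restore the call; a sketch under that assumption, with the names taken from the traceback:

```python
import librosa

def get_mfcc_features(audio, sr=16000, num_mfcc=14):
    # librosa >= 0.10 makes mfcc() keyword-only; passing the audio as y=
    # works on both older and newer versions.
    return librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=num_mfcc) / 1000.
```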

zf223669 commented 2 years ago

I have tried many times, but it keeps showing some weird problems. May I recommend that you clone your project in another place and try to fix the bugs? I am now stuck at lang_model = pickle.load(f), shown below:

```
/home/zf223669/Mount/anaconda3/envs/s2ag/bin/python3.7 /home/zf223669/Mount/speech2affective_gestures/main_v2.py -c /home/zf223669/Mount/speech2affective_gestures/config/multimodal_context_v2.yml
Reading data '/home/zf223669/Mount/speech2affective_gestures/data/ted_db/lmdb_train'...
Found the cache /home/zf223669/Mount/speech2affective_gestures/data/ted_db/lmdb_train_s2ag_v2_cache_mfcc_14
Reading data '/home/zf223669/Mount/speech2affective_gestures/data/ted_db/lmdb_val'...
Found the cache /home/zf223669/Mount/speech2affective_gestures/data/ted_db/lmdb_val_s2ag_v2_cache_mfcc_14
Reading data '/home/zf223669/Mount/speech2affective_gestures/data/ted_db/lmdb_test'...
Found the cache /home/zf223669/Mount/speech2affective_gestures/data/ted_db/lmdb_test_s2ag_v2_cache_mfcc_14
building a language model...
loaded from /home/zf223669/Mount/speech2affective_gestures/data/ted_db/vocab_models_s2ag/vocab_cache.pkl
Traceback (most recent call last):
  File "/home/zf223669/Mount/speech2affective_gestures/main_v2.py", line 121, in <module>
    train_data_ted, val_data_ted, test_data_ted = loader.load_ted_db_data(data_path, s2ag_config_args, args.train_s2ag)
  File "/home/zf223669/Mount/speech2affective_gestures/loader_v2.py", line 611, in load_ted_db_data
    config_args.wordembed_dim)
  File "/home/zf223669/Mount/speech2affective_gestures/utils/vocab_utils.py", line 29, in build_vocab
    lang_model = pickle.load(f)
ModuleNotFoundError: No module named 'model'
```
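This ModuleNotFoundError usually means vocab_cache.pkl was pickled when the language-model class lived in a module named model, which no longer exists under that name in this codebase. One common workaround is to remap the module name while unpickling; a sketch, where the target module utils.vocab_utils is a hypothetical stand-in for wherever the class now lives:

```python
import pickle

class RenamingUnpickler(pickle.Unpickler):
    """Redirects classes pickled under the old 'model' module to their
    current location (the target module below is a guess; adjust it)."""
    def find_class(self, module, name):
        if module == 'model' or module.startswith('model.'):
            module = 'utils.vocab_utils'  # assumed new home of the class
        return super().find_class(module, name)

with open('vocab_cache.pkl', 'rb') as f:
    lang_model = RenamingUnpickler(f).load()
```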

UttaranB127 commented 2 years ago

Let me look into these errors for you. They seem a bit unusual, and I don't recall coming across them myself. It might be a case of version mismatches or some missing files, but I will try to replicate your errors and update the repo and readme accordingly. It might take some time, though.

zf223669 commented 2 years ago

Thank you!! :)

UttaranB127 commented 2 years ago

Hi, it turns out that some of the packages have been deprecated without backward compatibility since we released our code. As a result, preprocessing the data and later running training/inference require different versions of basic packages such as numpy. Supporting this needs strict versioning and modularization of our codebase, which is beyond our scope at the moment. To work around the issue, I am uploading the preprocessed dataset in a single folder that you can download and use directly for training/inference. Please keep the entire contents of the download in the folder data/ted_db and point the variable data_path in main_v2.py to the data folder. I have also revised the code to make data loading faster. Let me know if you still face issues.
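For reference, the directory layout implied by the paths in this thread (assumed from the logs above, not an authoritative listing):

```python
# Layout implied by the logs in this thread (names assumed, not authoritative):
#
# data/
#   ted_db/
#     lmdb_train/  lmdb_val/  lmdb_test/       # each with data.mdb + lock.mdb
#     lmdb_train_s2ag_v2_cache_mfcc_14/        # generated caches (likewise for val/test)
#     vocab_models_s2ag/vocab_cache.pkl
#
# main_v2.py then points at the parent folder, e.g.:
data_path = '../data'
```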

UttaranB127 commented 2 years ago

Also, I am closing this issue, as it has grown more general than the original title suggests. Please post any follow-ups in issue #7.