Not able to run the pre-trained model

Rohith-coder1 commented 1 year ago

Hi sir, I cloned the repo and installed all dependencies, but when I try to run it, it throws an error saying "unrecognized arguments".

Rohith-coder1 commented 1 year ago

Screenshot 2022-12-16 083437

UttaranB127 commented 1 year ago

Please note that the command line takes named arguments, so you have to write them in the format python main_v2.py --<arg name> <arg val>.

Rohith-coder1 commented 1 year ago

Yeah, I have given it in that format.

Rohith-coder1 commented 1 year ago

parameter

UttaranB127 commented 1 year ago

I re-checked the code now and it runs correctly on my machine. I made a slight change in the path for parse_args.py, so pulling the latest code may help with your issue. Also:

Have you tried running with the default arguments (i.e., not providing any explicit command line arguments)?
Can you share your command in a text format so I can try to test it if needed?

Rohith-coder1 commented 1 year ago

Command: python main_v2.py --dataset-s2ag ted_db --dataset-test ted_db -c True --frame-drop 2 --train-s2ag False --use-multiple-gpus T --s2ag-load-last-best True --batch-size 512 --num-worker 4 --s2ag-start-epoch 290 --s2ag-num-epoch 500 --base-tr 1 --step 0.5 --lr-s2ag-decay 0.999 --gradient-clip 0.1 --nesterov True --momentum 0.9 --weight-decay 9.591 --upper-body-weight 1 --affs-reg 0.8 --quat-norm-reg 0.1 --quat-reg 1.2 --recons-reg 1.2 --val-interval 1 --log-interval 200 --save-interval 10 --no-cuda False --pavi-log False --print-log True --save-log True

UttaranB127 commented 1 year ago

Thanks for sharing, I'll take a look when I get some time. Meanwhile, I have checked that the code works without any of the arguments except for --config (where you have to provide the path to a .yml file), so let me know if that also works for you.

UttaranB127 commented 1 year ago

Looking into your command-line call, I noticed a few errors:

The "-c"/"--config" argument takes in the path to a .yml file, not a boolean.
The arguments "--no-cuda", "--pavi-log", "--print-log", and "--save-log" do not take any values; you just include those arguments if you need to perform the relevant actions.
The argument "--nesterov" did not take an argument (like the ones in the previous point), but it should. I've fixed the argument parsing so it now takes a boolean argument. Please make sure you pull the latest code to reflect these changes on your end.

Here is a corrected version of your command-line call:

python main_v2.py --dataset-s2ag ted_db --dataset-test ted_db -c config/multimodal_context_v2.yml --frame-drop 2 --train-s2ag False --use-multiple-gpus T --s2ag-load-last-best True --batch-size 512 --num-worker 4 --s2ag-start-epoch 290 --s2ag-num-epoch 500 --base-tr 1 --step 0.5 --lr-s2ag-decay 0.999 --gradient-clip 0.1 --nesterov True --momentum 0.9 --weight-decay 9.591 --upper-body-weight 1 --affs-reg 0.8 --quat-norm-reg 0.1 --quat-reg 1.2 --recons-reg 1.2 --val-interval 1 --log-interval 200 --save-interval 10 --no-cuda --pavi-log --print-log --save-log
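
The difference between value-taking arguments and bare flags can be illustrated with a minimal argparse sketch. This is not the repo's parse_args.py: the helper str2bool, the function build_parser, and the default for --nesterov are hypothetical names and values chosen for illustration, and only four of the arguments are shown.

```python
import argparse


def str2bool(value: str) -> bool:
    """Parse common spellings of true/false.

    A plain type=bool would not work here, because bool("False") is True.
    """
    lowered = value.lower()
    if lowered in ("true", "t", "yes", "1"):
        return True
    if lowered in ("false", "f", "no", "0"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    # Illustrative parser only; the real argument list is much longer.
    parser = argparse.ArgumentParser(description="illustrative sketch")
    # Takes a value: the path to a .yml file.
    parser.add_argument("-c", "--config", type=str)
    # Takes a value: an explicit boolean (default here is an assumption).
    parser.add_argument("--nesterov", type=str2bool, default=True)
    # Bare flags: present means True, absent means False, no value allowed.
    parser.add_argument("--no-cuda", action="store_true")
    parser.add_argument("--print-log", action="store_true")
    return parser


args = build_parser().parse_args(
    ["-c", "config/multimodal_context_v2.yml", "--nesterov", "True", "--no-cuda"]
)
```

With a parser shaped like this, writing "--no-cuda False" leaves "False" as a stray token, which argparse reports as an unrecognized argument.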

Rohith-coder1 commented 1 year ago

Thank you so much. Will execute and update.

Rohith-coder1 commented 1 year ago

I tried running with the updated command. I am running on a Windows machine with no GPU, and I am again getting some path issues. I set my project and data paths correctly in main_v2.py. Is there any other place where I should set the paths?

Rohith-coder1 commented 1 year ago

Screenshot 2022-12-20 131051

UttaranB127 commented 1 year ago

Yes, please make sure the paths are correct in both the main python file and the .yml config file.

Rohith-coder1 commented 1 year ago

Screenshot 2022-12-20 131448

UttaranB127 commented 1 year ago

The paths in loader_v2.py come from the .yml file, so please make sure those are accurate.

Rohith-coder1 commented 1 year ago

I changed all the paths, and now the code runs, but I am getting a few errors due to CUDA. I changed the default of no-cuda to True. Is that the correct procedure, or is there another place where I have to change the CUDA settings?

Rohith-coder1 commented 1 year ago

Screenshot 2022-12-23 153252

UttaranB127 commented 1 year ago

Try without the multiple GPU flag, so just using a single GPU. The parallelization code may have some issues due to pytorch versioning, which will require separate debugging.

Rohith-coder1 commented 1 year ago

Okay, but I do not have a GPU in my system. So the code won't work on systems with no GPU?

UttaranB127 commented 1 year ago

If you check line 93 in processor_v2.py (previous commit), the code automatically switches to CPU if no GPU is available. I have made this more explicit in the code so it pre-emptively follows the --no-cuda argument when the argument is present and made a new commit. You can pull the latest changes or just copy lines 93 to 105 in processor_v2.py.
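
The fallback described above can be sketched as a small decision function. This is a hedged illustration, not the code in processor_v2.py: select_device is a hypothetical name, and the "cuda:0" string is an assumption. In PyTorch, the cuda_available value would normally come from torch.cuda.is_available(), and the returned string could be passed to torch.device.

```python
def select_device(no_cuda: bool, cuda_available: bool) -> str:
    """Return the device name to use.

    The --no-cuda flag is honored first; otherwise fall back to the CPU
    whenever no GPU is available.
    """
    if no_cuda or not cuda_available:
        return "cpu"
    return "cuda:0"


# A machine without a GPU ends up on the CPU with or without the flag.
device_without_gpu = select_device(no_cuda=False, cuda_available=False)
```

Keeping the decision in one place means the rest of the code only needs to move the model and tensors to the selected device.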

Rohith-coder1 commented 1 year ago

Still getting the same error

Rohith-coder1 commented 1 year ago

I tried running the code on a system with a GPU and everything works fine, but I am getting an error at "Caching test data 0/26245".

Rohith-coder1 commented 1 year ago

Screenshot 2022-12-29 181513

UttaranB127 commented 1 year ago

Check the file path. The error simply says that the file path is incorrect. If you don't have the preprocessed dataset, do not set the -dap flag.

UttaranB127 commented 1 year ago

Can you report the error stack trace when running on CPU? I am not able to replicate the error on my machine.

Rohith-coder1 commented 1 year ago

Hi, I fixed the previous error, but now I am getting a "model not found" error. When I download the .pth file from the link you gave, it arrives with a .pth.tar file name, but when I extract it I do not get a .pth file; it is just a normal folder named archive containing data.pkl and version.

Is this the correct way of extracting a .pth.tar file? I even tried keeping the .pth.tar model file in the models folder and giving that path in the code, but it still shows the same "model not found" error.

Rohith-coder1 commented 1 year ago

> Can you report the error stack trace when running on CPU? I am not able to replicate the error on my machine.

Yeah, I will retry and update you.

UttaranB127 commented 1 year ago

> Hi, I fixed the previous error, but now I am getting a "model not found" error. When I download the .pth file from the link you gave, it arrives with a .pth.tar file name, but when I extract it I do not get a .pth file; it is just a normal folder named archive containing data.pkl and version.
>
> Is this the correct way of extracting a .pth.tar file? I even tried keeping the .pth.tar model file in the models folder and giving that path in the code, but it still shows the same "model not found" error.

Keep the .pth.tar file as is, no need to extract anything.
In the directory where you're keeping the .pth.tar file, has the code created a log file (it should create a log file automatically)? If it has no log files, as a quick fix, create an empty log.txt file and keep it there. Then the .pth.tar file should load correctly. Essentially, the code validates the model directory by looking for the presence of the log file.
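
The directory check described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the logic of get_epoch_and_loss in processor_v2.py: find_checkpoint is a hypothetical name, the log file is assumed to be called log.txt, and picking the last file in sorted order is an arbitrary choice. As background, the archive, data.pkl and version entries seen after extraction are consistent with the zip-based format written by torch.save, which torch.load reads directly from the unextracted file.

```python
from pathlib import Path
from typing import Optional


def find_checkpoint(model_dir: str) -> Optional[Path]:
    """Return a .pth.tar file from model_dir, or None if the directory
    does not look like a valid model directory.

    Assumption: a directory counts as valid only when it contains log.txt.
    """
    directory = Path(model_dir)
    if not directory.is_dir():
        return None
    if not (directory / "log.txt").is_file():
        return None
    checkpoints = sorted(directory.glob("*.pth.tar"))
    # Arbitrary choice for this sketch: take the last name in sorted order.
    return checkpoints[-1] if checkpoints else None
```

Printing the resolved directory and the result of a check like this is a quick way to see whether the failure comes from the path, the missing log file, or the checkpoint file name.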

Rohith-coder1 commented 1 year ago

I tried creating a log.txt file, but the "model not found" error still persists.

UttaranB127 commented 1 year ago

Can you try debugging the code on your machine to make sure the model path is being read correctly? Can you check which return call of the method get_epoch_and_loss in processor_v2.py (line 53) is getting activated? If you cannot determine any apparent cause for why the model loading should fail, could you please report the full stack trace of the error?

Rohith-coder1 commented 1 year ago

There is an error in caching the test data: the folder is created, but the 000000.npz file is not generated.

Rohith-coder1 commented 1 year ago

UttaranB127 commented 1 year ago

Could you please copy-paste the command-line code and the text of the stack trace instead of pasting a screenshot? The text lets me copy-paste directly and saves a lot of time when running searches or trying to reproduce the errors.

Rohith-coder1 commented 1 year ago

Command line code : python main_v2.py --dataset-s2ag ted_db --dataset-test ted_db -c config/multimodal_context_v2.yml --frame-drop 2 --train-s2ag False --use-multiple-gpus T --s2ag-load-last-best True --batch-size 512 --num-worker 4 --s2ag-start-epoch 290 --s2ag-num-epoch 500 --base-tr 1 --step 0.5 --lr-s2ag-decay 0.999 --gradient-clip 0.1 --nesterov True --momentum 0.9 --weight-decay 9.591 --upper-body-weight 1 --affs-reg 0.8 --quat-norm-reg 0.1 --quat-reg 1.2 --recons-reg 1.2 --val-interval 1 --log-interval 200 --save-interval 10 --no-cuda --pavi-log --print-log --save-log

Reading data 'data\ted_db\lmdb_test_s2ag_v2_cache_mfcc_14'...
Found the cache data\ted_db\lmdb_test_s2ag_v2_cache_mfcc_14_s2ag_v2_cache_mfcc_14
building a language model...
loaded from data\ted_db\vocab_models_s2ag\vocab_cache.pkl
Total s2ag testing data: 26245 (100.00%)
Caching test data 0/26245.
Traceback (most recent call last):
  File "main_v2.py", line 128, in <module>
    pr = processor.Processor(base_path, args, s2ag_config_args, data_loader, pose_dim, coords, audio_sr)
  File "C:\Users\sandy1902\Speech2Gestures\speech2affective_gestures\processor_v2.py", line 209, in __init__
    self.save_cache('test', test_dir_name)
  File "C:\Users\sandy1902\Speech2Gestures\speech2affective_gestures\processor_v2.py", line 328, in save_cache
    vid_indices=vid_indices_all[k])
  File "<__array_function__ internals>", line 6, in savez_compressed
  File "C:\Users\sandy1902\anaconda3\envs\S2D\lib\site-packages\numpy\lib\npyio.py", line 687, in savez_compressed
    _savez(file, args, kwds, True)
  File "C:\Users\sandy1902\anaconda3\envs\S2D\lib\site-packages\numpy\lib\npyio.py", line 713, in _savez
    zipf = zipfile_factory(file, mode="w", compression=compression)
  File "C:\Users\sandy1902\anaconda3\envs\S2D\lib\site-packages\numpy\lib\npyio.py", line 112, in zipfile_factory
    return zipfile.ZipFile(file, *args, **kwargs)
  File "C:\Users\sandy1902\anaconda3\envs\S2D\lib\zipfile.py", line 1240, in __init__
    self.fp = io.open(file, filemode)
FileNotFoundError: [Errno 2] No such file or directory: 'Speech2Gestures/speech2affective_gestures\data/ted_db\ted_db\npz\test\test\000000.npz'

UttaranB127 commented 1 year ago

I've fixed the pathing issue. Could you try one more time with the new code?
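
For readers hitting a similar FileNotFoundError: the path in the trace mixes separators and appears to repeat the ted_db and test components, and opening a file for writing fails whenever its parent directory does not exist. The sketch below shows one general way to guard against both; it is not the change made in the repo. The function name cache_file_path and the data/<dataset>/npz/<split> layout are assumptions for illustration.

```python
import os


def cache_file_path(base_dir: str, dataset: str, split: str, index: int) -> str:
    """Build the path of one cached .npz file and make sure its parent
    directory exists before anything tries to write to it.

    Each component is joined exactly once with os.path.join, so the
    separators are consistent on both Windows and Linux.
    """
    directory = os.path.join(base_dir, "data", dataset, "npz", split)
    os.makedirs(directory, exist_ok=True)
    # Zero-padded six-digit file name, e.g. 000000.npz for index 0.
    return os.path.normpath(os.path.join(directory, f"{index:06d}.npz"))
```

A path produced this way can then be handed to a writer such as np.savez_compressed, which does not create missing parent directories on its own.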

UttaranB127 / speech2affective_gestures

Not able to run the pre-trained model #20