Open Rohith-coder1 opened 1 year ago
Please note that the command line takes named arguments, so you have to write in the format of python main_v2.py --<arg name> <arg val>
yeah I have given it in that format
I re-checked the code and it runs correctly on my machine. I made a slight change to the path for parse_args.py, so maybe try pulling it to see if that helps with your issue? Also:
Command: python main_v2.py --dataset-s2ag ted_db --dataset-test ted_db -c True --frame-drop 2 --train-s2ag False --use-multiple-gpus T --s2ag-load-last-best True --batch-size 512 --num-worker 4 --s2ag-start-epoch 290 --s2ag-num-epoch 500 --base-tr 1 --step 0.5 --lr-s2ag-decay 0.999 --gradient-clip 0.1 --nesterov True --momentum 0.9 --weight-decay 9.591 --upper-body-weight 1 --affs-reg 0.8 --quat-norm-reg 0.1 --quat-reg 1.2 --recons-reg 1.2 --val-interval 1 --log-interval 200 --save-interval 10 --no-cuda False --pavi-log False --print-log True --save-log True
Thanks for sharing, I'll take a look when I get some time. Meanwhile, I have checked that the code works without any of the arguments except for --config (where you have to provide the path to a .yml file), so let me know if that also works for you.
Looking into your command-line call, I noticed a few errors:
Here is a corrected version of your command-line call:
python main_v2.py --dataset-s2ag ted_db --dataset-test ted_db -c config/multimodal_context_v2.yml --frame-drop 2 --train-s2ag False --use-multiple-gpus T --s2ag-load-last-best True --batch-size 512 --num-worker 4 --s2ag-start-epoch 290 --s2ag-num-epoch 500 --base-tr 1 --step 0.5 --lr-s2ag-decay 0.999 --gradient-clip 0.1 --nesterov True --momentum 0.9 --weight-decay 9.591 --upper-body-weight 1 --affs-reg 0.8 --quat-norm-reg 0.1 --quat-reg 1.2 --recons-reg 1.2 --val-interval 1 --log-interval 200 --save-interval 10 --no-cuda --pavi-log --print-log --save-log
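For anyone hitting the same "unrecognized arguments" error, here is a minimal sketch of how argparse treats value-taking options versus switches. It assumes the options that appear without a value in the corrected command (such as --no-cuda) are declared as value-less switches; the parser below is hypothetical and is not a copy of parse_args.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser for illustration only; not the repository's parse_args.py.
    parser = argparse.ArgumentParser(prog="demo")
    # A value-taking option: the next token is consumed as its value.
    parser.add_argument("-c", "--config", type=str, required=True,
                        help="path to a .yml config file")
    parser.add_argument("--batch-size", type=int, default=8)
    # Assumed to be a value-less switch: present means True, absent means False.
    parser.add_argument("--no-cuda", action="store_true")
    return parser


def leftover_tokens(argv):
    """Return the tokens argparse could not assign to any declared option."""
    _, unknown = build_parser().parse_known_args(argv)
    return unknown


# The switch is given without a value, so everything is consumed.
good = build_parser().parse_args(
    ["-c", "config/multimodal_context_v2.yml", "--no-cuda"])

# Giving the switch a value leaves "False" unassigned; with parse_args()
# this is what surfaces as "error: unrecognized arguments: False".
bad_leftovers = leftover_tokens(
    ["-c", "config/multimodal_context_v2.yml", "--no-cuda", "False"])
```

Note that a call like -c True would parse without complaint in this sketch, because "True" is simply taken as the config path string; the failure would only show up later when the file is opened.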
Thank you so much. Will execute and update.
I tried running with the updated command on a Windows machine with no GPU, and I am again getting some path issues. I set my project and data paths correctly in main_v2.py. Is there any other place where I should set the paths?
Yes, please make sure the paths are correct in both the main python file and the .yml config file.
The paths in loader_v2.py come from the .yml file, so please make sure those are accurate.
I changed all the paths and the code is now running, but I am getting a few errors due to CUDA. I changed the default of no-cuda to True. Is that the correct procedure, or is there another place where I have to change the CUDA settings?
Try without the multiple GPU flag, so just using a single GPU. The parallelization code may have some issues due to PyTorch versioning, which will require separate debugging.
Okay, but I do not have a GPU in my system. So the code won't work on systems with no GPU?
If you check line 93 in processor_v2.py (previous commit), the code automatically switches to CPU if no GPU is available. I have made this more explicit in the code so it pre-emptively follows the --no-cuda argument when the argument is present, and made a new commit. You can pull the latest changes or just copy lines 93 to 105 in processor_v2.py.
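The fallback described above can be sketched roughly as follows. This is not the actual code from lines 93 to 105 of processor_v2.py; the function name select_device and its parameters are hypothetical, and the availability check is passed in as a plain boolean so the logic can be read (and run) without PyTorch installed.

```python
def select_device(no_cuda: bool, cuda_available: bool) -> str:
    """Pick a device string: honour --no-cuda first, then fall back to CPU
    when no GPU is available. Hypothetical helper for illustration only."""
    if no_cuda or not cuda_available:
        return "cpu"
    return "cuda:0"


# In a real run the second argument would typically come from
# torch.cuda.is_available(); here it is hard-coded to mimic a CPU-only machine.
device_on_cpu_only_machine = select_device(no_cuda=False, cuda_available=False)
device_with_flag = select_device(no_cuda=True, cuda_available=True)
device_with_gpu = select_device(no_cuda=False, cuda_available=True)
```

Checking the flag before the availability test means a machine with a GPU still runs on CPU when --no-cuda is given, which is the behaviour the comment above describes.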
Still getting the same error
I tried running the code on a system with a GPU and everything works fine, but I am getting an error at "Caching test data 0/26245".
Check the file path. The error simply says that the file path is incorrect. If you don't have the preprocessed dataset, do not set the -dap flag.
Can you report the error stack trace when running on CPU? I am not able to replicate the error on my machine.
Yeah will re-try once and will update you.
Hi, I fixed the previous error, but now I am getting a "model not found" error. When I download the .pth file from the link you gave, it arrives with a .pth.tar file name, and when I extract it I do not get a .pth file. It is just a normal folder with the names archive, data.pkl and version.
Is this the correct way of extracting a .pth.tar file? I even tried keeping the .pth.tar model file in the models folder and gave that path in the code, but it still shows the same "model not found" error.
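A note on the extraction question: since PyTorch 1.6, torch.save writes a zip archive by default, and unpacking such a file yields entries like data.pkl and version, which matches what is described above. That suggests the downloaded file is probably meant to be passed to torch.load as-is rather than extracted, whatever its suffix. Below is a minimal sketch for checking what kind of file sits at a given path; the helper name describe_checkpoint and the demo file names are hypothetical, and the demo builds a stand-in archive rather than a real model.

```python
import os
import tempfile
import zipfile


def describe_checkpoint(path: str) -> str:
    """Classify a checkpoint path without loading it (hypothetical helper)."""
    if not os.path.isfile(path):
        return "missing"   # wrong path, or the path points at a folder
    if zipfile.is_zipfile(path):
        return "zip"       # zip-format checkpoint: load directly, do not extract
    return "other"         # legacy pickle, real tarball, or something else


# Demo with a stand-in archive that mimics the entries seen after extraction.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo.pth.tar")
    with zipfile.ZipFile(demo, "w") as zf:
        zf.writestr("archive/data.pkl", b"")
        zf.writestr("archive/version", "3\n")
    demo_kind = describe_checkpoint(demo)
    missing_kind = describe_checkpoint(os.path.join(tmp, "nope.pth.tar"))
    folder_kind = describe_checkpoint(tmp)
```

If the real file reports "missing", the "model not found" error is more likely a path or file-name mismatch than a problem with the checkpoint itself.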
I tried creating a log.txt file, but the "model not found" error still persists.
Can you try debugging the code on your machine to make sure the model path is being read correctly? Can you check which return call of the method get_epoch_and_loss in processor_v2.py (line 53) is getting activated? If you cannot determine any apparent cause for why the model loading should fail, could you please report the full stack trace of the error?
There is an error in caching the test data: the folder is created, but the 000000.npz file is not generated.
Could you please copy-paste the command-line call and the text of the stack trace instead of pasting a screenshot? Having the text lets me copy-paste it and saves a lot of time when running searches or trying to reproduce the errors.
Command line code : python main_v2.py --dataset-s2ag ted_db --dataset-test ted_db -c config/multimodal_context_v2.yml --frame-drop 2 --train-s2ag False --use-multiple-gpus T --s2ag-load-last-best True --batch-size 512 --num-worker 4 --s2ag-start-epoch 290 --s2ag-num-epoch 500 --base-tr 1 --step 0.5 --lr-s2ag-decay 0.999 --gradient-clip 0.1 --nesterov True --momentum 0.9 --weight-decay 9.591 --upper-body-weight 1 --affs-reg 0.8 --quat-norm-reg 0.1 --quat-reg 1.2 --recons-reg 1.2 --val-interval 1 --log-interval 200 --save-interval 10 --no-cuda --pavi-log --print-log --save-log
Reading data 'data\ted_db\lmdb_test_s2ag_v2_cache_mfcc_14'...
Found the cache data\ted_db\lmdb_test_s2ag_v2_cache_mfcc_14_s2ag_v2_cache_mfcc_14
building a language model...
loaded from data\ted_db\vocab_models_s2ag\vocab_cache.pkl
Total s2ag testing data: 26245 (100.00%)
Caching test data 0/26245.
Traceback (most recent call last):
File "main_v2.py", line 128, in
I've fixed the pathing issue. Could you try one more time with the new code?
Hi sir, I cloned the repo and installed all dependencies, but when I try to run it, it throws an error saying "unrecognized arguments".