Open manza-ari opened 6 years ago
try python3
Thank you so much for the reply.
How and where should I use python3?
I am using Ubuntu 16 and I have python3, but I didn't understand your answer.
Please make sure you are using python3. On my machine, python2 gives the same error as yours.
After switching to python3, everything works.
OK, thank you so much for your reply. I will try.
Where exactly do you want me to use python3? I am new to this area, and I have not run any python command in the entire procedure. Could you please elaborate?
Thanks, @dongzhuoyao!
Hi @kanza-ali, lines 2 and 3 of install.sh call python. Could you please try replacing "python" with "python3" in install.sh?
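To confirm which interpreter a script actually runs under, one option is to check `sys.version_info`. This is only a minimal sketch; the helper name `is_python3` is made up for illustration and is not part of the repository.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True when the given version tuple belongs to Python 3 or later."""
    # version_info[0] is the major version (2 for python2, 3 for python3).
    return version_info[0] >= 3


if __name__ == "__main__":
    print("Running under Python %d.%d" % (sys.version_info[0], sys.version_info[1]))
    print("Python 3 or later:", is_python3())
```

Running the file once with `python` and once with `python3` shows which command maps to which interpreter on a given machine.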
Thank you for your reply.
I made the changes and ran install.sh again, but I got this output.
Could you please elaborate on what your question is on this output? Those warnings are expected, and it looks like it succeeded :)
OK, thank you so much!
I ran train.py
and the output is as follows:
There is an error at the end. What is it about? I cannot find anything about it on the internet.
How many GPUs do you use?
For any of the GPU numbers, it gives this error:
raise AssertionError("Invalid device id")
AssertionError: Invalid device id
This happens whether I run the first training command for UCF or for HMDB:
python3 train.py --lr 0.0003 --batch-size 40 --arch resnet152 \
    --data-name hmdb51 --representation iframe \
    --data-root data/hmdb51/mpeg4_videos \
    --train-list data/datalists/hmdb51_split1_train.txt \
    --test-list data/datalists/hmdb51_split1_test.txt \
    --model-prefix hmdb51_iframe_model \
    --lr-steps 55 110 165 --epochs 220 \
    --gpus 0 1
OR
python3 train.py --lr 0.0003 --batch-size 80 --arch resnet152 \
    --data-name ucf101 --representation iframe \
    --data-root data/ucf101/mpeg4_videos \
    --train-list data/datalists/ucf101_split1_train.txt \
    --test-list data/datalists/ucf101_split1_test.txt \
    --model-prefix ucf101_iframe_model \
    --lr-steps 150 270 390 --epochs 510 \
    --gpus 0 1 2 3
Did you make the following changes before running the training commands given under the USAGE heading?
from coviar import load
load([input], [gop_index], [frame_index], [representation_type], [accumulate])
Hi @dongzhuoyao,
How long does training take for this project, for a single dataset such as UCF101?
So far I have only run this, with 8 GPUs:
python3 train.py --lr 0.0003 --batch-size 80 --arch resnet152 \
    --data-name ucf101 --representation iframe \
    --data-root data/ucf101/mpeg4_videos \
    --train-list data/datalists/ucf101_split1_train.txt \
    --test-list data/datalists/ucf101_split1_test.txt \
    --model-prefix ucf101_iframe_model \
    --lr-steps 150 270 390 --epochs 510 \
    --gpus 0 1 2 3 4 5 6 7
Hi, I ran into the same problem: AssertionError.
I'm new to this area and my computer has 1 GPU. Could you please tell me which parameters I should change? Thank you!
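The "Invalid device id" assertion usually means that an id passed to --gpus does not exist on the machine. PyTorch numbers GPUs from 0, so a single-GPU machine only has id 0. Below is a small sketch of that check, assuming train.py hands the --gpus list to PyTorch as device ids (my reading, not confirmed in this thread). `split_gpu_ids` is a hypothetical helper; on a real machine `available` would come from `torch.cuda.device_count()`.

```python
def split_gpu_ids(requested, available):
    """Split requested GPU ids into (valid, invalid) for a machine with `available` GPUs."""
    # Valid ids run from 0 to available - 1.
    valid = [i for i in requested if 0 <= i < available]
    invalid = [i for i in requested if not 0 <= i < available]
    return valid, invalid


# Example: a machine with one GPU, command run with "--gpus 0 1".
valid, invalid = split_gpu_ids([0, 1], available=1)
print("usable ids:", valid)            # usable ids: [0]
print("ids that would fail:", invalid)  # ids that would fail: [1]
```

Under that assumption, on a machine with one GPU the training commands above would end in `--gpus 0`.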
Hi @JGyoung33
First try training the motion vector and residual models, which use the resnet18 arch, with a smaller batch size. If they run and give results on 1 GPU, everything is set up correctly and you need server-class hardware to train iframes with the resnet152 arch. If not, please share your error.
Hi, thank you for replying. Now I have a new error. I started by training iframe, and since I only have 1 GPU, I set --gpus to 0. The new errors seem to be related to a PyTorch function, and I can't find how to resolve them.
When I trained iframe on another computer with two GPUs, it ran but stopped partway through training with another error, "cuda: out of memory". Reducing the batch size did not help.
How much memory does each GPU you are using for iframes have? What batch size are you using? I suggest you first try MV and residuals.
Hi, @kanza-ali
I use a 1080 Ti to train iframes with the batch size set to 5, and it still does not work. After the datasets are augmented, it prints "Could not open input stream" as below, but it can continue.
When I trained MV, some videos failed to decode, like this:
I think these errors are related to my PyTorch DataLoader settings, such as num_workers, but I am not familiar with it.
Is your FFmpeg working?
I have never run into such errors while doing this project. BTW, I am also a beginner in this area.
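One quick way to answer that question is to ask the ffmpeg binary for its version. The sketch below only checks the command-line tool found on PATH; the video loader may link against FFmpeg libraries that were built separately, so a passing check here does not prove the loader is fine. The helper name `tool_version` is made up for illustration.

```python
import shutil
import subprocess


def tool_version(name):
    """Return the first line of `<name> -version`, or None if the tool is not on PATH."""
    path = shutil.which(name)
    if path is None:
        return None
    result = subprocess.run([path, "-version"], stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, universal_newlines=True)
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""


if __name__ == "__main__":
    print(tool_version("ffmpeg") or "ffmpeg not found on PATH")
```

The first line of the output normally includes the FFmpeg version and the compiler it was built with, which is the same information compared later in this thread.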
I think it's working, since I use it to produce mpeg4-format videos.
I have now uninstalled ffmpeg and I get the same errors. It seems uninstalling ffmpeg doesn't affect other programs and it is only used when converting raw format to mpeg4 format. Is that right?
BTW, I compiled ffmpeg using gcc 4.8. Is that version too old? Looking forward to your reply, thank you~
I am using Ubuntu 16 and I have installed ffmpeg version N-90418-g74c6a6d built with gcc 5.4.0 (Ubuntu 5.4.0-6ubuntu1~16.04.10)
Hi, @kanza-ali
I upgraded gcc to version 6.0 and now I can train iframe, thank you. But during training (up to the 2nd epoch) there is another error. If I train from the command line, it reports "Cuda: out of memory". In comparison, if I train iframes in VS Code debug mode, the following appears:
Thanks for the update. To train iframes with a batch size of 80, I used 4 GPUs with the following changes in train.py:
model = torch.nn.DataParallel(model, device_ids=range(torch.cuda.device_count()))
model.cuda()
The above changes should also solve your "Cuda: out of memory" error.
Sorry, I cannot comment on the error you shared; as I said, I am only a beginner in this area. You can post your error on the PyTorch forum.
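Some rough arithmetic on why more GPUs can ease memory pressure: `DataParallel` splits each input batch along its first dimension across the listed devices, so each GPU only holds a slice of the batch. The sketch below mimics a `torch.chunk`-style split in plain Python. The exact split used by a given PyTorch version is an assumption here, and `per_gpu_batch_sizes` is a hypothetical helper, not part of train.py.

```python
import math


def per_gpu_batch_sizes(batch_size, num_gpus):
    """Sizes of the slices when a batch is cut into at most num_gpus near-equal chunks.

    num_gpus must be at least 1.
    """
    chunk = math.ceil(batch_size / num_gpus)  # size of every chunk except possibly the last
    sizes = []
    remaining = batch_size
    while remaining > 0:
        take = min(chunk, remaining)
        sizes.append(take)
        remaining -= take
    return sizes


print(per_gpu_batch_sizes(80, 4))  # [20, 20, 20, 20]
print(per_gpu_batch_sizes(80, 1))  # [80]
print(per_gpu_batch_sizes(5, 2))   # [3, 2]
```

So a batch of 80 on 4 GPUs puts about 20 samples on each device, while the same batch on a single GPU has to fit all 80 at once.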
@kanza-ali
Anyway, thank you. If I have any updates, I will share them with you. Thanks again.
Hi. Can you tell me your OpenCV version?
My version is 3.1
Thank you~ BTW, what are your CUDA and cuDNN versions? I guess they may be related to my failure.
CUDA 8
Don't worry, keep trying.
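When comparing setups like this, it can help to print all the relevant versions in one go. This is a sketch, assuming PyTorch and OpenCV are importable as `torch` and `cv2`; missing packages are reported rather than raising. `collect_versions` is a made-up helper name.

```python
import sys


def collect_versions():
    """Gather version strings for the components discussed in this thread."""
    versions = {"python": sys.version.split()[0]}
    try:
        import torch
        versions["torch"] = torch.__version__
        # CUDA version PyTorch was built with (None on CPU-only builds);
        # this can differ from the system-wide CUDA toolkit.
        versions["cuda"] = torch.version.cuda
        # cuDNN version as an integer, or None if cuDNN is unavailable.
        versions["cudnn"] = torch.backends.cudnn.version()
    except ImportError:
        versions["torch"] = "not installed"
    try:
        import cv2
        versions["opencv"] = cv2.__version__
    except ImportError:
        versions["opencv"] = "not installed"
    return versions


if __name__ == "__main__":
    for name, version in sorted(collect_versions().items()):
        print(name, version)
```

Posting this output alongside an error report makes it easier for others to spot a version mismatch.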
I have the same problem. I have tried python3, but it doesn't work.
Hi Sir,
Could you kindly check this error? How can I resolve it?