NaifahNurya opened 5 years ago
It's best to run the code in Python 2.7
Following that suggestion solved the earlier problem.
However, the following problem occurred. It seems to be an issue with the path, yet when looking in the folder, the txt files were created. How can this be solved?
Tracking VD in 435m 29s
Please wait for ranking! about 180min for VD
Ranking VD in 0m 0s
Please wait for training action! Needs 200min for 20epochs(VD).
data_confs Namespace(batch_size={'trainval': 300, 'test': 10}, data_type='img', dataset_folder='./dataset/VD\imgs_ranked', label_type='action')
Traceback (most recent call last):
File "GAR.py", line 37, in
I don't know why it creates two backslashes (\\) instead of one backslash (\) after VD in the path, as shown in the attached file, which elaborates on the bolded text above.
When looking in the folder, four files were created and stored in ./dataset/VD\imgs_ranked\
These files were created with the following names:
test_action.txt test_activity.txt trainval_action.txt trainval_activity
Welcome! I guess that it may be caused by the different operating systems. Our code is debugged on Linux; did you run it on Windows? An easy way to deal with file paths on Windows, Mac and Linux is shown at https://medium.com/@ageitgey/python-3-quick-tip-the-easy-way-to-deal-with-file-paths-on-windows-mac-and-linux-11a072b58d5f But I highly recommend you choose Ubuntu!
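For example, a minimal sketch of that idea using os.path.join, which works in both Python 2 and 3 and picks the right separator for the current OS (the folder names are just the ones from your log):

```python
import os

# os.path.join uses the correct separator for the current OS, so the same
# code yields 'dataset/VD/imgs_ranked' on Linux and 'dataset\\VD\\imgs_ranked'
# on Windows, instead of mixing '/' and '\\' by hand.
dataset_folder = os.path.join('.', 'dataset', 'VD', 'imgs_ranked')
label_file = os.path.join(dataset_folder, 'trainval_action.txt')

print(dataset_folder)
print(label_file)
```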
Yes, currently I run it on Windows.
Thank you for the link you recommended; I will follow it and try to fix the issue.
Hello, I got an Out of Memory error. Would reducing the batch_size in Action_Level.py from 10 to 5 solve the problem? It seems to be an issue with my GPU capacity. Or what else should I do?
The following is the error RuntimeError: CUDA out of memory. Tried to allocate 345.50 MiB (GPU 0; 8.00 GiB total capacity; 2.77 GiB already allocated; 3.08 GiB free; 429.08 MiB cached)
With the following traceback:
Traceback (most recent call last):
File "GAR.py", line 42, in
For action_level, the test batch_size must be 10, but the trainval batch_size can be reduced. The trainval batch_size should stay a multiple of 10 (N*10), such as 100 or 200.
Thank you. So I have to change Data_Configs.py at line 31 as shown below, e.g. from 300 to 200:
'batch_size': {'trainval_action': {'trainval': 300, 'test': 10}, 'extract_action_feas': {'trainval': 120, 'test': 120} }
And this extract_action_feas setting should remain as it is:
'extract_action_feas': {'trainval': 120, 'test': 120}
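For clarity, a sketch of how that line 31 might look after the change (only the trainval value for trainval_action is reduced; test stays at 10 and extract_action_feas is left untouched):

```python
# Sketch of the adjusted 'batch_size' entry in Data_Configs.py (line 31):
batch_size = {
    'trainval_action':     {'trainval': 200, 'test': 10},
    'extract_action_feas': {'trainval': 120, 'test': 120},
}
```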
The memory issue is cleared.
But I got the following error after finishing Epoch 9/9 and saving the best accuracy. The error starts when it begins extracting features. It seems to refer to the weights folder expecting action.pkl, but after epoch 9/9 that folder only contains best_wts.pkl. Do I need to change line 20 in Action_Level.py so that it refers to best_wts.pkl?
self.net.load_state_dict(torch.load('./weights/VD/action/action.pkl'))
The description is shown below.
Epoch: 9 phase: trainval Loss: 0.0019305260316276466 Acc: 0.8956937329967403
Running this epoch in 13m 35s
Epoch: 9 phase: test Loss: 0.0629564055343197 Acc: 0.8163081283009146
Running this epoch in 6m 48s
Best test Acc: 0.818176
Training action VD in 209m 46s
Please wait for extracting action_feas!
data_confs Namespace(batch_size={'trainval': 120, 'test': 120}, data_type='img', dataset_folder='./dataset/VD\imgs_ranked', label_type='activity')
AlexNet_LSTM(
(features): Sequential(
(0): Conv2d(3, 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
(1): ReLU(inplace)
(2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(3): Conv2d(64, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(4): ReLU(inplace)
(5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(6): Conv2d(192, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(7): ReLU(inplace)
(8): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(9): ReLU(inplace)
(10): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(11): ReLU(inplace)
(12): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(fc): Sequential(
(0): Dropout(p=0.5)
(1): Linear(in_features=9216, out_features=4096, bias=True)
(2): ReLU(inplace)
)
(LSTM): LSTM(4096, 3000, batch_first=True)
(classifier): Linear(in_features=3000, out_features=9, bias=True)
)
Traceback (most recent call last):
File "GAR.py", line 50, in
Action.extractFeas()
File "C:\Users\GROUP\Desktop\PCTDMGAR\Runtime\Action_Level.py", line 20, in extractFeas
self.net.load_state_dict(torch.load('./weights/VD/action/action.pkl'))
File "C:\Users\GROUP\Anaconda3\envs\GAR\lib\site-packages\torch\serialization.py", line 366, in load
f = open(f, 'rb')
FileNotFoundError: [Errno 2] No such file or directory: './weights/VD/action/action.pkl'
In the folder weights/VD/action/ there are only two files, as shown in the picture below.
Yes, you're right. And I forgot to fix it.
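Until that is fixed, one possible workaround is to fall back to best_wts.pkl when action.pkl is missing (a sketch only; the helper name is made up, and the paths are taken from the traceback above):

```python
import os
import torch

# Hypothetical helper: load action.pkl if it exists, otherwise fall back to
# best_wts.pkl, which is the file the training loop actually leaves behind.
def load_action_weights(net, weight_dir='./weights/VD/action'):
    for name in ('action.pkl', 'best_wts.pkl'):
        path = os.path.join(weight_dir, name)
        if os.path.exists(path):
            net.load_state_dict(torch.load(path))
            return path
    raise IOError('no action weights found in ' + weight_dir)
```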
What is the recommended amount of RAM? I got a MemoryError at the Action.extractFeas() stage, even after reducing extract_action_feas in Data_Configs.py at line 31 from 120 to 40 for both trainval and test.
'batch_size': {'trainval_action': {'trainval': 100, 'test': 10}, 'extract_action_feas': {'trainval': 40, 'test': 40} }
I run my experiment on a PC with 8 GB of RAM.
The error is shown below:
Epoch: 9 phase: trainval Loss: 0.0023482153126937293 Acc: 0.9154260924977329
Running this epoch in 13m 53s
Epoch: 9 phase: test Loss: 0.0678530701597602 Acc: 0.8146335179698571
Running this epoch in 6m 53s
Best test Acc: 0.819335
Training action VD in 212m 9s
Please wait for extracting action_feas!
data_confs Namespace(batch_size={'trainval': 40, 'test': 40}, data_type='img', dataset_folder='./dataset/VD/imgs_ranked', label_type='activity')
AlexNet_LSTM( ... same model summary as above ... )
trainval 34930.0
Traceback (most recent call last):
File "GAR.py", line 50, in
Also, I would appreciate any suggestions on this reported error.
I'm sorry, it may be caused by the numpy array: 'trainval.npy' is larger than 22 GB, so creating this matrix can take that much memory. I'm sorry I didn't consider the limitation of RAM. Since '.npy' does not support an append mode, there are the following solutions:
To facilitate follow-up work, it is recommended to increase the memory capacity.
PS: You can comment out the code before extract_fea to reduce time; there is no need to train action_level again!
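If increasing the RAM is not an option, one way to avoid holding the whole feature matrix in memory is to pre-allocate 'trainval.npy' on disk and fill it batch by batch through a numpy memmap; the sizes and the zero-filled placeholder below are only illustrative:

```python
import numpy as np

# Illustrative sizes; use the real sample count and feature length.
num_samples = 34930
fea_dim = 3000
batch_size = 40

# Pre-allocate trainval.npy on disk; the full (num_samples, fea_dim) matrix
# then never has to exist in RAM at once.
feas = np.lib.format.open_memmap(
    'trainval.npy', mode='w+', dtype=np.float32, shape=(num_samples, fea_dim))

for start in range(0, num_samples, batch_size):
    end = min(start + batch_size, num_samples)
    # Stand-in for the per-batch features produced by the network.
    batch_feas = np.zeros((end - start, fea_dim), dtype=np.float32)
    feas[start:end] = batch_feas

feas.flush()  # make sure everything is written to disk
```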
Thank you very much for your quick reply and recommendations.
Another issue was found.
It seems the file trainval.npy is not created as expected by line 38 (np.save(filename, feas)) in Action_Level.py. The error is below:
FileNotFoundError: [Errno 2] No such file or directory: 'dataset\VD\feas\activity\trainval.npy'
The description of the stage and the traceback are shown below:
Please wait for extracting action_feas!
data_confs Namespace(batch_size={'trainval': 40, 'test': 40}, data_type='img', dataset_folder='.\dataset\VD\imgs_ranked', label_type='activity')
AlexNet_LSTM( ... same model summary as above ... )
trainval 34930.0
Traceback (most recent call last):
File "GAR.py", line 50, in
1) Make sure that you execute 'python GAR.py' at C:\Users\GROUP\Desktop\PCTDMGAR\.
2) What does line 36 in 'Action_Level.py' print out? Please check carefully whether this file is actually created.
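If it turns out the parent folder is what is missing (np.save raises FileNotFoundError when the target directory does not exist), a small guard before the save call is a safe addition; the path below is taken from the error above and the feature matrix is only a stand-in:

```python
import os
import numpy as np

# Create the output directory before saving, since np.save does not create it.
filename = os.path.join('dataset', 'VD', 'feas', 'activity', 'trainval.npy')
out_dir = os.path.dirname(filename)
if not os.path.exists(out_dir):
    os.makedirs(out_dir)

feas = np.zeros((4, 4), dtype=np.float32)  # stand-in for the real feature matrix
np.save(filename, feas)
```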
Thank you, I made a mistake; now it is extracting features.
I got the following error after it finished step 0, while running GAR.py.
Please wait for tracking! about 240min for VD
the person imgs are saved at ./dataset/VD\imgs
trainval_videos: [0, 1, 2, 3, 6, 7, 8, 10, 12, 13, 15, 16, 17, 18, 19, 22, 23, 24, 26, 27, 28, 30, 31, 32, 33, 36, 38, 39, 40, 41, 42, 46, 48, 49, 50, 51, 52, 53, 54]
test_videos: [4, 5, 9, 11, 14, 20, 21, 25, 29, 34, 35, 37, 43, 44, 45, 47]
Traceback (most recent call last):
File "GAR.py", line 23, in
Pre.Processing(opt.dataset_root, opt.dataset_name, 'track')
File "C:\Users\GROUP\Desktop\PCTDMGAR\Pre\Processing.py", line 27, in init
eval(self.dataset_name + '_' + str.capitalize(operation))(self.dataset_root, dataset_confs, model_confs)
File "C:\Users\GROUP\Desktop\PCTDMGAR\Pre\VD_Track.py", line 14, in init
self.getTrainTest()
File "C:\Users\GROUP\Desktop\PCTDMGAR\Pre\VD_Track.py", line 62, in getTrainTest
for i in xrange(self.num_players*self.num_frames):
NameError: name 'xrange' is not defined
It seems the error comes from:
VD_Track.py, line 62, in getTrainTest
for i in xrange(self.num_players*self.num_frames):
NameError: name 'xrange' is not defined
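For reference, xrange only exists in Python 2, which is why this fails under Python 3. A minimal compatibility shim near the top of VD_Track.py (or simply replacing xrange with range) avoids the NameError:

```python
# Python 2/3 compatibility: Python 3 removed xrange, and its range is already
# lazy, so falling back to range is safe.
try:
    xrange
except NameError:
    xrange = range

# The loop at VD_Track.py line 62 then works unchanged, e.g.:
num_players, num_frames = 12, 10   # illustrative values only
for i in xrange(num_players * num_frames):
    pass
```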