Open AndrewZhao opened 8 years ago
I haven't used Python 3. Can you try Python 2.7? I bet it is an issue with Python 3. If you find a solution to this, please let me know. Thanks!
@weiliu89 Thanks for the reply. Sorry to disturb you again.
I tried to port the project to Python 3, but it doesn't work well.
Then I switched my environment to Python 2.7 and OpenCV 2.4.10, and ssd_pascal_video.py works well.
However, with test_batch_size=8 set in ssd_pascal_video.py, I get only about 20 fps, not 58 fps.
My GPU is a Titan X and cuDNN is enabled.
Is there anything wrong? Can you give me some suggestions?
Do you compile with cuDNN? I also suggest using cuDNN v5, which is even faster. Also make sure there is no other job running on the same GPU.
@weiliu89 Thanks for the reply.
I have compiled with cuDNN v5, and my Caffe Makefile.config is:
USE_CUDNN := 1
USE_OPENCV := 1
USE_LEVELDB := 1
USE_LMDB := 1
CUDA_DIR := /usr/local/cuda
CUDA_ARCH := -gencode arch=compute_20,code=sm_20 \
-gencode arch=compute_20,code=sm_21 \
-gencode arch=compute_30,code=sm_30 \
-gencode arch=compute_35,code=sm_35 \
-gencode arch=compute_50,code=sm_50 \
-gencode arch=compute_52,code=sm_52 \
-gencode arch=compute_50,code=compute_50\
BLAS := atlas
Is there a way to make sure the program uses cuDNN v5?
Besides, what other factors could make the program run slowly?
Thanks again.
Another problem is:
(1) The length of my video is 17 s.
(2) The fps printed on the detection video is around 20 fps.
(3) However, the time difference printed in the terminal is 12 s, which, taking the video as 30 fps, means the detection speed is 17 / 12 * 30 = about 42 fps.
I0721 23:32:59.509639 82602 caffe.cpp:252] Running for 536870911 iterations.
I0721 23:33:11.529994 82654 video_data_layer.cpp:119] Finished processing video.
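The estimate in (3) can be reproduced from the two log timestamps. This is a minimal sketch, not part of the SSD code: `glog_time` and `effective_fps` are hypothetical helper names, the 30 fps source frame rate is the same assumption used in the calculation above, and the parser ignores the date, so it does not handle a run that crosses midnight.

```python
from datetime import datetime

def glog_time(line):
    """Parse the HH:MM:SS.ffffff field of a glog line such as
    'I0721 23:32:59.509639 82602 caffe.cpp:252] ...'."""
    return datetime.strptime(line.split()[1], "%H:%M:%S.%f")

def effective_fps(start_line, end_line, video_seconds, source_fps=30.0):
    """Number of frames in the video divided by the wall-clock processing time."""
    elapsed = (glog_time(end_line) - glog_time(start_line)).total_seconds()
    return video_seconds * source_fps / elapsed

start = "I0721 23:32:59.509639 82602 caffe.cpp:252] Running for 536870911 iterations."
end = "I0721 23:33:11.529994 82654 video_data_layer.cpp:119] Finished processing video."

# 17 s of video at an assumed 30 fps, processed between the two timestamps.
print(round(effective_fps(start, end, video_seconds=17), 1))
```

With the exact timestamps the elapsed time is slightly over 12 s, so the result stays close to the 42 fps estimate.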
(1) When I run ssd_pascal_video.py, it reports some red warnings:
I0722 21:07:59.552475 110637 net.cpp:399] data -> data
[h264 @ 0x696fc00] too many thread_release_buffer calls!
[h264 @ 0x696fc00] too many thread_release_buffer calls!
[h264 @ 0x696fc00] too many thread_release_buffer calls!
[h264 @ 0x696fc00] too many thread_release_buffer calls!
[h264 @ 0x696fc00] too many thread_release_buffer calls!
[h264 @ 0x696fc00] too many thread_release_buffer calls!
[h264 @ 0x696fc00] too many thread_release_buffer calls!
[h264 @ 0x696fc00] too many thread_release_buffer calls!
[h264 @ 0x696fc00] too many thread_release_buffer calls!
[h264 @ 0x696fc00] too many thread_release_buffer calls!
[h264 @ 0x696fc00] too many thread_release_buffer calls!
[h264 @ 0x696fc00] too many thread_release_buffer calls!
[h264 @ 0x696fc00] too many thread_release_buffer calls!
[h264 @ 0x696fc00] too many thread_release_buffer calls!
[h264 @ 0x696fc00] too many thread_release_buffer calls!
[h264 @ 0x696fc00] too many thread_release_buffer calls!
[h264 @ 0x696fc00] too many thread_release_buffer calls!
I0722 21:07:59.829990 110637 video_data_layer.cpp:73] output data size: 8,3,300,300
My OpenCV version is 2.4.10. Could these warnings make the program run slowly?
(2) If I comment out USE_CUDNN := 1 in Makefile.config, the MNIST training time is 150 s.
If I uncomment USE_CUDNN := 1 in Makefile.config, the MNIST training time is 19 s.
The cuDNN v5 files are libcudnn.so.5.0.5, libcudnn.so.5, and libcudnn.so.
That should show that I compiled Caffe with cuDNN v5.
(3) Running score_ssd_pascal.py, the detection time is:
I0722 20:57:25.828037 110577 net.cpp:684] Ignoring source layer mbox_loss
I0722 20:57:26.435151 110577 blocking_queue.cpp:50] Data layer prefetch queue empty
I0722 20:59:19.816159 110577 solver.cpp:531] Test net output #0: detection_eval = 0.721751
Does that mean the detection speed is 4952 / 115 = 43 fps?
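On point (2), the presence of the libcudnn files shows that cuDNN v5 is installed. A more direct check, sketched here under assumptions, is to list what the built Caffe library links against using `ldd`. The helper names `linked_cudnn` and `check_library` are hypothetical, the location of libcaffe.so depends on the build, and `subprocess.run(..., capture_output=True)` needs Python 3.7 or later.

```python
import subprocess

def linked_cudnn(ldd_output):
    """Return the lines of ldd output that mention libcudnn."""
    return [line.strip() for line in ldd_output.splitlines() if "libcudnn" in line]

def check_library(path):
    """Run ldd on a shared library and return its cuDNN dependencies.

    `path` is the location of the built libcaffe.so, which depends on
    how and where Caffe was compiled.
    """
    result = subprocess.run(["ldd", path], capture_output=True, text=True, check=True)
    return linked_cudnn(result.stdout)
```

Calling `check_library` on the built libcaffe.so should return a non-empty list naming the cuDNN version if the build really linked it. An empty list would suggest the library was compiled without USE_CUDNN.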
@weiliu89 Thanks for the reply.
(1) I have found that Stack Overflow page, but it refers to an ffmpeg problem. I followed the instructions to reinstall OpenCV and update h264 and ffmpeg, but the problem still occurs. There are few pages about "h264 too many thread_release_buffer calls!"; I only found this one: ubuntu h264.
So I wonder whether this problem could make the program run slowly, because the frames are not fully extracted.
Sorry to ask this question. I'll fix it myself.
(2) Do you mean that the MNIST training time is strange? I'll check it soon.
Thanks for providing these suggestions.
Best regards.
According to your previous comment, it takes about 150s to train MNIST (for a certain number of iterations, I guess), while with cuDNN v5 it takes about 19s. The gap seems huge (7.89x faster). It might be true, but I am just surprised by the speedup brought by cuDNN :)
@weiliu89 In the new version, which contains ssd_pascal_video.py, there is an error when I run
import caffe
File "/home/zhaoboya/caffe-ssd/python/caffe/__init__.py", line 8, in <module>
from .net_spec import layers, params, NetSpec, to_proto
File "/home/zhaoboya/caffe-ssd/python/caffe/net_spec.py", line 244, in <module>
_param_names = param_name_dict()
File "/home/zhaoboya/caffe-ssd/python/caffe/net_spec.py", line 36, in param_name_dict
param_type_names = [type(getattr(layer, s)).__name__ for s in param_names]
File "/home/zhaoboya/caffe-ssd/python/caffe/net_spec.py", line 36, in <listcomp>
param_type_names = [type(getattr(layer, s)).__name__ for s in param_names]
AttributeError: 'LayerParameter' object has no attribute 'transform_param'
But when I change caffe_pb2.py back to the original version, which does not contain ssd_pascal_video.py, import caffe works and the MNIST test passes. I compared the two caffe_pb2.py files; the differences are in serialized_start, serialized_end, and some code about "VIDEO". My environment is Python 3. How can I fix this problem? Sorry to interrupt you again, thanks.