Open geekvc opened 7 years ago
Hi,
I've pushed a fix for this issue: https://github.com/farrajota/fast-rcnn-torch/commit/0a286bce56f36d0200147f503b9108df42bd55dc. It was due to a tensor type mismatch in the fastrcnn package. It should work properly now. Do a git pull and luarocks make rocks/* on the fastrcnn repo to get the fix.
Thank you very much. After pulling the fastrcnn repo, it works properly now! The test only gives the final accuracy of each category and the mAP. I want to know whether it is possible to save the results for each image, such as the coordinates of the bounding boxes and the category of each bounding box. That way we could selectively visualize some detection results.
The demo.lua file does some interactive visualizations on random images, but it doesn't store the results to disk. You can use the demo code and save the scores and boxes to a file. Basically you just have to do something like torch.save(filename, {scores, bboxes}) after this line.
That's great, thank you very much!
I retrained the vgg16 network on the coco dataset to get the final model, and I tested the detection accuracy on the coco dataset with the coco test mode. When the test finished, it seems that the final result with the coco evaluation metric was not given. Is it because I don't have enough memory?
test: 40500/40504 dev: 1, forward time: 0.857, select time: 0.758s, nms time: 0.758s, total time: 1.636s
test: 40501/40504 dev: 1, forward time: 0.498, select time: 0.677s, nms time: 0.677s, total time: 1.199s
test: 40502/40504 dev: 1, forward time: 0.477, select time: 0.638s, nms time: 0.638s, total time: 1.157s
test: 40503/40504 dev: 1, forward time: 0.801, select time: 0.660s, nms time: 0.660s, total time: 1.492s
test: 40504/40504 dev: 1, forward time: 0.877, select time: 1.365s, nms time: 1.365s, total time: 2.265s
*********************************************
*** COCO evaluation metric
*********************************************
Loading files to calculate sizes...
Total boxes: 4065662
Loading files to create giant tensor...
Converting data tensor to table format (to save as a .json file)...
/home/wangty/torch/install/bin/luajit: not enough memory===>.] ETA: 252ms | Step: 0ms
Thank you very much.
I'll take a look into it. It looks like luajit is hitting its memory limit (2 GB) when creating the .json file. If so, then I'll create a workaround for it.
It takes some time to complete the testing script on the coco dataset, so when it's done I'll check whether this is really the issue and post a fix.
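To put the suspected limit in perspective, here is a rough back-of-the-envelope check. It only uses the two figures from this thread (the "Total boxes" count in the log and the 2 GB limit mentioned above); the variable names are just for illustration.

```shell
#!/bin/sh
# Figures taken from this thread: the "Total boxes" line in the log
# and the 2 GB luajit limit mentioned above.
total_boxes=4065662
limit_bytes=$((2 * 1024 * 1024 * 1024))

# Integer division: bytes available per detection if the whole
# result table has to fit under the limit at once.
budget_per_box=$((limit_bytes / total_boxes))
echo "budget per box: ${budget_per_box} bytes"
```

That leaves only a few hundred bytes per detection. Whether a single entry in table form really needs that much is an assumption on my side, but with an id, a category, four coordinates, a score, and the per-table overhead it seems plausible that the conversion step runs out of room.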
OK, testing all 40504 images is quite time-consuming, and that's why I want to make a small coco dataset, such as coco5K, to save hours of testing in the experiment. Thank you very much for your consideration.
I've committed some fixes to solve this issue. This took me a while to fix because the coco dataset has a lot of images and I didn't have enough RAM in my machine to store the processed data, so I had to improvise. Having said this, it should now work without running into luajit's memory limit, which was the problem when saving the results to a .json file.
Moreover, I've added a new flag named frcnn_test_use_cache which stores results on disk, reducing the memory used when testing the dataset. By default the results are kept in RAM (faster), but you can set the flag to 'true' to cache the results on disk along the way (slower).
To get these fixes you'll need to do the following:
git pull for this repo;
update the fastrcnn package by doing a git pull and luarocks make rocks/*;
install dbcollection with pip install dbcollection or conda install -c farrajota dbcollection.
It is very kind of you! I will try it immediately. Thank you very much!
I updated this repo and the fastrcnn package, and then re-installed the fastrcnn and dbcollection packages. When I ran test.lua with the following config:
opt.expID = 'frcnn_vgg16_coco'
opt.dataset = 'coco'
opt.GPU = 2
opt.netType = 'vgg16'
opt.frcnn_test_mode = 'coco'
and with the flag frcnn_test_use_cache set to true, an error occurred. I tested several times, and it seems like there is a type conversion error in the fastrcnn package.
==> (1/5) Load options
==> (2/5) Load dataset data loader
==> (3/5) Load roi proposals data
==> (4/5) Load model: /home/wangty/geekvc/fastrcnn-example-torch/data/exp/coco/frcnn_vgg16_coco/model_final.t7
==> (5/5) Test Fast-RCNN model
Saving temporary files to: /mnt/geekvc/fastrcnn-example-torch/Tester_Eval
/home/wangty/torch/install/bin/luajit: ...e/wangty/torch/install/share/lua/5.1/fastrcnn/Tester.lua:72: bad argument #2 to 'getFilename' (string expected, got table)
stack traceback:
[C]: in function 'getFilename'
...e/wangty/torch/install/share/lua/5.1/fastrcnn/Tester.lua:72: in function 'getImage'
...e/wangty/torch/install/share/lua/5.1/fastrcnn/Tester.lua:114: in function 'testOne'
...e/wangty/torch/install/share/lua/5.1/fastrcnn/Tester.lua:236: in function 'test_use_cache'
...e/wangty/torch/install/share/lua/5.1/fastrcnn/Tester.lua:278: in function 'test'
/home/wangty/torch/install/share/lua/5.1/fastrcnn/test.lua:34: in function 'test'
test.lua:73: in main chunk
[C]: in function 'dofile'
...ngty/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00406620
I tested with anaconda python 3 and python 2.7.
I'm not able to reproduce this issue, although I've pushed a commit to fix a small issue regarding data fetching, so it's best to do a git pull on this repo. Also, I don't recommend setting the GPU id to anything but GPU=1, because it has been an issue in torch (at least for me) for a long time. It's best to select which gpus you want to use for running the script by setting the CUDA_VISIBLE_DEVICES flag like this:
# select the gpus 1 and 2
CUDA_VISIBLE_DEVICES=0,1 th test.lua
or, if you want to select different gpus from your cluster in a particular order:
# select the gpus 3 and 1
CUDA_VISIBLE_DEVICES=2,0 th test.lua
This is the recommended way to select GPUs for your script.
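If you want to check what a script will actually receive, a small sketch like the following can help. It uses a plain sh child process as a stand-in for th test.lua, and the variable name seen is only for illustration.

```shell
#!/bin/sh
# Expose physical GPUs 2 and 0, in that order, to one child process.
# The list is comma-separated with no spaces; a space would end the
# assignment and the shell would try to run the rest as a command.
seen=$(CUDA_VISIBLE_DEVICES=2,0 sh -c 'printf "%s" "$CUDA_VISIBLE_DEVICES"')
echo "child process sees: $seen"

# Inside the child the devices are renumbered from zero, so physical
# GPU 2 becomes CUDA device 0 (GPU 1 in Torch's 1-based numbering).
# The assignment is scoped to that one command; the parent is unchanged.
echo "parent shell sees: ${CUDA_VISIBLE_DEVICES:-<unset>}"
```

Because the selected devices are renumbered inside the process, keeping GPU=1 in the options and changing only CUDA_VISIBLE_DEVICES is enough to move the run to a different card.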
Oh, that's a good idea to use the CUDA_VISIBLE_DEVICES flag to select the GPU device!
When I test with the voc2007 dataset and voc model, this error still occurs. Is there something wrong with my fastrcnn package installation?
$ CUDA_VISIBLE_DEVICES=1 th test.lua
==> (1/5) Load options
==> (2/5) Load dataset data loader
==> (3/5) Load roi proposals data
==> (4/5) Load model: /home/wangty/geekvc/fastrcnn-example-torch/data/exp/pascal_voc_2007/vgg16/model_final.t7
==> (5/5) Test Fast-RCNN model
/home/wangty/torch/install/bin/luajit: ...e/wangty/torch/install/share/lua/5.1/fastrcnn/Tester.lua:72: bad argument #2 to 'getFilename' (string expected, got table)
stack traceback:
[C]: in function 'getFilename'
...e/wangty/torch/install/share/lua/5.1/fastrcnn/Tester.lua:72: in function 'getImage'
...e/wangty/torch/install/share/lua/5.1/fastrcnn/Tester.lua:114: in function 'testOne'
...e/wangty/torch/install/share/lua/5.1/fastrcnn/Tester.lua:208: in function 'test_no_cache'
...e/wangty/torch/install/share/lua/5.1/fastrcnn/Tester.lua:280: in function 'test'
/home/wangty/torch/install/share/lua/5.1/fastrcnn/test.lua:34: in function 'test'
test.lua:58: in main chunk
[C]: in function 'dofile'
...ngty/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00406620
I've built my fastrcnn and dbcollection Lua packages and I did not get any issues. I believe the issue in your case is that the dbcollection installation for Lua is an older version; for the new code to work properly you need to install the newest version. Just to be sure, do the following:
remove fastrcnn and dbcollection from luarocks;
luarocks remove fastrcnn
luarocks remove dbcollection
then reinstall both packages from source;
git clone https://github.com/farrajota/fast-rcnn-torch
git clone https://github.com/dbcollection/dbcollection-torch7
cd fast-rcnn-torch && luarocks make rocks/*
cd ../dbcollection-torch7 && luarocks make
finally, do a git pull in the fastrcnn-example-torch dir to have the latest changes to the code.
After these steps, check whether you still get the same error as before.
Ok, I did as you told me, the voc and coco dataset testing goes well, that's great! Thank you very much!
Hi farrajota, I trained and tested fastrcnn with the voc dataset and everything went well. I trained fastrcnn on the coco dataset with no errors, but when I tested the accuracy with the trained model, an error occurred:
-netType vgg16.