xvbw opened this issue 8 years ago
Hi @xvbw,
It seems your segmentation ground truth fails to load: the check img_height == seg_height (315 vs. 0) means the label image has height 0, i.e. it was never read. If by testing you mean just segmenting images with your trained model, modify the layer where you load the images/ground truth. If you want to test the accuracy of your trained model, check why the segmentation ground truths don't load correctly.
Cheers, Martin
I was having the same issue with the same network. The problem is that it is looking for the ground truth data and cannot find it, but since this is the test set there is none (compare the list files for test versus val). It is set up like this because the output of this network is the accuracy (which requires ground truth), not the written .mat file.
To change this, you need to uncomment layer_type: NONE and comment out layer_type: PIXEL (so it stops looking for ground truth). Then comment out the accuracy layer (lines 21825-21835) and uncomment the fc1_mat layer (lines 21809-21823); a sketch of these edits follows.
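A minimal shell sketch of those edits, assuming the line numbers quoted above still match your copy of test.prototxt (back the file up and verify the ranges first):

    cp test.prototxt test.prototxt.bak                          # keep a backup
    # Stop the data layer from looking for ground truth:
    sed -i 's/^\( *\)layer_type: PIXEL/\1# layer_type: PIXEL/' test.prototxt
    sed -i 's/^\( *\)# *layer_type: NONE/\1layer_type: NONE/' test.prototxt
    # Enable the fc1_mat layer and disable the accuracy layer:
    sed -i '21809,21823s/^\( *\)# */\1/' test.prototxt
    sed -i '21825,21835s/^/# /' test.prototxt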
You might also need to create folders for the output, since the output is called fc1 and not fc8.
Hope this helps, Chris
Yeah, exactly. I solved this by changing layer_type: PIXEL to layer_type: NONE, as you mentioned.
Thanks for the reply.
Did you guys run into this error when running deeplabv2-resnet101: Check failed: matfp Error creating MAT file voc12/features/deeplabv2_resnet101/test/fc1/2008_000006_blob_0.mat?
Make sure the folder voc12/features/deeplabv2_resnet101/test/fc1/ exists (it isn't created by default unless you modify the run script).
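For example (the path is the one from the error above; adjust it if your experiment or network name differs):

    # Create the feature output folder the net writes its .mat files into.
    mkdir -p voc12/features/deeplabv2_resnet101/test/fc1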
Hi, @xvbw ,
Did you run the training with cuDNN support? If so, which cuDNN version did you use? I have successfully compiled Caffe with cuDNN, but when I try to run the training I encounter the cudaSuccess check error.
Thanks.
I figured out this problem. I compiled Caffe with CMake. The CMakeLists.txt originally has:

    set(CUDA_NVCC_FLAGS ${CUDA_NVCC_FLAGS}
        -gencode arch=compute_20,code=sm_20
        -gencode arch=compute_20,code=sm_21
        -gencode arch=compute_30,code=sm_30
        -gencode arch=compute_35,code=sm_35
    )

I added the following lines:

        -gencode arch=compute_50,code=sm_50
        -gencode arch=compute_50,code=compute_50

And it worked for me.
@ksnzh I used cuDNN v4. You may have a GPU that is newer than compute_35; that's the reason it fails to run training with cuDNN.
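If you're unsure what your GPU supports, the deviceQuery sample that ships with CUDA prints the compute capability (the samples path below is an assumption; it varies by CUDA install):

    # Build and run CUDA's deviceQuery sample, then look for
    # "CUDA Capability Major/Minor version number" in its output.
    cd /usr/local/cuda/samples/1_Utilities/deviceQuery
    make
    ./deviceQuery | grep "CUDA Capability"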
Glad you fixed it.
@xvbw Yes, I use a Titan X and I changed the flag to compute_50.
By the way, it trained successfully on the original VOC2012 dataset, but on the augmented VOC dataset some files cannot be found. Why? Should I manually fix train_aug.txt?
@ksnzh Can you name the missing files so I can check whether they are in my augmented VOC dataset?
I have not trained on the augmented VOC dataset, so I may not be able to help, but I don't think you need to manually fix the train_aug.txt file.
Also, make sure you have properly downloaded the augmented VOC beforehand.
@xvbw
I1220 16:00:51.012964  8727 caffe.cpp:118] Finetuning from /media/ksnzh/DATA/deeplab/train-DeepLab/exper/voc12/model/DeepLab-LargeFOV/train_iter_6000.caffemodel
E1220 16:00:51.019429  8755 io.cpp:76] Could not open or find file /media/ksnzh/DATA/deeplab/train-DeepLab/exper/voc12/data/images_aug/2007_006560.jpg
I1220 16:00:51.019639  8755 image_seg_data_layer.cpp:180] Fail to load img: /media/ksnzh/DATA/deeplab/train-DeepLab/exper/voc12/data/images_aug/2007_006560.jpg
E1220 16:00:51.019670  8755 io.cpp:76] Could not open or find file /media/ksnzh/DATA/deeplab/train-DeepLab/exper/voc12/data/labels_aug/2007_006560.png
I1220 16:00:51.019682  8755 image_seg_data_layer.cpp:186] Fail to load seg: /media/ksnzh/DATA/deeplab/train-DeepLab/exper/voc12/data/labels_aug/2007_006560.png
F1220 16:00:51.019707  8755 data_transformer.cpp:331] Check failed: img_channels == data_channels (1 vs. 3)
Because the train set is shuffled, the missing file is different each time. It seems 2007_006560 is in my trainval_aug.txt, but it does not exist in my images_aug folder.
2007_006560 is from the original VOC dataset, not the augmented VOC dataset.
You need to change the path of each original file, or you can just merge the original and augmented VOC datasets into the same folder.
I recommend merging the folders since it's easier.
@xvbw So the augmented VOC is the union of the downloaded benchmark and the original VOC2012?
That error occurs because the path of 2007_006560 is wrong. train_val.txt includes file lists from both the original and the augmented sets, but the original and augmented VOC datasets sit in separate folders. So you need to either modify the paths in train_val.txt or just merge the two folders (original and augmented) into one. What I did is merge the folders because it's easy and simple; a sketch is below.
Just make sure every path in train_val.txt is correct; that is the main point.
Hope this helps.
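A rough sketch of that merge, using the folder names from the log above; the source paths are placeholders for wherever your original VOC2012 images and label PNGs live:

    # Copy the original VOC files into the augmented folders without
    # overwriting anything that is already there (-n = no clobber).
    cp -n /path/to/VOC2012/images/*.jpg exper/voc12/data/images_aug/
    cp -n /path/to/VOC2012/labels/*.png exper/voc12/data/labels_aug/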
Can you please mention the command used for testing?
Hi, can you please tell me what the difference is between fc_8 features and crf features? Among the different trained models, some have fc_8 in their test.prototxt and some have crf.
I am using DeepLab to generate CRF features for my test images, which I can then use for my own CRF. I used the ResNet-101 trained model with 1 image, and it crashed with the following output:
I0524 07:20:19.491786  3885 net.cpp:816] Ignoring source layer label_shrink16_label_shrink16_0_split
I0524 07:20:19.491788  3885 net.cpp:816] Ignoring source layer loss_res05
I0524 07:20:19.491793  3885 net.cpp:816] Ignoring source layer accuracyres05
I0524 07:20:19.500727  3885 caffe.cpp:252] Running for 1 iterations.
F0524 07:20:19.748760  3885 blob.cpp:163] Check failed: data
Check failure stack trace:
    @     0x7f3d8340b5cd  google::LogMessage::Fail()
    @     0x7f3d8340d433  google::LogMessage::SendToLog()
    @     0x7f3d8340b15b  google::LogMessage::Flush()
    @     0x7f3d8340de1e  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f3d8398a15b  caffe::Blob<>::mutable_cpu_data()
    @     0x7f3d83800727  caffe::BatchNormLayer<>::Forward_cpu()
    @     0x7f3d839348a3  caffe::Net<>::ForwardFromTo()
    @     0x7f3d83934b17  caffe::Net<>::ForwardPrefilled()
    @           0x4088c1  test()
    @           0x407010  main
    @     0x7f3d8269a830  __libc_start_main
    @           0x4076c9  _start
    @              (nil)  (unknown)
Aborted (core dumped)
Dinesh
I tested ResNet-101 successfully, but my mIoU is not the same as the author's. My result before DenseCRF on the validation set was 0.7646, while the paper reports 0.7635. Strangely, the mIoU I got from VGG-16 was consistent with the paper. Does anyone have the same problem?
Hi, thanks for your clear explanation; I could run training successfully. However, I can't get the testing phase to work on my own. The error always occurs and I can't proceed any further. The error is shown below.
The model used here is DeepLab v2 ResNet-101 (http://liangchiehchen.com/projects/DeepLabv2_resnet.html). The other models work perfectly, but I always get this error for this one model, even though the settings are all kept the same.
Do you have any ideas about this? That would be really helpful. Thanks.