Could you please try running the simple test, sh local_test.sh, without modifying anything (including renaming the checkpoints)?
The script will simply train the same checkpoint you are using for 10 iterations (you could modify that if you like), evaluate the model (which should return an mIOU of around 82.20% if the number of iterations is small), and visualize some results.
Once you can reproduce the results, we can then try renaming the checkpoints and so on.
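For reference, this amounts to something like the following (the checkout path is an assumption; adjust to your environment):

```sh
# Run from the deeplab directory of the tensorflow/models checkout (path assumed)
cd models/research/deeplab
sh local_test.sh   # trains a few iterations, evaluates, and visualizes on PASCAL VOC 2012
```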
Thanks for the quick response, @aquariusjay! I can get the expected outcome by running local_test.sh. It turns out that my PASCAL dataset was somehow corrupted, which is why I got the lower value earlier. After switching to a freshly downloaded dataset, I get the same value by directly running 'eval.py' as well. Thanks for the help.
As a side note, I first needed to make 'download_and_convert_voc2012.sh' executable and replace sh download_and_convert_voc2012.sh with ./download_and_convert_voc2012.sh in 'local_test.sh'; otherwise I ran into: download_and_convert_voc2012.sh: 43: download_and_convert_voc2012.sh: Syntax error: "(" unexpected. Maybe it's only an issue with my environment though.
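In case it helps others, the workaround amounts to something like this (the datasets/ path is an assumption about where the script lives in the repo; the error typically appears because sh is dash on Debian/Ubuntu, which does not understand bash-only syntax):

```sh
# Make the conversion script executable (path within the repo is assumed)
chmod +x ./datasets/download_and_convert_voc2012.sh
# Then, in local_test.sh, replace
#   sh download_and_convert_voc2012.sh
# with
#   ./download_and_convert_voc2012.sh
# Alternatively, calling it with bash avoids the chmod entirely:
#   bash download_and_convert_voc2012.sh
```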
@rogercooper76, re: the side note about 'download_and_convert_voc2012.sh', see #3669
Good job on figuring out the problem and glad to know that the issue is resolved. Closing this issue.
Hello @rogercooper76, when I run eval.py on the Cityscapes dataset it runs on the CPU, but when I run train.py it runs on the GPUs. Can you give me some suggestions to solve this problem?
Hi @wldeephi, multiple GPUs are needed to run both scripts at the same time. For me, I added "CUDA_VISIBLE_DEVICES=$GPUID" to specify which GPU to use. For example, I might run training with "CUDA_VISIBLE_DEVICES=0 python train.py ..." and evaluation with "CUDA_VISIBLE_DEVICES=1 python eval.py ...". Not sure whether it helps.
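A minimal sketch of what I mean (other flags omitted, exactly as in the commands above):

```sh
# Pin training to GPU 0 (run in the background) and evaluation to GPU 1
CUDA_VISIBLE_DEVICES=0 python train.py ... &
CUDA_VISIBLE_DEVICES=1 python eval.py ...
```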
Hi @rogercooper76, after running eval.py I got a result like miou_1.0[1]. What does that mean? Thanks.
I got a result like miou_1.0[0.935625196]; I used Python 3.
@aquariusjay
When I train on MS COCO 2014 with the MobileNetV2 pretrained model, using crop sizes [641, 641] and [657, 657], running eval.py reports the error "tensorflow.python.framework.errors_impl.InvalidArgumentError: assertion failed: [predictions out of bound] [Condition x < y did not hold element-wise:] [x (mean_iou/confusion_matrix/control_dependency_1:0) = ] [0 0 0...] [y (mean_iou/ToInt64_2:0) = ] [81]". I googled for a way to solve this error but could not find one, and many people seem to be confused by it. Can anyone tell me how to fix it?
When evaluating on COCO images, setting eval_crop_size = [641, 641] should resolve the problem. It seems that your model predicts label values larger than expected. For COCO, you need to set num_classes=91 in segmentation_dataset.py.
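A hedged sketch of the two changes (the paths are placeholders, and the exact flag spelling and dataset registration depend on the version of the code you have checked out, so verify against your copy):

```sh
# 1) In segmentation_dataset.py, register the COCO split with num_classes=91,
#    and pass its registered name via --dataset.
# 2) Evaluate with a crop size matching COCO, e.g.:
python eval.py \
  --logtostderr \
  --eval_split="val" \
  --model_variant="mobilenet_v2" \
  --eval_crop_size=641 \
  --eval_crop_size=641 \
  --checkpoint_dir=/path/to/train_logdir \
  --eval_logdir=/path/to/eval_logdir \
  --dataset_dir=/path/to/coco_tfrecord
# Note: newer versions of the code take --eval_crop_size="641,641" instead of
# passing the flag twice.
```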
System information
Describe the problem
I downloaded the pre-trained model 'xception_coco_voc_trainaug' from the model zoo and used it as the "checkpoint_dir" for evaluation. Since there is no 'checkpoint' file included in the tar file, I manually created one with both "model_checkpoint_path" and "all_model_checkpoint_paths" pointing to the downloaded "model.ckpt" (evaluation will not run without a 'checkpoint' file).
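For reference, the 'checkpoint' file I created looks roughly like this (the directory path is a placeholder; the two fields are the standard TensorFlow checkpoint-index format):

```sh
# Write a minimal 'checkpoint' index file next to model.ckpt (path assumed)
cat > /path/to/deeplabv3_pascal_train_aug/checkpoint <<EOF
model_checkpoint_path: "model.ckpt"
all_model_checkpoint_paths: "model.ckpt"
EOF
```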
However, after I ran 'eval.py' with the command in 'local_test.sh', the "miou_1.0" I got was 0.613665, which is way below the expected 82.20%. May I know what I might be doing wrong here? Thanks.
P.S. I originally planned to post this question on StackOverflow; however, there is no 'deeplab' tag available yet, and I do not have enough reputation to create it.
Source code / logs
python "${WORK_DIR}"/eval.py \