Reproduce HRNet-OCR benchmark

xyaoab commented 3 years ago

May I know how to reproduce the results of the HRNet-OCR benchmark? With the model you uploaded (lr=1e-2), the per-class IoU on the validation set is pretty low ([0. 0.39814434 0.42628074 0. 0. 0.94645624 0. 0. 0. 0. 0. 0. 0. 0.09216522 0. 0. 0. 0.45838588 0.]). Any suggestions?

maskjp commented 3 years ago

Hi, @xyaoab,

Thank you for your interest in RELLIS-3D! If you used the provided script, the saved prediction IDs run from 0 to num_classes. When you compute the IoU, you may need to use this function to convert the labels into the original IDs based on this mapping.
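As a rough illustration of the conversion described above, here is a minimal sketch that remaps contiguous prediction IDs to original label IDs before computing per-class IoU. The mapping `TRAIN_TO_ORIG` and the helper names `to_original_ids` and `per_class_iou` are hypothetical placeholders, not the repository's actual function or mapping; substitute the real table from the dataset configuration.

```python
import numpy as np

# Hypothetical mapping from contiguous training IDs to original dataset IDs.
# These values are placeholders; use the mapping shipped with the dataset.
TRAIN_TO_ORIG = {0: 0, 1: 3, 2: 4, 3: 17}


def to_original_ids(pred, mapping):
    """Remap an array of training IDs to original IDs with a lookup table."""
    lut = np.zeros(max(mapping) + 1, dtype=np.int64)
    for train_id, orig_id in mapping.items():
        lut[train_id] = orig_id
    return lut[pred]


def per_class_iou(pred, label, class_ids):
    """Intersection over union per class; NaN when a class is absent from both."""
    ious = []
    for c in class_ids:
        p = pred == c
        g = label == c
        union = np.logical_or(p, g).sum()
        inter = np.logical_and(p, g).sum()
        ious.append(inter / union if union else float("nan"))
    return np.array(ious)


# Toy 2x2 example: predictions in training IDs, ground truth in original IDs.
pred_train = np.array([[0, 1], [2, 3]])
label_orig = np.array([[0, 3], [4, 4]])
class_ids = sorted(TRAIN_TO_ORIG.values())

pred_orig = to_original_ids(pred_train, TRAIN_TO_ORIG)
iou_converted = per_class_iou(pred_orig, label_orig, class_ids)
iou_unconverted = per_class_iou(pred_train, label_orig, class_ids)
```

In this toy case, comparing the unconverted predictions against the original-ID labels drives most class IoUs to zero even where the prediction is right, which resembles the symptom reported in the question.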

Hope this helps!

Best wishes!

Peng Jiang

jdonovanCS commented 3 years ago

I am also having this trouble, but I will try the conversion you mentioned above and report back. Additionally, I was trying to run the Evaluate_img python notebook, but the folder structure of the dataset after I downloaded it does not match up. Maybe I downloaded the wrong zip file. I also didn't know if there was a way to output the predictions to load into that python notebook. I'm going to continue looking. My thought is maybe I just skipped some steps in my haste.

maskjp commented 3 years ago

Hi, @jdonovanCS,

This is an old issue. We have since changed the code, and it can now evaluate the labels without conversion.

On the other hand, what folder structure did you get? Can you give me more details?

Best wishes!

jdonovanCS commented 3 years ago

I think I actually figured out some of what was going on. After I downloaded the Full Images and the Full Images ID Format separately, I just needed to copy the Rellis-3D folder from one into the other, and that made it correct. The issue I'm facing now is two-fold: 1) I keep running out of CUDA memory even though I have an NVIDIA Mobile 1070, which has at least as much GPU RAM as the 1050 Ti that was used by the project (I believe), and 2) I can't figure out how to produce actual prediction images to point the Evaluate_img Python notebook at.

maskjp commented 3 years ago

Hi, @jdonovanCS,

Try reducing the batch size and the input image size in the configuration file. As I showed in the previous answer, the input size was reduced to 512x512 in order to run inference on a 1050 Ti. Note that the results in the paper come from evaluation with full-size inputs, so you may see some drop in performance after you reduce the input size.
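To get a feel for how much a smaller input and batch size might help, here is a back-of-the-envelope sketch. It assumes activation memory scales roughly linearly with batch size times pixel count, and it assumes a full-size frame of 1920x1200; both are assumptions for illustration, and the helper `relative_activation_memory` is hypothetical. Model weights and framework overhead are not included.

```python
def relative_activation_memory(full_hw, reduced_hw, full_batch=1, reduced_batch=1):
    """Ratio of reduced to full activation footprint.

    Assumes activation memory is proportional to batch * height * width,
    which is only a rough approximation for a convolutional network.
    """
    full_h, full_w = full_hw
    red_h, red_w = reduced_hw
    return (reduced_batch * red_h * red_w) / (full_batch * full_h * full_w)


# Assumed full-size frame (height, width) versus the 512x512 input mentioned above.
ratio = relative_activation_memory((1200, 1920), (512, 512))
print(f"512x512 needs roughly {ratio:.0%} of the full-size activation memory")
```

Under these assumptions a 512x512 input needs only a small fraction of the activation memory of a full-size frame, which is consistent with it fitting on a smaller GPU.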

The models were trained on a more powerful workstation with two GPUs. We tested inference on a 1050 Ti because we wanted to see whether the model could be used on a real robot with a small GPU.

jdonovanCS commented 3 years ago

Oh ok. Cool. I misunderstood. Thank you!

unmannedlab / RELLIS-3D

Reproduce HRNet-OCR benchmark #4