The OCR approach is rephrased as Segmentation Transformer: https://arxiv.org/abs/1909.11065. This is an official implementation of semantic segmentation for HRNet. https://arxiv.org/abs/1908.07919
Hello, could you please add more details regarding the COCO-Stuff dataset in the data/cocostuff directory?
I assume that you use COCO-Stuff-10k v1.1, whose download contains only .mat files for the annotations. I obtained .png labels by running the conversion script found in the mmsegmentation repository, which converts each <original image name>.mat file to <original image name>_labelTrainIds.png. This doesn't match the naming convention in the val.lst file. Even after accounting for the different naming convention, the mIoU and the pixel accuracy are far too low (< 1%), which clearly indicates that something is wrong with either the dataset or the model.
Based on that, could you please elaborate further on the dataset creation process you went through?
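To make the naming mismatch concrete, here is a minimal sketch of the check I would run before evaluation. It assumes that each line of val.lst has the form `<image path> <label path>`, which I have not confirmed. The helper names `strip_suffix` and `missing_labels` are my own and are not part of this repository or of mmsegmentation.

```python
from pathlib import Path
from typing import List

# Suffix produced by the conversion described above.
SUFFIX = "_labelTrainIds.png"


def strip_suffix(name: str) -> str:
    """Map '<stem>_labelTrainIds.png' to '<stem>.png'; leave other names unchanged."""
    if name.endswith(SUFFIX):
        return name[: -len(SUFFIX)] + ".png"
    return name


def missing_labels(list_file: Path, root: Path) -> List[str]:
    """Return the label paths named in list_file that do not exist under root.

    Assumption: every non-empty line is '<image path> <label path>'.
    """
    missing = []
    for line in list_file.read_text().splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        if not (root / parts[1]).is_file():
            missing.append(parts[1])
    return missing
```

An empty result from `missing_labels` only shows that the file names line up. It says nothing about whether the label IDs stored in the .png files match the ones the model was trained on, which could equally explain an mIoU below 1%.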