cocodataset / cocoapi

COCO API - Dataset @ http://cocodataset.org/

From Coco annotation json to semantic segmentation image like VOC's .png in pytorch #403

Open 7029279 opened 4 years ago

7029279 commented 4 years ago

I am trying to use COCO 2014 data for semantic segmentation training in PyTorch. I have a PSPNet model with a cross-entropy loss function that worked perfectly on the PASCAL VOC 2012 dataset. Now I am trying to use a portion of the COCO images for the same process, but COCO ships JSON annotations instead of the .png label images, so I somehow have to convert one to the other. I have noticed that there is annToMask in pycocotools, but I cannot quite figure out how to use that function in my case.

Right now I have a dataloader that looks roughly like this:

```python
def __len__(self):
    return len(self.id_list)

def __getitem__(self, index):
    raw_img, anno_class_img = self.pull_item(index)
    return raw_img, anno_class_img

def pull_item(self, index):
    coco = self.coco
    img = coco.loadImgs(self.id_list[index])[0]
    image_file_path = "./{}2014-2/{}".format(self.phase, img["file_name"])
    raw_img = Image.open(image_file_path)
    raw_img = raw_img.convert('RGB')

    cat_ids = coco.getCatIds()
    anns_ids = coco.getAnnIds(imgIds=img['id'], catIds=cat_ids, iscrowd=None)
    anns = coco.loadAnns(anns_ids)

    mask = coco.annToMask(anns[0])
    for i in range(len(anns)):
        mask += coco.annToMask(anns[i])
    anns_img = Image.fromarray(mask)

    raw_img = self.transform(raw_img)
    anns_img = self.transform(anns_img)
    return raw_img, anns_img
```
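For reference, summing the binary masks as above double-counts `anns[0]` and lets overlapping instances add up to meaningless label values. One common way to get a VOC-style class-index mask instead is to start from a zero (background) mask and write each annotation's category id with `np.maximum`. Below is a minimal sketch of just that combining step, with the per-instance binary masks assumed to have already been produced by `coco.annToMask`; `build_semantic_mask` is a hypothetical helper name, not part of pycocotools:

```python
import numpy as np

def build_semantic_mask(binary_masks, category_ids, height, width):
    """Combine per-instance binary masks (like those from coco.annToMask)
    into a single VOC-style class-index mask.

    Starts from an all-zero background mask and writes each instance's
    category id, taking the elementwise maximum so overlapping instances
    never sum to an out-of-range label.
    """
    mask = np.zeros((height, width), dtype=np.uint8)
    for bin_mask, cat_id in zip(binary_masks, category_ids):
        mask = np.maximum(mask, bin_mask.astype(np.uint8) * cat_id)
    return mask

# Toy example: two overlapping 4x4 instances with category ids 3 and 7.
a = np.zeros((4, 4), dtype=np.uint8); a[0:2, 0:2] = 1
b = np.zeros((4, 4), dtype=np.uint8); b[1:3, 1:3] = 1
sem = build_semantic_mask([a, b], [3, 7], 4, 4)
```

Note also that a mask built this way holds integer class indices, so it should not go through the same normalizing transform as the RGB image; if it needs resizing, nearest-neighbor interpolation keeps the labels intact.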

and in my main training function I calculate the loss like this:

```python
for images, labels in dataloaders_dict[phase]:
    if images.size()[0] == 1:
        continue
    images = images.to(device)
    labels = torch.squeeze(labels)
    labels = labels.to(device)
    if (phase == 'train') and (count == 0):
        optimizer.step()
        optimizer.zero_grad()

    with torch.set_grad_enabled(phase == 'train'):
        outputs = net(images)
        loss = criterion(outputs, labels.long())
```
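For comparison, a conventional PyTorch training step orders the calls differently from the snippet above: gradients are cleared at the start of each iteration, `loss.backward()` runs before `optimizer.step()`, and the step is taken every iteration rather than only when a counter is zero. Below is a self-contained sketch with a tiny stand-in model and fake data (all the names here are placeholders, not the questioner's actual setup):

```python
import torch
import torch.nn as nn

# Tiny stand-in model and data so the loop structure runs end to end.
net = nn.Conv2d(3, 5, kernel_size=1)          # 5 "classes"
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(net.parameters(), lr=0.1)

images = torch.randn(2, 3, 8, 8)              # fake RGB batch (N, 3, H, W)
labels = torch.randint(0, 5, (2, 8, 8))       # per-pixel class indices (N, H, W)

net.train()
for _ in range(3):
    optimizer.zero_grad()                      # clear gradients first
    outputs = net(images)                      # (N, C, H, W) logits
    loss = criterion(outputs, labels.long())   # CE expects (N, H, W) index targets
    loss.backward()                            # compute gradients
    optimizer.step()                           # then update the weights
```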

When I start feeding my data, the loss on the first iteration is about 28%, the next iteration is around 2%, and after that it drops to almost zero; the model basically keeps scoring 100% accuracy before even finishing one epoch. Please let me know how to fix this.

engrjav commented 2 years ago

@7029279 how did you manage to convert?