When I modify the training classes, there are some errors while training (KeyError: 'cat' if self.cache and self.cache_type == "ram":) #1756

Open STRIVESS opened 7 months ago

STRIVESS commented 7 months ago

A few months ago, I successfully trained on my custom data with 6 classes and wrote a blog post documenting the training process. Recently, I reused the same code, changing only the training classes and updating some of the training photos. Unfortunately, training now fails with the following errors (train1_errors.log).




I then asked ChatGPT for answers and tried to resolve the issue with its suggestion; however, I encountered the same errors. (image)

I added the following code:

  # Reinitialize class_to_ind attribute with updated classes
  self.target_transform.class_to_ind = {
      cls: idx for idx, cls in enumerate(VOC_CLASSES)
  }

after the line self.num_imgs = len(self.ids).

The complete code is below:

class VOCDetection(CacheDataset):
    def __init__(
        self,
        data_dir,
        image_sets=[("2007", "trainval")],
        img_size=(416, 416),
        # The three parameters below are implied by the body; their defaults
        # are assumed to match upstream YOLOX voc.py.
        preproc=None,
        target_transform=AnnotationTransform(),
        dataset_name="VOC0712",
    ):
        self.root = data_dir
        self.image_set = image_sets
        self.img_size = img_size
        self.preproc = preproc
        self.target_transform = target_transform
        self.name = dataset_name
        self._annopath = os.path.join("%s", "Annotations", "%s.xml")
        self._imgpath = os.path.join("%s", "JPEGImages", "%s.jpg")
        self._classes = VOC_CLASSES
        self.cats = [
            {"id": idx, "name": val} for idx, val in enumerate(VOC_CLASSES)
        ]
        self.class_ids = list(range(len(VOC_CLASSES)))
        # Collect (rootpath, image_id) pairs from each image-set list file
        self.ids = list()
        for (year, name) in image_sets:
            self._year = year
            rootpath = os.path.join(self.root, "VOC" + year)
            for line in open(
                os.path.join(rootpath, "ImageSets", "Main", name + ".txt")
            ):
                self.ids.append((rootpath, line.strip()))
        self.num_imgs = len(self.ids)

        # Reinitialize class_to_ind attribute with updated classes
        self.target_transform.class_to_ind = {
            cls: idx for idx, cls in enumerate(VOC_CLASSES)
        }

        self.annotations = self._load_coco_annotations()

        # Image paths relative to the dataset root
        path_filename = [
            (self._imgpath % self.ids[i]).split(self.root + "/")[1]
            for i in range(self.num_imgs)
        ]
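
To illustrate what the reinitialized class_to_ind mapping does, here is a minimal standalone sketch. The four-class VOC_CLASSES tuple and the label_to_index helper are hypothetical and used for illustration only; they are not taken from YOLOX or from the project's real class list. The sketch shows that a label missing from the mapping produces the same kind of KeyError as in the issue title.

```python
# Hypothetical four-class list for illustration only; the project's real
# VOC_CLASSES tuple is not shown in this post.
VOC_CLASSES = ("ball", "person", "dog", "animal faeces")

# Same comprehension as in the snippet above: class name -> integer index.
class_to_ind = {cls: idx for idx, cls in enumerate(VOC_CLASSES)}


def label_to_index(name, mapping):
    """Look up a label, raising a KeyError that lists the known classes."""
    try:
        return mapping[name]
    except KeyError:
        raise KeyError(f"{name!r} not in class list {sorted(mapping)}") from None


print(class_to_ind)
print(label_to_index("dog", class_to_ind))

# A label that appears in an annotation but not in the class list fails.
try:
    label_to_index("cat", class_to_ind)
except KeyError as err:
    print("KeyError:", err)
```

If this is what happens during training, rebuilding the mapping from VOC_CLASSES would not help as long as some annotation still carries a label that VOC_CLASSES does not contain.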

After analyzing the problem, I have the following guesses:

  1. I previously trained in a conda-created PyTorch-GPU virtual environment with the YOLOX training environment installed in it, so I suspect that training loads cached content from the previous Anaconda installation.
  2. If I reconfigure the training classes for the next run, the new configuration may have a different number of classes than the cached content, so training cannot proceed properly.
  3. Previously, I trained with six classes: {'ball': 0, 'person': 1, 'dog': 2, 'animal faeces': 3, 'chair': 4, 'cat': 5}. If I change it to only four classes: {'ball': 0, 'person': 1, 'dog': 2, 'animal faeces': 3}, training runs normally, but when it reaches the 10th epoch there are evaluation-related errors (train2_errors.log).
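
One way to test these guesses without touching the cache is to check whether the annotation files themselves still contain labels that are missing from the current class list. The following is a minimal sketch, assuming the annotations are Pascal VOC style XML files with object/name entries (as the Annotations path in the code suggests). find_unknown_labels is a hypothetical helper, not part of YOLOX.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from pathlib import Path


def find_unknown_labels(annotation_dir, classes):
    """Map each label missing from `classes` to the XML files that use it."""
    known = set(classes)
    unknown = defaultdict(list)
    for xml_path in sorted(Path(annotation_dir).glob("*.xml")):
        root = ET.parse(xml_path).getroot()
        # Each <object> element is assumed to hold one <name> child.
        for obj in root.iter("object"):
            name = (obj.findtext("name") or "").strip()
            if name not in known and xml_path.name not in unknown[name]:
                unknown[name].append(xml_path.name)
    return dict(unknown)
```

Calling find_unknown_labels("<data_dir>/VOC2007/Annotations", VOC_CLASSES) with the real dataset path would return an empty dict if every label is covered. A non-empty result such as a 'cat' entry would point to the annotations, rather than a stale cache, as the source of the KeyError.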


I would like to request assistance to determine the cause of the problem and find a solution. Thank you for your help!

Currently, my plan is to create a new PyTorch-GPU virtual environment using conda and configure all the necessary dependencies. Then, I will proceed to train the new set of photos.

My virtual environment is shown below: (image) (image)

