YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO supported. Documentation: https://yolox.readthedocs.io/
Apache License 2.0
When I modify the training classes, there are some errors while training (KeyError: 'cat' if self.cache and self.cache_type == "ram":) #1756
A few months ago, I successfully trained my custom data with 6 classes and wrote a blog documenting the training process. Recently, I used the same code to modify the training classes and update some training photos. Unfortunately, I encountered the following errors.
(train1_errors.log)
I then used ChatGPT to look for a fix and tried its suggestion, but the same errors persisted.
I added the following code to VOCDetection.__init__ in yolox/data/datasets/voc.py:
    # Reinitialize class_to_ind attribute with updated classes
    self.target_transform.class_to_ind = {
        cls: idx for idx, cls in enumerate(VOC_CLASSES)
    }
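For readers unfamiliar with the mapping, here is a minimal standalone sketch of what that dict comprehension produces. The tuple below is an assumption that mirrors the six classes listed later in this post; in YOLOX the real tuple is whatever VOC_CLASSES contains at import time.

```python
# Six-class tuple assumed for illustration; it stands in for the real VOC_CLASSES.
VOC_CLASSES = ("ball", "person", "dog", "animal faeces", "chair", "cat")

# Same comprehension as in the snippet: each class name maps to its position in the tuple.
class_to_ind = {cls: idx for idx, cls in enumerate(VOC_CLASSES)}

print(class_to_ind)
```

The index of each class follows the order of the tuple, so removing or reordering entries changes the ids assigned to the remaining classes.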
The complete modified class is below:
    class VOCDetection(CacheDataset):
        def __init__(
            self,
            data_dir,
            image_sets=[("2007", "trainval")],
            img_size=(416, 416),
            preproc=None,
            target_transform=AnnotationTransform(),
            dataset_name="VOC0712",
            cache=False,
            cache_type="ram",
        ):
            self.root = data_dir
            self.image_set = image_sets
            self.img_size = img_size
            self.preproc = preproc
            self.target_transform = target_transform
            self.name = dataset_name
            self._annopath = os.path.join("%s", "Annotations", "%s.xml")
            self._imgpath = os.path.join("%s", "JPEGImages", "%s.jpg")
            self._classes = VOC_CLASSES
            self.cats = [
                {"id": idx, "name": val} for idx, val in enumerate(VOC_CLASSES)
            ]
            self.class_ids = list(range(len(VOC_CLASSES)))
            self.ids = list()
            for (year, name) in image_sets:
                self._year = year
                rootpath = os.path.join(self.root, "VOC" + year)
                for line in open(
                    os.path.join(rootpath, "ImageSets", "Main", name + ".txt")
                ):
                    self.ids.append((rootpath, line.strip()))
            self.num_imgs = len(self.ids)
            # Reinitialize class_to_ind attribute with updated classes
            self.target_transform.class_to_ind = {
                cls: idx for idx, cls in enumerate(VOC_CLASSES)
            }
            self.annotations = self._load_coco_annotations()
            path_filename = [
                (self._imgpath % self.ids[i]).split(self.root + "/")[1]
                for i in range(self.num_imgs)
            ]
            super().__init__(
                input_dimension=img_size,
                num_imgs=self.num_imgs,
                data_dir=self.root,
                cache_dir_name=f"cache_{self.name}",
                path_filename=path_filename,
                cache=cache,
                cache_type=cache_type
            )
After analyzing the problem, my guess is as follows:
I previously trained in a conda-created PyTorch-GPU virtual environment with the YOLOX training environment installed, so I suspect that training now loads cached content left over from the previous Anaconda installation.
If I change the training classes for the next run, the new configuration may have a different number of classes than the cached content, which would prevent training from running properly.
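One way to test this guess is to look for leftover cache directories next to the dataset. The sketch below assumes that any cache is written as a sub-directory of data_dir whose name starts with "cache_", which is what the data_dir and cache_dir_name arguments in the class above suggest; I have not confirmed where YOLOX actually writes it. The function name find_cache_dirs and the example path are hypothetical.

```python
import os


def find_cache_dirs(data_dir, prefix="cache_"):
    """Return sorted paths of sub-directories of data_dir whose names start with prefix."""
    if not os.path.isdir(data_dir):
        return []
    return sorted(
        os.path.join(data_dir, name)
        for name in os.listdir(data_dir)
        if name.startswith(prefix) and os.path.isdir(os.path.join(data_dir, name))
    )


if __name__ == "__main__":
    # Hypothetical dataset root; replace with the data_dir passed to VOCDetection.
    for path in find_cache_dirs("datasets/VOCdevkit"):
        print("possible stale cache:", path)
```

If nothing is listed, a stale on-disk cache of this kind is probably not the cause, and the class list and annotation files are the next thing to check.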
Previously, I trained with six classes: {'ball': 0, 'person': 1, 'dog': 2, 'animal faeces': 3, 'chair': 4, 'cat': 5}. If I reduce this to four classes, {'ball': 0, 'person': 1, 'dog': 2, 'animal faeces': 3}, training starts normally, but evaluation-related errors appear when it reaches the 10th epoch (train2_errors.log).
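A KeyError such as the 'cat' in the title is what a plain dict lookup raises for a missing key, so it may be worth checking whether any annotation file still contains a label that is no longer in the class list. The following is a minimal sketch of such a check, assuming VOC-style XML with object/name elements. The helper count_unknown_labels, the CLASSES tuple, and the sample XML are hypothetical and not part of YOLOX.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical class list, copied from the four-class configuration described above.
CLASSES = ("ball", "person", "dog", "animal faeces")


def count_unknown_labels(xml_text, classes=CLASSES):
    """Count object labels in one VOC-style XML string that are not in classes."""
    root = ET.fromstring(xml_text)
    unknown = Counter()
    for obj in root.iter("object"):
        name = obj.findtext("name", default="").strip()
        if name not in classes:
            unknown[name] += 1
    return unknown


# Made-up annotation with one known label and one label missing from CLASSES.
SAMPLE = """<annotation>
  <object><name>dog</name></object>
  <object><name>cat</name></object>
</annotation>"""

print(count_unknown_labels(SAMPLE))
```

Running the same function over the text of every file in the Annotations folder would show which labels, if any, fall outside the configured classes.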
I would like to request assistance to determine the cause of the problem and find a solution. Thank you for your help!
Currently, my plan is to create a new PyTorch-GPU virtual environment using conda and configure all the necessary dependencies. Then, I will proceed to train the new set of photos.
For reference, the added code was inserted after this line in voc.py: https://github.com/Megvii-BaseDetection/YOLOX/blob/ac58e0a5e68e57454b7b9ac822aced493b553c53/yolox/data/datasets/voc.py#L133
My virtual environment is below: