YuHengsss / YOLOV

This repo is a PyTorch implementation of the YOLOV series.
Apache License 2.0

Training on a custom dataset #83

Open Linengyao opened 3 weeks ago

Linengyao commented 3 weeks ago

If I want to use your project to train my own model, I'm confused about the dataset-creation part: the readme.md mentions coco.json and val_seq.npy, but the code also uses XML files. Could you give me some guidance? How should I structure my dataset in order to train my own model?

YuHengsss commented 3 weeks ago

Hello, and thank you for your interest in our work! For customizing the dataset, you might find the discussion in this previous issue helpful: https://github.com/YuHengsss/YOLOV/issues/57. In our codebase, the XML file is utilized solely for converting data annotations, while the JSON file is used to train the base detector. If you are looking to adapt this for your own dataset, please refer to the mentioned issue for guidance.
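
For reference, converting one VOC/ImageNet-style XML annotation into COCO-style dicts can be sketched roughly as below. This is a hedged illustration, not the repo's converter; the function name and the `category_map` argument are hypothetical, and you should adapt the field names to your own annotations.

```python
# Minimal sketch: one VOC/ImageNet-style XML annotation -> COCO-style dicts.
import xml.etree.ElementTree as ET

def xml_to_coco(xml_text, image_id, category_map):
    """Parse one annotation XML and return (image_dict, annotation_dicts)."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    image = {
        "id": image_id,
        "file_name": root.findtext("filename"),
        "width": int(size.findtext("width")),
        "height": int(size.findtext("height")),
    }
    anns = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        x1, y1 = int(box.findtext("xmin")), int(box.findtext("ymin"))
        x2, y2 = int(box.findtext("xmax")), int(box.findtext("ymax"))
        anns.append({
            "image_id": image_id,
            "category_id": category_map[obj.findtext("name")],
            "bbox": [x1, y1, x2 - x1, y2 - y1],  # COCO uses [x, y, w, h]
            "area": (x2 - x1) * (y2 - y1),
            "iscrowd": 0,
        })
    return image, anns
```

Collect all the `image` and `annotation` dicts across files, assign running ids, and dump them with a `categories` list into one JSON file to get a COCO-style annotation.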

Linengyao commented 3 weeks ago

I've read the suggestions in https://github.com/YuHengsss/YOLOV/issues/57. As a beginner, I can understand and handle converting my dataset to COCO or YOLO format, but in that issue you suggested converting the dataset to OVIS format and then producing COCO annotations through your convert_ovis_coco pipeline. Can I simply organize my dataset in the COCO format myself (with my own code), or must I go through the function you provide? My task is to train this model for object detection on videos. Could you clarify?

YuHengsss commented 3 weeks ago

The purpose of the dataloader is to organize the dataset and send sequences of frames from a video to the model. This differs from the typical image detection pipeline, because all images in a batch should come from the same video.

You can download the OVIS dataset annotations and organize your dataset in a similar format, then load it using the function provided in this repository. Alternatively, you may need to rewrite the dataloader pipeline to meet the specific requirements mentioned above.
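
To illustrate that constraint, a sampler that only ever yields windows of frames drawn from a single video might look like the sketch below. This is a minimal illustration, not the repo's loader (the actual pipeline lives in yolox/data/datasets/vid.py); the `videos` mapping and window layout are assumptions.

```python
# Minimal sketch: batch frames so every batch comes from one video.
import random

def video_batches(videos, seq_len, shuffle=True):
    """videos: {video_name: [frame_path, ...]} -> list of frame-path batches."""
    batches = []
    for frames in videos.values():
        # Split each video into non-overlapping windows of seq_len frames.
        for i in range(0, len(frames) - seq_len + 1, seq_len):
            batches.append(frames[i:i + seq_len])
    if shuffle:
        random.shuffle(batches)  # shuffle across videos, never within a window
    return batches
```

Shuffling whole windows keeps training randomized across videos while still guaranteeing that each batch is a temporally consistent sequence.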

Linengyao commented 3 weeks ago

My dataset is composed of multiple video clips (taken from different videos), so I believe it should be possible to train this model on it, right? In the readme.md you wrote: Download ILSVRC2015 DET and ILSVRC2015 VID dataset from IMAGENET and organise them as follows:

path to your datasets/ILSVRC2015/
path to your datasets/ILSVRC/

Download our COCO-style annotations for training, FGFA version training annotation and video sequences. Then, put them in these two directories:

YOLOV/annotations/vid_train_coco.json
YOLOV/annotations/ILSVRC_FGFA_COCO.json
YOLOV/yolox/data/dataset/train_seq.npy

I know the file structures of ILSVRC2015 DET and ILSVRC2015 VID are different. Since my task is object detection on video sequences, does this mean I can reorganize my dataset to follow the ILSVRC2015 VID structure, generate the vid_train_coco.json and train_seq.npy files, and then train? Or would you still recommend reorganizing my dataset in the OVIS format and loading it with the function provided in this repository?
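
On generating a `*_seq.npy`-style file: one plausible approach is to walk a VID-style directory tree (one folder per video, frames inside) and save the resulting frame sequences with numpy. The exact layout expected by `train_seq.npy` is an assumption here; inspect the file shipped with the repo via `np.load(..., allow_pickle=True)` and match its contents.

```python
# Hedged sketch: build per-video frame sequences from a VID-style tree
# (video_dir/frame.JPEG) and save them as an .npy of object arrays.
import os
import numpy as np

def build_sequences(vid_root, seq_len=32):
    seqs = []
    for video in sorted(os.listdir(vid_root)):
        frames = sorted(os.listdir(os.path.join(vid_root, video)))
        rel = [os.path.join(video, f) for f in frames]
        # Chunk each video into sequences of up to seq_len frames.
        for i in range(0, len(rel), seq_len):
            seqs.append(rel[i:i + seq_len])
    return seqs

# Example (path is hypothetical):
# np.save("train_seq.npy",
#         np.array(build_sequences("path/to/ILSVRC2015/Data/VID/train"),
#                  dtype=object))
```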

YuHengsss commented 3 weeks ago

Both of them work!

Linengyao commented 3 weeks ago

Okay, I'll give it a try.

Linengyao commented 3 weeks ago

I have organized the structure and am now working out how to train. After reading the related issues, is my understanding correct: restructure my dataset in the ILSVRC2015 VID format, then generate a train.json file and a train.npy file; first train a YOLOX model with train.json to obtain a pretrained base detector, then plug that YOLOX model into YOLOV as its pretrained model, and YOLOV will train using train.npy. For this, do I need to first download the YOLOX project code, train it to get a pretrained model, and then use that model as the pretrained model in your project, following the YOLOV training procedure?

YuHengsss commented 3 weeks ago

Yep. As a reminder, this repo contains the code to train the base detector.
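
In other words, the base detector can be trained inside this repo, with no separate YOLOX download needed. Reusing its weights afterwards is the usual partial `state_dict` load in PyTorch, sketched generically below; this is not the repo's exact loading code, and the nested `"model"` checkpoint key is an assumption you should check against your own checkpoint file.

```python
# Generic sketch: initialize a video model from a base-detector checkpoint,
# copying only the tensors whose names and shapes match and leaving any new
# (e.g. video-aggregation) layers randomly initialized.
import torch

def load_pretrained(model, ckpt_path):
    ckpt = torch.load(ckpt_path, map_location="cpu")
    state = ckpt.get("model", ckpt)  # some checkpoints nest weights under "model"
    own = model.state_dict()
    matched = {k: v for k, v in state.items()
               if k in own and own[k].shape == v.shape}
    own.update(matched)
    model.load_state_dict(own)
    return len(matched), len(own)  # how many tensors were transferred
```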

Linengyao commented 3 weeks ago

If I want to migrate your structure onto a YOLOv8 project, do I need to do the following?

1. Port the main YOLOV structure (the post-processing): YOLOV\yolox\models\post_process.py and YOLOV\yolox\models\post_trans.py
2. Port the data-loading part: YOLOV\yolox\data\datasets\vid.py and YOLOV\yolox\data\datasets\vid_classes.py

Then, following your original workflow: first train YOLOv8 to obtain a pretrained model, restructure my dataset in the ILSVRC2015 VID format, generate train.npy and val.npy files, use the trained YOLOv8 model as the pretrained model, train with train.npy, and finally predict with val.npy?