Closed: L1aoXingyu closed this issue 2 years ago.
@L1aoXingyu I don't clearly understand your documentation for custom datasets. I tried your approach but got the error below:
Traceback (most recent call last):
File "tools/train_net.py", line 67, in <module>
args=(args,),
File "./fastreid/engine/launch.py", line 71, in launch
main_func(*args)
File "tools/train_net.py", line 53, in main
trainer = Trainer(cfg)
File "./fastreid/engine/defaults.py", line 204, in __init__
data_loader = self.build_train_loader(cfg)
File "./fastreid/engine/defaults.py", line 408, in build_train_loader
return build_reid_train_loader(cfg)
File "./fastreid/data/build.py", line 27, in build_reid_train_loader
dataset = DATASET_REGISTRY.get(d)(root=_root, combineall=cfg.DATASETS.COMBINEALL)
File "./fastreid/product_dataset.py", line 8, in __init__
super().__init__(train, query, gallery)
NameError: name 'train' is not defined
And this is my product_dataset.py file in the fastreid folder:
from fastreid.data.datasets import DATASET_REGISTRY
from fastreid.data.datasets.bases import ImageDataset

@DATASET_REGISTRY.register()
class ProductDataset(ImageDataset):
    def __init__(self, root='datasets', **kwargs):
        ...
        super().__init__(train, query, gallery)
Even after I removed the ... from the file above, I got the same error.
The ProductDataset folder is inside the datasets folder and has the following structure:
.
├── gallery
│ ├── data_38
│ ├── data_43
│ ├── data_68
│ ├── data_gro
│ └── data_grocery
├── query
│ ├── data_38
│ ├── data_43
│ ├── data_68
│ ├── data_gro
│ └── data_grocery
└── train
├── data_38
├── data_43
├── data_68
├── data_gro
└── data_grocery
And each child folder has the structure below (e.g., train/data_38/):
.
├── 1
├── 10
├── 11
├── 12
├── 13
├── 14
├── 15
├── 16
├── 17
├── 18
├── 19
├── 2
├── 20
├── 21
├── 22
├── 23
├── 24
├── 25
├── 26
├── 27
├── 28
├── 29
├── 3
├── 30
├── 31
├── 32
├── 33
├── 34
├── 35
├── 36
├── 37
├── 38
├── 4
├── 5
├── 6
├── 7
├── 8
└── 9
Each of the numbered folders above contains some images.
I've solved the problem with my dataset. The key was that the train, query, and gallery lists passed via super().__init__(train, query, gallery) must each be a list of tuples, where each tuple has the structure (path/to/image, pid, camid).
@AnhPC03 Yes, you are right! It doesn't matter how your data is structured. The key idea is to prepare the train, query, and gallery lists as required and then pass them via super().__init__(train, query, gallery).
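To make that concrete, here is a minimal sketch of such a class for the folder layout in the original post (datasets/ProductDataset/{train,query,gallery}/data_xx/<numbered id folder>/<images>). The pid and camid conventions below are my own assumptions for illustration, not code from fastreid:

import glob
import os

from fastreid.data.datasets import DATASET_REGISTRY
from fastreid.data.datasets.bases import ImageDataset


@DATASET_REGISTRY.register()
class ProductDataset(ImageDataset):
    def __init__(self, root='datasets', **kwargs):
        base = os.path.join(root, 'ProductDataset')
        self._pid_map = {}  # shared integer ids so query/gallery identities line up
        train = self._scan(os.path.join(base, 'train'), camid=0, string_pids=True)
        query = self._scan(os.path.join(base, 'query'), camid=0)
        gallery = self._scan(os.path.join(base, 'gallery'), camid=1)
        super().__init__(train, query, gallery, **kwargs)

    def _scan(self, split_dir, camid, string_pids=False):
        data = []
        for img_path in sorted(glob.glob(os.path.join(split_dir, '*', '*', '*'))):
            # .../<split>/<data_xx>/<numbered id folder>/<image file>
            sub_dir, pid_dir = img_path.split(os.sep)[-3:-1]
            key = f'{sub_dir}_{pid_dir}'  # identity = (data_xx folder, numbered folder)
            if string_pids:
                pid = key  # string pids for the train set, as discussed below
            else:
                pid = self._pid_map.setdefault(key, len(self._pid_map))
            data.append((img_path, pid, camid))
        return data

Query camids are fixed to 0 and gallery camids to 1 here because this dataset has no camera annotations.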
@L1aoXingyu What is the purpose of formatting the train pid and camid values as strings instead of integers? It seems like the latter would make more sense, to be consistent with the formatting of the query and gallery sets.
@addisonklinke When combining two or more datasets for training, plain integers would be confusing because pid 0 would refer to different identities in different datasets.
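To illustrate the point (a sketch of the idea, not fastreid's actual combine logic), prefixing the dataset name onto the train pid keeps identities distinct after merging:

# Two hypothetical train lists; with plain integer pids both entries would
# carry pid 0 and be collapsed into one identity when the datasets are combined.
market_train = [('market/0001_c1.jpg', 'market1501_0', 0)]
duke_train = [('duke/0001_c2.jpg', 'dukemtmc_0', 1)]

combined = market_train + duke_train
print({pid for _, pid, _ in combined})  # {'market1501_0', 'dukemtmc_0'} stay separate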
@L1aoXingyu I see, that makes sense. Thank you for the clarification.
Another question I had is whether there are guidelines for splitting a dataset into train, query, and gallery subsets. Obviously, we want the identity IDs in train to be mutually exclusive with those in query and gallery in order to have an unbiased evaluation. However, when constructing query and gallery I am wondering...
Hello, I am training the model on my own dataset, but training gets stuck during data loading, specifically here. Why is this?
def _try_put_index(self):
    assert self._tasks_outstanding < 2 * self._num_workers

    try:
        index = self._next_index()
    except StopIteration:
        return
    for _ in range(self._num_workers):  # find the next active worker, if any
        worker_queue_idx = next(self._worker_queue_idx_cycle)
        if self._workers_status[worker_queue_idx]:
            break
    else:
        # not found (i.e., didn't break)
        return
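Not a definitive fix, but a common way to localize this kind of stall is to run data loading in the main process so any exception inside the dataset surfaces directly. A sketch, assuming a placeholder config path:

from fastreid.config import get_cfg

cfg = get_cfg()
cfg.merge_from_file('configs/my_config.yml')  # hypothetical config file
cfg.DATALOADER.NUM_WORKERS = 0                # single-process loading for debugging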
Hi, can I rename my own dataset's images to the market1501 format, put them into the Market dataset folder, and train directly with the Market config?
@vicwer That works too, but I still recommend the custom dataset setup described above.
@L1aoXingyu I want to train your fast-reid repo for classification. My dataset has the following structure:
├── train
│ ├── beverage_bottle
│ ├── box
│ ├── candy_bag
│ ├── candy_jar
│ ├── cylinder
│ ├── instant_food_cup
│ ├── juice_box
│ └── tiny_candy
└── val
├── beverage_bottle
├── box
├── candy_bag
├── candy_jar
├── cylinder
├── instant_food_cup
├── juice_box
└── tiny_candy
Each child folder contains some images.
I wrote the dataloader as below:
import os

from fastreid.data.datasets import DATASET_REGISTRY
from fastreid.data.datasets.bases import ImageDataset


@DATASET_REGISTRY.register()
class SuperClassDataset(ImageDataset):
    def __init__(self, root='datasets', **kwargs):
        train_path = root + '/super_class_dataset/train'
        val_path = root + '/super_class_dataset/val'
        gallery_path = root + '/super_class_dataset/train'
        self.convert_labels = {
            'beverage_bottle': 1,
            'box': 2,
            'candy_bag': 3,
            'candy_jar': 4,
            'cylinder': 5,
            'instant_food_cup': 6,
            'juice_box': 7,
            'tiny_candy': 8,
        }
        train_data = self.get_data(train_path, 1)
        val_data = self.get_data(val_path, 2)
        gallery_data = self.get_data(gallery_path, 3)
        super().__init__(train_data, val_data, gallery_data)

    def get_data(self, path, cam_id):
        data = []
        absolute_path = os.path.join(path)
        sub_1_dirs = os.listdir(absolute_path)
        for sub_1_dir in sub_1_dirs:
            sub_1_path = os.path.join(absolute_path, sub_1_dir)
            if sub_1_dir == '.DS_Store':
                continue
            filenames = os.listdir(sub_1_path)
            for filename in filenames:
                if filename == '.DS_Store':
                    continue
                filepath = os.path.join(sub_1_path, filename)
                data.append((filepath, self.convert_labels[sub_1_dir], cam_id))
        return data
I used the train dataset in the role of the query dataset and val in the role of the test dataset. But while training, I got the error:
RuntimeError: The size of tensor a (3) must match the size of tensor b (4) at non-singleton dimension 0
But when I printed them, tensor a equaled tensor b in shape every time. Could you give me a suggestion for a classification dataloader?
After completing the configuration following your process above, do the name in the config.yml file and the "datasetname" parameter of the class that inherits MyOwnDataset need to be consistent?
The dataset name in the config file needs to match the name of the dataset you defined. For the example above, it should be written as "SuperClassDataset" in the config.
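A quick way to check that the name matches is to look it up in the registry directly. A sketch; the module name below is hypothetical:

from fastreid.data.datasets import DATASET_REGISTRY
import super_class_dataset  # hypothetical file that defines and registers SuperClassDataset

# Prints the registered class; raises KeyError if the name in the config has no match.
print(DATASET_REGISTRY.get('SuperClassDataset'))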
This guide explains how to train your own custom dataset with fastreid's data loaders.
Before You Start
Follow Getting Started to set up the environment and install the requirements.txt dependencies.
Train on Custom Dataset
- Register your dataset (i.e., tell fastreid how to obtain your dataset). To let fastreid know how to obtain a dataset named "my_dataset", users need to implement a class that inherits fastreid.data.datasets.bases.ImageDataset:

from fastreid.data.datasets import DATASET_REGISTRY
from fastreid.data.datasets.bases import ImageDataset

@DATASET_REGISTRY.register()
class MyOwnDataset(ImageDataset):
    def __init__(self, root='datasets', **kwargs):
        ...
        super().__init__(train, query, gallery)

Here, the snippet associates a dataset named "MyOwnDataset" with a class that builds the train, query, and gallery sets and then passes them to the base class; the decorator registers the class. The class can do arbitrary things and should generate a train list: list(str, str, str), a query list: list(str, int, int), and a gallery list: list(str, int, int), as below.

train_list = [
    (train_path1, pid1, camid1),
    (train_path2, pid2, camid2),
    ...]
query_list = [
    (query_path1, pid1, camid1),
    (query_path2, pid2, camid2),
    ...]
gallery_list = [
    (gallery_path1, pid1, camid1),
    (gallery_path2, pid2, camid2),
    ...]

You can also pass an empty train_list to generate a "Testset" only, with super().__init__([], query, gallery).
Notice: query and gallery sets may share camera views, but for each individual query identity, gallery samples from the same camera are excluded. So if your dataset has no camera annotations, you can set every query identity's camera number to 0 and every gallery identity's camera number to 1, and you will still get testing results.
- Import your dataset. After registering your own dataset, you need to import it in train_net.py to make it effective:

from dataset_file import MyOwnDataset
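For concreteness, a short sketch of the "Testset only" case described above, using the camid 0/1 convention for data without camera annotations; folder names and layout are placeholders:

import glob
import os

from fastreid.data.datasets import DATASET_REGISTRY
from fastreid.data.datasets.bases import ImageDataset


@DATASET_REGISTRY.register()
class MyTestOnlyDataset(ImageDataset):
    def __init__(self, root='datasets', **kwargs):
        base = os.path.join(root, 'my_test_only_dataset')  # hypothetical folder
        query = self._listdir(os.path.join(base, 'query'), camid=0)
        gallery = self._listdir(os.path.join(base, 'gallery'), camid=1)
        super().__init__([], query, gallery, **kwargs)

    @staticmethod
    def _listdir(split_dir, camid):
        data = []
        # assumes one sub-folder per identity, named with an integer id
        for pid_dir in sorted(os.listdir(split_dir)):
            for img_path in glob.glob(os.path.join(split_dir, pid_dir, '*.jpg')):
                data.append((img_path, int(pid_dir), camid))
        return data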
Hello @L1aoXingyu. First of all, thank you for your amazing work! If I understand correctly, I can train FastReID to re-identify any custom object I want, right? In my case, I need to be able to re-identify a certain fruit. So I just need a dataset containing images of that fruit, right?
Thank you for your contribution!
Yes, if you want to train a model for identifying different fruits, you can collect a dataset with different kinds of fruits and train on it.
I ran into the same bug ("tensor a equaled tensor b in shape every time").
@AnhPC03 Did you solve this issue? It would be helpful for me if you could guide me through the error.
@AnhPC03 Can you please tell us how you solved this issue? Also, is there a typical percentage of the dataset used for these splits, e.g. 75% train / 25% query + gallery? And do we need to add the same images to train, query, and gallery?
@shreejalt @akashAD98 Did you get this error: RuntimeError: The size of tensor a (3) must match the size of tensor b (4) at non-singleton dimension 0?
If yes, please check whether any images in your dataset have an alpha channel. If so, remove the alpha channel and keep only the B, G, R channels.
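A sketch of that fix: walk the dataset folder and re-save any image that carries an alpha channel as plain RGB (the path is a placeholder):

import os
from PIL import Image

def drop_alpha_channel(folder):
    for root, _, files in os.walk(folder):
        for name in files:
            path = os.path.join(root, name)
            try:
                img = Image.open(path)
            except OSError:
                continue  # skip files that are not images
            if img.mode in ('RGBA', 'LA', 'P'):
                img.convert('RGB').save(path)  # overwrite with a 3-channel image

drop_alpha_channel('datasets/super_class_dataset')  # hypothetical dataset root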
Hey @AnhPC03, I know it's been a few years, but I am trying to build a classifier like you did and I can't get it working. Did you ever run into an issue where the trainer stalls forever without erroring out?
Hello, did you solve this problem? I also see training that doesn't report an error but gets stuck and stops running.
Hi, how did you name the images?