chenyuntc / simple-faster-rcnn-pytorch

A simplified implementation of Faster R-CNN that replicates the performance of the original paper

Training the model on a custom dataset #15

Open mahavird opened 6 years ago

mahavird commented 6 years ago

Hi @chenyuntc,

Thanks for your simplified implementation of Faster R-CNN in PyTorch.

Going by your instructions, I was successfully able to train/test the simple-faster-rcnn-pytorch setup on my system.

Now I have a custom dataset with 36 classes, and I would like to train a Simple-Faster-R-CNN model (VGG/RES101) on it. I think loading the dataset is the trickiest part; it would be great if you could suggest a tutorial, blog post, or link for this.

Where do I get started, and what changes need to be made?

I will make sure I open source the work I do during this process.

Thanks!

chenyuntc commented 6 years ago

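This can be done by implementing your own dataset class; see the example at: https://github.com/chenyuntc/simple-faster-rcnn-pytorch/blob/master/data/voc_dataset.py

It should return the same items as get_example does there (data/dataset.py unpacks them as ori_img, bbox, label, difficult).

Then use it in place of the VOC dataset in data/dataset.py.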
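A minimal sketch of such a dataset class, for illustration only. It assumes the interface visible in the traceback later in this thread, where data/dataset.py calls `self.db.get_example(idx)` and unpacks `ori_img, bbox, label, difficult`. The class name `CustomBboxDataset`, the `records` layout, the `read_image` callable, the label tuple, and the `(ymin, xmin, ymax, xmax)` box order are all assumptions; check them against voc_dataset.py before relying on them.

```python
import numpy as np

# Hypothetical label tuple; replace with your own class names.
MY_LABEL_NAMES = ('cat', 'dog', 'bird')


class CustomBboxDataset:
    """Hypothetical stand-in for the VOC dataset class.

    `records` is a list of dicts, one per image:
        {'image': <anything read_image accepts>,
         'boxes': [(ymin, xmin, ymax, xmax), ...],   # assumed box order
         'names': ['cat', ...],
         'difficult': [False, ...]}                  # optional
    `read_image` is any callable returning a (C, H, W) float32 array.
    """

    def __init__(self, records, read_image, label_names=MY_LABEL_NAMES):
        self.records = records
        self.read_image = read_image
        self.label_names = label_names

    def __len__(self):
        return len(self.records)

    def get_example(self, i):
        rec = self.records[i]
        img = self.read_image(rec['image'])
        # One row of four coordinates per object, even when there are none.
        bbox = np.asarray(rec['boxes'], dtype=np.float32).reshape(-1, 4)
        # Class names become integer indices into label_names.
        label = np.asarray(
            [self.label_names.index(n) for n in rec['names']], dtype=np.int32)
        difficult = np.asarray(
            rec.get('difficult', [False] * len(rec['names'])), dtype=bool)
        return img, bbox, label, difficult

    __getitem__ = get_example
```

Because the image reader is passed in, the annotations can come from any source (CSV, JSON, a database) as long as each record can be turned into the four returned items.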

mahavird commented 6 years ago

Hi @chenyuntc,

To train your model on my dataset, I structured the dataset in the VOC2007 format.

To get started, I took only 100 images and their annotation files. I placed the images in the JPEGImages folder of the VOC2007 directory.

Similarly, I placed the corresponding annotations in the Annotations folder of the VOC2007 directory.

I then divided the dataset into train, test, and val splits by listing the images in four text files (train.txt, test.txt, val.txt, trainval.txt) in the VOC2007/ImageSets/Main directory.
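The split step described above could be scripted roughly as follows. This is a sketch, not part of the repository: the function name `write_split_files`, the split fractions, and the assumption that every file in Annotations has a matching image are all illustrative.

```python
import random
from pathlib import Path


def write_split_files(voc_root, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Write train/val/trainval/test id lists under ImageSets/Main."""
    voc_root = Path(voc_root)
    # One id per annotation file, e.g. Annotations/0001.xml -> "0001".
    ids = sorted(p.stem for p in (voc_root / 'Annotations').glob('*.xml'))
    random.Random(seed).shuffle(ids)

    n_test = int(len(ids) * test_fraction)
    n_val = int(len(ids) * val_fraction)
    splits = {
        'test': ids[:n_test],
        'val': ids[n_test:n_test + n_val],
        'train': ids[n_test + n_val:],
    }
    # trainval is the union of train and val, so it never overlaps test.
    splits['trainval'] = splits['train'] + splits['val']

    out_dir = voc_root / 'ImageSets' / 'Main'
    out_dir.mkdir(parents=True, exist_ok=True)
    for name, split_ids in splits.items():
        # One image id per line, without file extension.
        (out_dir / f'{name}.txt').write_text(
            ''.join(f'{i}\n' for i in sorted(split_ids)))
    return splits
```

Calling `write_split_files('VOCdevkit/VOC2007')` would then produce the four text files in one pass; the fixed seed keeps the split reproducible between runs.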

As you suggested, I replaced the labels (VOC_BBOX_LABEL_NAMES) in voc_dataset.py with my own labels.

But after making all the above changes I am getting the following error.

    python3 train.py train --env='fasterrcnn-caffe' --plot-every=100 --caffe-pretrain

    ======user config========
    {'caffe_pretrain': True, 'caffe_pretrain_path': 'checkpoints/vgg16_caffe.pth', 'data': 'voc', 'debug_file': '/tmp/debugf', 'env': 'fasterrcnn-caffe', 'epoch': 14, 'load_path': None, 'lr': 0.001, 'lr_decay': 0.1, 'max_size': 1000, 'min_size': 600, 'num_workers': 8, 'plot_every': 100, 'port': 8097, 'pretrained_model': 'vgg16', 'roi_sigma': 1.0, 'rpn_sigma': 3.0, 'test_num': 10000, 'test_num_workers': 8, 'use_adam': False, 'use_chainer': False, 'use_drop': False, 'voc_data_dir': '/home/mahavircingularity/simple-faster-rcnn-pytorch/VOCdevkit/VOCdevkit/VOC2007/', 'weight_decay': 0.0005}
    ==========end============
    load data
    model construct completed
    0it [00:00, ?it/s]Traceback (most recent call last):
      File "train.py", line 131, in <module>
        fire.Fire()
      File "/home/cs231n/myVE35/lib/python3.5/site-packages/fire/core.py", line 127, in Fire
        component_trace = _Fire(component, args, context, name)
      File "/home/cs231n/myVE35/lib/python3.5/site-packages/fire/core.py", line 366, in _Fire
        component, remaining_args)
      File "/home/cs231n/myVE35/lib/python3.5/site-packages/fire/core.py", line 542, in CallCallable
        result = fn(*varargs, **kwargs)
      File "train.py", line 77, in train
        for ii, (img, bbox, label_, scale) in tqdm(enumerate(dataloader)):
      File "/home/cs231n/myVE35/lib/python3.5/site-packages/tqdm/_tqdm.py", line 872, in __iter__
        for obj in iterable:
      File "/home/cs231n/myVE35/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 210, in __next__
        return self._process_next_batch(batch)
      File "/home/cs231n/myVE35/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 230, in _process_next_batch
        raise batch.exc_type(batch.exc_msg)
    ValueError: Traceback (most recent call last):
      File "/home/cs231n/myVE35/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 42, in _worker_loop
        samples = collate_fn([dataset[i] for i in batch_indices])
      File "/home/cs231n/myVE35/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 42, in <listcomp>
        samples = collate_fn([dataset[i] for i in batch_indices])
      File "/home/mahavircingularity/simple-faster-rcnn-pytorch/data/dataset.py", line 105, in __getitem__
        ori_img, bbox, label, difficult = self.db.get_example(idx)
      File "/home/mahavircingularity/simple-faster-rcnn-pytorch/data/voc_dataset.py", line 120, in get_example
        label.append(VOC_BBOX_LABEL_NAMES.index(name))
    ValueError: tuple.index(x): x not in tuple

    If you suspect this is an IPython bug, please report it at:
        https://github.com/ipython/ipython/issues
    or send an email to the mailing list at ipython-dev@scipy.org

    You can print a more detailed traceback right now with "%tb", or use "%debug"
    to interactively debug it.

    Extra-detailed tracebacks for bug-reporting purposes can be enabled via:
        %config Application.verbose_crash=True

Please let me know what other changes need to be made in your code. I am training on a dataset that contains 36 classes/labels. I used LabelImg to create the annotations in Pascal VOC format.

Let me know if you need any further information from my end.

Thanks in advance.

chenyuntc commented 6 years ago

As indicated in the error message:

label.append(VOC_BBOX_LABEL_NAMES.index(name))
ValueError: tuple.index(x): x not in tuple

I think something is wrong in the annotation files: a name in them is not in VOC_BBOX_LABEL_NAMES.
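One way to track down the offending names is to scan every annotation file before training. The sketch below is illustrative: `find_unknown_labels` is a hypothetical helper, and it assumes the loader lower-cases and strips each `<name>` before the lookup, which is worth confirming in your copy of voc_dataset.py. If that assumption holds, any label written with capital letters in the tuple would never match.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path


def find_unknown_labels(annotation_dir, label_names):
    """Count <object><name> values that are missing from label_names."""
    unknown = Counter()
    for xml_path in sorted(Path(annotation_dir).glob('*.xml')):
        root = ET.parse(str(xml_path)).getroot()
        for obj in root.findall('object'):
            # Assumed to mirror the loader: lower-case and strip the name.
            name = (obj.findtext('name') or '').lower().strip()
            if name not in label_names:
                unknown[(xml_path.name, name)] += 1
    return unknown
```

Printing the returned counter for the Annotations folder lists each file and name that would trigger `tuple.index(x): x not in tuple`.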

BTW, you don't need to convert your data to exactly this XML format. It may be easier to just write a function that returns the required items.

mahavird commented 6 years ago

Hi @chenyuntc,

Thanks for your reply. It was very helpful.

Finally, I am able to use your implementation for training over my own dataset, but still, I have one major issue:

Although the visdom visualization shows the other quantities such as loss and test_map, the sample image being plotted never changes. It still shows the sheep image with the label 'shep', even though my dataset doesn't contain this image.

Can you point out where I might be going wrong?

Regards, Mahavir

chenyuntc commented 6 years ago

mahavird commented 6 years ago

Hi @chenyuntc,

Thanks for the reply. I am looking into the above issue, will keep you posted on the same.

Meanwhile, what is the size (resolution) of images which I should use for training my network?

Regards, Mahavir

chenyuntc commented 6 years ago

> what is the size (resolution) of images which I should use for training my network

It depends. In my experiments, I crop the images during preprocessing.
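For reference, the config dump earlier in this thread lists 'min_size': 600 and 'max_size': 1000. If those values are used the way the original Faster R-CNN recipe uses them (an assumption; check the repository's preprocessing code), the input resolution matters less than it might seem, because each image is rescaled by a single factor. The helper name `resize_scale` below is hypothetical.

```python
def resize_scale(height, width, min_size=600, max_size=1000):
    """Scale factor that brings the short side to min_size, unless that
    would push the long side past max_size."""
    scale_short = min_size / min(height, width)
    scale_long = max_size / max(height, width)
    return min(scale_short, scale_long)
```

Under that assumption, a 375 x 500 image would be scaled by 1.6 to 600 x 800, while a 300 x 1000 image would be left at its original size because its long side is already at the limit.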

penguinshin commented 6 years ago

Hi, it seems there are quite a few hard-coded values that prevent the use of custom datasets whose number of classes differs from VOC2007. Namely, trainer.py hardcodes the confusion matrix with 21 classes (20 + 1), and train doesn't load the number of classes from the dataset class names. Are you able to fix this?
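The general idea of deriving these sizes from the label tuple, rather than hard-coding them, can be sketched as follows. This is not the repository's code: `LABEL_NAMES`, `num_classes_with_background`, and the NumPy confusion matrix are hypothetical stand-ins that only illustrate the "foreground classes + 1 background" bookkeeping mentioned above.

```python
import numpy as np

# Hypothetical label tuple for a custom dataset.
LABEL_NAMES = ('cat', 'dog', 'bird')


def num_classes_with_background(label_names):
    """Foreground classes plus one background class."""
    return len(label_names) + 1


def confusion_matrix(true_labels, pred_labels, n_class):
    """Rows are ground-truth classes, columns are predicted classes."""
    cm = np.zeros((n_class, n_class), dtype=np.int64)
    # np.add.at accumulates correctly even when index pairs repeat.
    np.add.at(cm, (np.asarray(true_labels), np.asarray(pred_labels)), 1)
    return cm
```

With a 36-class dataset, the same helper would give 37, so anything sized from it would follow the dataset automatically instead of staying at 21.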

LDoubleZhi commented 6 years ago

Hi @penguinshin, have you solved this problem yet?

cmstudyscode commented 6 years ago

@LDoubleZhi have you solved this problem yet?

penguinshin commented 6 years ago

I solved it by changing the classes in the dataset file and in the utility file that specifies class-related attributes. But I think the code should be updated to make this easily customizable.

SamihaSara commented 3 years ago

@mahavird Can you please share the dataset-related code with which you were able to run this model? Thanks.

labhigh commented 3 years ago

@mahavird Can you please share the dataset-related code with which you were able to run this model? Thanks.