Open aravind3134 opened 4 years ago
Sorry for the late reply. Could you please provide the error message? The training procedure should be the same as mmdetection.
I tried to run a config file after changing the data location.
In my case, there are only 2 classes, and I also have to change the class names. I think I am getting the error only because of that.
Please let me know how to do it. What should be changed?
As of now, I get the following index error:
Traceback (most recent call last):
File "./tools/train.py", line 103, in
Thanks
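An IndexError after reducing the class list often points to a mismatch between the class names the dataset uses, the category ids in the annotation file, and `num_classes` in the config. Here is a minimal sketch for checking the annotation side. It assumes that mmdetection releases of this era count background as one extra class (so 2 classes would mean `num_classes=3`) and that class names come from a `CLASSES` tuple on the dataset class; please verify both against your installed version. The helper names and the toy annotation are hypothetical.

```python
def coco_class_info(annotation):
    """Return (class_names, num_classes) for a COCO-style annotation dict."""
    # Sort by category id so the name order is stable.
    categories = sorted(annotation["categories"], key=lambda c: c["id"])
    class_names = tuple(c["name"] for c in categories)
    # Assumption: the detection head reserves one extra slot for background.
    num_classes = len(class_names) + 1
    return class_names, num_classes


def undeclared_category_ids(annotation):
    """Return ids of annotations whose category_id is missing from 'categories'."""
    valid_ids = {c["id"] for c in annotation["categories"]}
    return [a["id"] for a in annotation["annotations"]
            if a["category_id"] not in valid_ids]


# Toy annotation with two classes and one annotation using an undeclared category.
toy_annotation = {
    "categories": [{"id": 2, "name": "dog"}, {"id": 1, "name": "cat"}],
    "annotations": [
        {"id": 10, "category_id": 1},
        {"id": 11, "category_id": 2},
        {"id": 12, "category_id": 7},  # not declared above
    ],
}

class_names, num_classes = coco_class_info(toy_annotation)
bad_annotations = undeclared_category_ids(toy_annotation)
print(class_names, num_classes, bad_annotations)
```

If the check reports undeclared categories, or the class names differ from the ones the dataset class expects, that is a likely place for an index lookup to fail.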
It seems that there is some problem with your data loader. I suggest you debug with a single process, e.g. 1 GPU only, so that you can add breakpoints inside your code.
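One way to follow this suggestion without starting a training run is to index the dataset directly in a single process and record which samples raise. This is a minimal sketch and not part of mmdetection: `find_failing_samples` and `ToyDataset` are hypothetical names, and the only assumption is that your dataset object supports `len()` and integer indexing.

```python
def find_failing_samples(dataset, limit=None):
    """Index every sample in one process and collect the ones that raise."""
    failures = []
    count = len(dataset) if limit is None else min(limit, len(dataset))
    for idx in range(count):
        try:
            dataset[idx]
        except Exception as exc:  # keep going so all bad samples are listed
            failures.append((idx, type(exc).__name__, str(exc)))
    return failures


class ToyDataset:
    """Stand-in dataset that maps integer labels to class names."""

    def __init__(self, labels, class_names):
        self.labels = labels
        self.class_names = class_names

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        # Raises IndexError when a label has no matching class name.
        return self.class_names[self.labels[idx]]


# Label 2 has no entry in a two-class tuple, so sample 2 fails.
toy = ToyDataset(labels=[0, 1, 2, 1], class_names=("cat", "dog"))
failures = find_failing_samples(toy)
print(failures)
```

Once the failing index is known, a breakpoint inside the dataset's loading code for that sample is usually enough to see which field is out of range.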
Hey, can you please tell me what changes are required to successfully train GCNet on a custom dataset in COCO format?
I think there are two workarounds. Either of them should be fine.
Hey, I am trying to train on my own data, which is in the same format as the COCO dataset, using one of the configuration files. As my data doesn't have the segmentation attribute, I tried to run both my dataset and the COCO dataset with 'with_mask' set to 'False' in the config file. Do I need to change something else in the configuration file to make it work?
I am using the config file in this location: configs/gcnet/r50/mask_rcnn_r50_fpn_2x.py
Error:
Traceback (most recent call last):
  File "./tools/train.py", line 106, in <module>
    main()
  File "./tools/train.py", line 101, in main
    logger=logger)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/mmdet-0.6.0+a9fcc88-py3.6.egg/mmdet/apis/train.py", line 65, in train_detector
    _dist_train(model, dataset, cfg, validate=validate)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/mmdet-0.6.0+a9fcc88-py3.6.egg/mmdet/apis/train.py", line 201, in _dist_train
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/mmcv-0.2.14-py3.6-linux-x86_64.egg/mmcv/runner/runner.py", line 361, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/mmcv-0.2.14-py3.6-linux-x86_64.egg/mmcv/runner/runner.py", line 264, in train
    self.model, data_batch, train_mode=True, **kwargs)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/mmdet-0.6.0+a9fcc88-py3.6.egg/mmdet/apis/train.py", line 44, in batch_processor
    losses = model(**data)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/mmcv-0.2.14-py3.6-linux-x86_64.egg/mmcv/parallel/distributed.py", line 50, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/mmdet-0.6.0+a9fcc88-py3.6.egg/mmdet/core/fp16/decorators.py", line 49, in new_func
    return old_func(*args, **kwargs)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/mmdet-0.6.0+a9fcc88-py3.6.egg/mmdet/models/detectors/base.py", line 86, in forward
    return self.forward_train(img, img_meta, **kwargs)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/mmdet-0.6.0+a9fcc88-py3.6.egg/mmdet/models/detectors/two_stage.py", line 183, in forward_train
    sampling_results, gt_masks, self.train_cfg.rcnn)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/mmdet-0.6.0+a9fcc88-py3.6.egg/mmdet/models/mask_heads/fcn_mask_head.py", line 112, in get_target
    gt_masks, rcnn_train_cfg)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/mmdet-0.6.0+a9fcc88-py3.6.egg/mmdet/core/mask/mask_target.py", line 10, in mask_target
    pos_assigned_gt_inds_list, gt_masks_list, cfg_list)
TypeError: 'NoneType' object is not iterable
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/torch/distributed/launch.py", line 235, in <module>
    main()
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/torch/distributed/launch.py", line 231, in main
    cmd=process.args)
subprocess.CalledProcessError: Command '['/home/ubuntu/anaconda3/envs/tensorflow_p36/bin/python', '-u', './tools/train.py', '--local_rank=0', 'configs/gcnet/r50/mask_rcnn_r50_fpn_2x.py', '--launcher', 'pytorch']' returned non-zero exit status 1.
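The traceback ends in `mask_target`, which appears to receive `None` for the ground-truth masks. That would be consistent with a Mask R-CNN config that still builds a mask head while the dataset is told not to load masks. Here is a rough sketch of the kind of config change that removes the mask branch. It assumes the model dict uses the keys `mask_roi_extractor` and `mask_head` and the detector type names `MaskRCNN` and `FasterRCNN`, as in mmdetection configs of that period; check your own config file before relying on these names. `strip_mask_branch` and the toy config are hypothetical.

```python
import copy


def strip_mask_branch(cfg):
    """Return a copy of a config dict with the mask branch removed."""
    new_cfg = copy.deepcopy(cfg)  # leave the caller's config untouched
    model = new_cfg["model"]
    # Assumed key names for the mask branch; drop them if present.
    model.pop("mask_roi_extractor", None)
    model.pop("mask_head", None)
    # Assumed detector type names.
    if model.get("type") == "MaskRCNN":
        model["type"] = "FasterRCNN"
    # Stop every dataset split from requesting mask annotations.
    for split in ("train", "val", "test"):
        if split in new_cfg.get("data", {}):
            new_cfg["data"][split]["with_mask"] = False
    return new_cfg


# Toy config containing only the fields this sketch touches.
toy_cfg = {
    "model": {
        "type": "MaskRCNN",
        "bbox_head": {"num_classes": 3},
        "mask_roi_extractor": {"type": "SingleRoIExtractor"},
        "mask_head": {"type": "FCNMaskHead", "num_classes": 3},
    },
    "data": {
        "train": {"with_mask": True},
        "val": {"with_mask": True},
        "test": {"with_mask": False},
    },
}

box_only_cfg = strip_mask_branch(toy_cfg)
print(sorted(box_only_cfg["model"].keys()))
```

The alternative is to keep the mask branch and add segmentation annotations to the dataset; setting 'with_mask' to 'False' alone does not seem to be enough when the model still contains a mask head.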
Hey,
I am trying to train on custom data using GCNet. I have the data in COCO format and want to know the exact procedure to train on it, because just running the train.sh script gives me an index error.
I have been changing the config file to make it work, but haven't had any luck. Please let me know which fields should be changed.
Thanks.