Train custom coco dataset error

lombardata commented 1 year ago

Hi everyone, I would like to train a model (for example "maskformer2_R50_bs16_50ep") on my custom instance segmentation dataset. I've registered the dataset, but when I run the training script:

  python train_net.py --num-gpus 1 --config-file configs/coco/instance-segmentation/maskformer2_R50_bs16_50ep.yaml

I get the following error:

  WARNING [01/24 14:25:38 fvcore.common.checkpoint]: The checkpoint state_dict contains keys that are not used by the model: stem.fc.{bias, weight}
  [01/24 14:25:38 d2.engine.train_loop]: Starting training from iteration 0
  ERROR [01/24 14:25:39 d2.engine.train_loop]: Exception during training:
  Traceback (most recent call last):
    File "/home/mcontini/detectron2/detectron2/engine/train_loop.py", line 149, in train
      self.run_step()
    File "/home/mcontini/detectron2/detectron2/engine/defaults.py", line 494, in run_step
      self._trainer.run_step()
    File "/home/mcontini/detectron2/detectron2/engine/train_loop.py", line 421, in run_step
      with autocast(dtype=self.precision):
  TypeError: __init__() got an unexpected keyword argument 'dtype'
  [01/24 14:25:39 d2.engine.hooks]: Total training time: 0:00:01 (0:00:00 on hooks)
  [01/24 14:25:39 d2.utils.events]: iter: 0 lr: N/A max_mem: 171M
  Traceback (most recent call last):
    File "train_net.py", line 370, in <module>
      launch(
    File "/home/mcontini/detectron2/detectron2/engine/launch.py", line 82, in launch
      main_func(*args)
    File "train_net.py", line 317, in main
      return trainer.train()
    File "/home/mcontini/detectron2/detectron2/engine/defaults.py", line 484, in train
      super().train(self.start_iter, self.max_iter)
    File "/home/mcontini/detectron2/detectron2/engine/train_loop.py", line 149, in train
      self.run_step()
    File "/home/mcontini/detectron2/detectron2/engine/defaults.py", line 494, in run_step
      self._trainer.run_step()
    File "/home/mcontini/detectron2/detectron2/engine/train_loop.py", line 421, in run_step
      with autocast(dtype=self.precision):
  TypeError: __init__() got an unexpected keyword argument 'dtype'

Does anyone have any idea on how to solve it? Thank you in advance :)

namasang1 commented 1 year ago

I got exactly the same error. Did you solve the problem? :)

lombardata commented 1 year ago

No, but I'm trying this now: https://huggingface.co/docs/transformers/main/model_doc/mask2former Hugging Face seems more intuitive and simpler :) Hope this helps.

sushilkhadkaanon commented 1 year ago

@lombardata @namasang1 I had the same error. I solved it by modifying this file: /home/mcontini/detectron2/detectron2/engine/train_loop.py

Just remove the argument dtype=self.precision from the autocast call on line 421. Let me know if it works.
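
The TypeError in the traceback means the installed autocast constructor does not accept a dtype keyword. Before editing train_loop.py, one way to confirm this is to inspect the constructor's signature. The following is a minimal sketch, not detectron2 code: accepts_keyword and the two stand-in classes are hypothetical names used only for illustration, and in a real environment you would pass torch.cuda.amp.autocast to the helper instead.

```python
import inspect


def accepts_keyword(callable_obj, name):
    """Return True if callable_obj can be called with the keyword `name`."""
    params = inspect.signature(callable_obj).parameters
    if name in params:
        kind = params[name].kind
        return kind in (
            inspect.Parameter.POSITIONAL_OR_KEYWORD,
            inspect.Parameter.KEYWORD_ONLY,
        )
    # A **kwargs parameter accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-ins that mimic the two constructor shapes.
class OldStyleAutocast:
    def __init__(self, enabled=True):
        self.enabled = enabled


class NewStyleAutocast:
    def __init__(self, enabled=True, dtype=None):
        self.enabled = enabled
        self.dtype = dtype


print(accepts_keyword(OldStyleAutocast, "dtype"))  # False
print(accepts_keyword(NewStyleAutocast, "dtype"))  # True
```

If the check returns False for the real autocast, the detectron2 checkout is newer than the installed PyTorch expects, which matches the error above.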

Ludobico commented 1 year ago

When I remove the code below, I can train on my dataset, but every mAP value comes out as 0:

  line 421: dtype=self.precision

adityakankariya commented 1 year ago

There is no need to have a detectron2 directory locally. Maybe try running it without a local copy of detectron2 on your machine, as they specify in 'Getting Started'.

Unrelated: I was getting mAP as 0 as well until I increased cfg.SOLVER.MAX_ITER in the config file I was using inside Mask2Former. This was because I was using a very small custom dataset, so it could have something to do with the size of your dataset.
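
Since detectron2 measures training length in iterations rather than epochs, one rough way to pick cfg.SOLVER.MAX_ITER for a small dataset is to convert a target number of epochs into iterations. This is only a sketch of that arithmetic: the function name and the example numbers are made up for illustration, and batch_size is assumed to correspond to cfg.SOLVER.IMS_PER_BATCH.

```python
import math


def iterations_for_epochs(num_images, batch_size, epochs):
    """Number of optimizer steps needed to pass over the dataset `epochs` times."""
    if num_images <= 0 or batch_size <= 0 or epochs <= 0:
        raise ValueError("all arguments must be positive")
    steps_per_epoch = math.ceil(num_images / batch_size)
    return steps_per_epoch * epochs


# Hypothetical numbers: 200 training images, batch size 4, 50 epochs.
print(iterations_for_epochs(200, 4, 50))  # 2500
```

A MAX_ITER far below a figure like this would mean the model has barely seen the data, which is one plausible reason for an mAP of 0.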

aadityaks commented 1 year ago

Getting the same error!

Jordy-VL commented 1 year ago

I fixed it by setting this in my detectron2 config (YAML), under the SOLVER section:

  AMP:
    ENABLED: False
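
Disabling AMP avoids the error because the failing autocast(dtype=...) call sits in the mixed-precision training step; with AMP off, training runs in full precision and that call is never reached. The sketch below only models this switch. The class and function names are hypothetical stand-ins, not detectron2's actual classes.

```python
class SimpleTrainerSketch:
    """Stand-in for a full-precision training step (no autocast involved)."""

    def run_step(self):
        return "full-precision step"


class AMPTrainerSketch:
    """Stand-in for a mixed-precision step that calls autocast(dtype=...)."""

    def run_step(self):
        # Mimics the failure from the traceback when autocast lacks `dtype`.
        raise TypeError("__init__() got an unexpected keyword argument 'dtype'")


def build_trainer(amp_enabled):
    """Pick the trainer from the AMP flag, mirroring the config switch."""
    return AMPTrainerSketch() if amp_enabled else SimpleTrainerSketch()


print(build_trainer(amp_enabled=False).run_step())  # full-precision step
```

The trade-off is that full-precision training generally uses more GPU memory and runs slower than mixed precision.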

cinkyzhang commented 1 year ago

Hello, I got exactly the same error. Did you solve the problem?

Dawn-bin commented 11 months ago

Guys! Please check your versions of detectron2 and PyTorch. Update PyTorch to 1.10+.
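
A quick way to act on this advice is to compare the installed version against 1.10 numerically, since a plain string comparison would rank "1.9" above "1.10". This is a minimal sketch with hypothetical helper names; it takes the 1.10 threshold from the comment above as given, and in practice you would pass torch.__version__ to it.

```python
import re


def version_tuple(version):
    """Extract (major, minor) from strings such as '1.9.0+cu111'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def supports_autocast_dtype(version):
    """Assumes, per the advice above, that 1.10 is the minimum version."""
    return version_tuple(version) >= (1, 10)


print(supports_autocast_dtype("1.9.0+cu111"))  # False
print(supports_autocast_dtype("1.10.0"))       # True
print(supports_autocast_dtype("2.1.0+cu118"))  # True
```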

amuse-dh commented 9 months ago

I encountered the same error. Will updating PyTorch solve this issue? If anyone has resolved this error, please share how to fix it... Please...

suoniliu commented 1 month ago

> Guys! Please check your versions of detectron2 and PyTorch. Update PyTorch to 1.10+.

Hello, I encountered the same error and updated PyTorch to 1.10. However, I then ran into a new error:

  ImportError: /home/liusn/anaconda3/envs/mask2former/lib/python3.8/site-packages/MultiScaleDeformableAttention-1.0-py3.8-linux-x86_64.egg/MultiScaleDeformableAttention.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZNK2at6Tensor7optionsEv

Has anyone encountered the same error?

facebookresearch / Mask2Former

Train custom coco dataset error #173