MhLiao / MaskTextSpotterV3

The code of "Mask TextSpotter v3: Segmentation Proposal Network for Robust Scene Text Spotting"

Can you give us detailed instructions on how to use a Chinese dataset for training? #17

Open yyr6661 opened 4 years ago

yyr6661 commented 4 years ago

How do I remove/disable the character segmentation branch? How do I generate the GT? Thank you for your reply.

MhLiao commented 4 years ago

@yyr6661 Try setting CHAR_MASK_ON to False to disable the character segmentation branch. You also need to change the number of character classes and rewrite char2num and num2char. I think you will understand how to generate the GT from those two functions.
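For anyone stuck on this step: a minimal sketch of what a dictionary-based rewrite of char2num/num2char could look like for a Chinese character set. The character list, the index offset, and the unknown-character handling below are assumptions for illustration, not the repository's actual code; adapt them to the alphabet your labels actually use.

```python
# Hedged sketch: dictionary-based char2num/num2char for a Chinese
# character set. CHARS and the reserved index 0 are assumptions;
# replace CHARS with the full character set used in your annotations.
CHARS = list("中国文字识别")  # placeholder; use your full training charset

# Index 0 is often reserved for background/blank in recognition heads.
CHAR2NUM = {c: i + 1 for i, c in enumerate(CHARS)}
NUM2CHAR = {i + 1: c for i, c in enumerate(CHARS)}

def char2num(char):
    """Map a character to its class index; unknown characters map to -1."""
    return CHAR2NUM.get(char, -1)

def num2char(num):
    """Map a class index back to its character; unknown indices map to ''."""
    return NUM2CHAR.get(num, "")
```

With these two functions fixed, the GT for a word is just the sequence of char2num values for its characters, which is why the author says the GT format follows from them.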

ZhengWei0918 commented 4 years ago

@MhLiao If I train on Chinese and turn off the char mask, should I switch to a predictor without "Char" in roi_mask_predictors.py, e.g. SeqMaskRCNNC4Predictor? Is that because predictors with "Char" use the char mask?

yyr6661 commented 4 years ago

Thank you very much, I'll try.

MhLiao commented 4 years ago

@ZhengWei0918 Yes.
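For context, maskrcnn_benchmark-style code bases usually pick the predictor class by name from a registry driven by the config, so switching away from the "Char" predictor is mostly a matter of selecting a different name. A simplified sketch; the class names mirror this thread, but the registry and selection logic here are illustrative, not the repository's exact API:

```python
# Illustrative sketch of registry-based predictor selection; the class
# bodies are stubs and the selection helper is an assumption, not the
# actual code in roi_mask_predictors.py.
class SeqMaskRCNNC4Predictor:
    """Mask predictor without the character-segmentation branch."""

class SeqCharMaskRCNNC4Predictor:
    """Mask predictor that includes the char-mask branch."""

PREDICTORS = {
    "SeqMaskRCNNC4Predictor": SeqMaskRCNNC4Predictor,
    "SeqCharMaskRCNNC4Predictor": SeqCharMaskRCNNC4Predictor,
}

def make_mask_predictor(char_mask_on):
    """Pick the predictor without the char branch when CHAR_MASK_ON is False."""
    name = ("SeqCharMaskRCNNC4Predictor" if char_mask_on
            else "SeqMaskRCNNC4Predictor")
    return PREDICTORS[name]()
```

The point is that CHAR_MASK_ON and the predictor choice must agree: with the flag off, a "Char" predictor would still expect character-mask targets that no longer exist.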

yyr6661 commented 4 years ago

I changed the number of character classes and fixed the bug from issue #21. I also rewrote the char2num and num2char functions, changed the predictor to SeqMaskRCNNC4Predictor, and set CHAR_MASK_ON to False. I used the MLT data (Latin and Chinese images only), followed the IcdarDataset in maskrcnn_benchmark/utils/icdar.py, and changed self.char_classes to my character dictionary. The training code runs, but all the losses keep growing and quickly become NaN. I hope you can point out where the problem may be. I am very grateful for your guidance.

ZhengWei0918 commented 4 years ago

@yyr6661 Try decreasing lr.
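Losses that grow and then turn NaN usually mean the optimization diverged, which is why lowering the learning rate helps. A generic guard like the following (a sketch, not code from this repository) can catch the divergence early instead of letting training run for hours on NaN losses:

```python
import math

def losses_finite(loss_dict):
    """Return True only if every loss term is still a finite number."""
    return all(math.isfinite(float(v)) for v in loss_dict.values())

# Typical use inside a training loop: stop (or lower the lr and resume
# from the last checkpoint) as soon as any loss goes NaN/Inf.
step_losses = {"loss_classifier": 0.38, "loss_seq": 10.05}
if not losses_finite(step_losses):
    raise RuntimeError("Loss diverged; try a smaller learning rate.")
```

Checking this every iteration costs almost nothing and makes the "losses become NaN" failure mode visible at the exact step it happens.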

yyr6661 commented 4 years ago

It works! Thank you so much!

wangxupeng commented 4 years ago

After decreasing the lr, my losses look like this:


2020-09-25 03:07:38,155 maskrcnn_benchmark.trainer INFO: eta: 4 days, 4:23:55  iter: 1580  loss: 9.0928 (15.2893)  loss_classifier: 0.0150 (0.3822)  loss_box_reg: 0.0012 (0.0345)  loss_mask: 0.2965 (3.8256)  loss_seq: 7.7413 (10.0543)  loss_seg: 0.9924 (0.9927)  time: 1.0187 (1.2112)  data: 0.0053 (0.0107)  lr: 0.000020  max mem: 5349

2020-09-25 03:08:00,482 maskrcnn_benchmark.trainer INFO: eta: 4 days, 4:17:38  iter: 1600  loss: 9.4005 (15.2262)  loss_classifier: 0.0139 (0.3777)  loss_box_reg: 0.0013 (0.0341)  loss_mask: 0.3011 (3.7817)  loss_seq: 8.0522 (10.0400)  loss_seg: 0.9946 (0.9927)  time: 0.9544 (1.2100)  data: 0.0075 (0.0107)  lr: 0.000020  max mem: 5349

2020-09-25 03:08:23,848 maskrcnn_benchmark.trainer INFO: eta: 4 days, 4:14:40  iter: 1620  loss: 9.9412 (15.1579)  loss_classifier: 0.0252 (0.3734)  loss_box_reg: 0.0011 (0.0337)  loss_mask: 0.2898 (3.7387)  loss_seq: 8.6983 (10.0195)  loss_seg: 0.9926 (0.9927)  time: 1.0346 (1.2095)  data: 0.0068 (0.0107)  lr: 0.000020  max mem: 5349

2020-09-25 03:08:47,250 maskrcnn_benchmark.trainer INFO: eta: 4 days, 4:11:53  iter: 1640  loss: 9.8637 (15.0912)  loss_classifier: 0.0126 (0.3692)  loss_box_reg: 0.0007 (0.0333)  loss_mask: 0.2837 (3.6965)  loss_seq: 8.5379 (9.9996)  loss_seg: 0.9929 (0.9927)  time: 1.0549 (1.2090)  data: 0.0074 (0.0107)  lr: 0.000020  max mem: 5349

2020-09-25 03:09:14,061 maskrcnn_benchmark.trainer INFO: eta: 4 days, 4:19:21  iter: 1660  loss: 9.1736 (15.0271)  loss_classifier: 0.0129 (0.3650)  loss_box_reg: 0.0009 (0.0329)  loss_mask: 0.2472 (3.6550)  loss_seq: 7.9298 (9.9816)  loss_seg: 0.9945 (0.9927)  time: 1.3214 (1.2106)  data: 0.0058 (0.0107)  lr: 0.000020  max mem: 5349

2020-09-25 03:09:37,091 maskrcnn_benchmark.trainer INFO: eta: 4 days, 4:15:27  iter: 1680  loss: 9.2930 (14.9591)  loss_classifier: 0.0120 (0.3609)  loss_box_reg: 0.0007 (0.0325)  loss_mask: 0.2642 (3.6148)  loss_seq: 7.9794 (9.9582)  loss_seg: 0.9955 (0.9927)  time: 1.0234 (1.2099)  data: 0.0061 (0.0107)  lr: 0.000020  max mem: 5349

2020-09-25 03:09:59,721 maskrcnn_benchmark.trainer INFO: eta: 4 days, 4:10:28  iter: 1700  loss: 9.4316 (14.8986)  loss_classifier: 0.0112 (0.3568)  loss_box_reg: 0.0007 (0.0322)  loss_mask: 0.3087 (3.5758)  loss_seq: 8.1259 (9.9411)  loss_seg: 0.9952 (0.9927)  time: 0.9884 (1.2089)  data: 0.0070 (0.0107)  lr: 0.000020  max mem: 5349

2020-09-25 03:10:21,625 maskrcnn_benchmark.trainer INFO: eta: 4 days, 4:03:30  iter: 1720  loss: 9.2848 (14.8398)  loss_classifier: 0.0140 (0.3529)  loss_box_reg: 0.0006 (0.0318)  loss_mask: 0.2653 (3.5375)  loss_seq: 8.0247 (9.9249)  loss_seg: 0.9951 (0.9927)  time: 0.9809 (1.2076)  data: 0.0066 (0.0107)  lr: 0.000020  max mem: 5349

2020-09-25 03:10:43,952 maskrcnn_benchmark.trainer INFO: eta: 4 days, 3:57:53  iter: 1740  loss: 9.8646 (14.7855)  loss_classifier: 0.0222 (0.3492)  loss_box_reg: 0.0007 (0.0314)  loss_mask: 0.2871 (3.5003)  loss_seq: 8.5450 (9.9119)  loss_seg: 0.9924 (0.9927)  time: 0.9823 (1.2066)  data: 0.0060 (0.0107)  lr: 0.000020  max mem: 5349

2020-09-25 03:11:05,919 maskrcnn_benchmark.trainer INFO: eta: 4 days, 3:51:22  iter: 1760  loss: 9.3391 (14.7282)  loss_classifier: 0.0176 (0.3454)  loss_box_reg: 0.0012 (0.0311)  loss_mask: 0.2525 (3.4635)  loss_seq: 8.0760 (9.8955)  loss_seg: 0.9937 (0.9927)  time: 0.9769 (1.2053)  data: 0.0066 (0.0107)  lr: 0.000020  max mem: 5349

2020-09-25 03:11:30,296 maskrcnn_benchmark.trainer INFO: eta: 4 days, 3:51:43  iter: 1780  loss: 9.1820 (14.6703)  loss_classifier: 0.0151 (0.3419)  loss_box_reg: 0.0008 (0.0308)  loss_mask: 0.2438 (3.4275)  loss_seq: 7.9375 (9.8775)  loss_seg: 0.9915 (0.9926)  time: 1.0780 (1.2055)  data: 0.0066 (0.0107)  lr: 0.000020  max mem: 5349

2020-09-25 03:11:55,018 maskrcnn_benchmark.trainer INFO: eta: 4 days, 3:53:00  iter: 1800  loss: 9.1909 (14.6162)  loss_classifier: 0.0282 (0.3385)  loss_box_reg: 0.0006 (0.0304)  loss_mask: 0.2579 (3.3923)  loss_seq: 7.8874 (9.8623)  loss_seg: 0.9935 (0.9926)  time: 1.3731 (1.2058)  data: 0.0065 (0.0107)  lr: 0.000020  max mem: 5349

After 5,000 steps, we tried to test on an image. The model seems unable to detect the right areas, sadly. Maybe I need more iterations? Hoping for your reply, thank you very much. @MhLiao

yuanjiXiang commented 3 years ago

Hey, have you succeeded in training on a Chinese dataset? Could you share your code or method, please?

Ineedangoodidea commented 3 years ago

When I run python script.py in MaskTextSpotterV3/evaluation/totaltext/e2e, it outputs: zip error: Zip file structure invalid (./cache_files/0.05_0.5_over0.2.zip) Error! Error loading the ZIP archive. Please help me.
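That error often means the cached results archive was written incompletely or is empty, so deleting the cache_files directory and regenerating it is the usual first step. A quick way to check whether an archive is actually valid (generic standard-library Python, not the repository's evaluation script):

```python
import zipfile

def check_zip(path):
    """Return True if the archive opens and all members pass CRC checks."""
    try:
        with zipfile.ZipFile(path) as zf:
            # testzip() returns None when every member is intact,
            # or the name of the first corrupt member otherwise.
            return zf.testzip() is None
    except zipfile.BadZipFile:
        return False
```

If check_zip reports the cached zip as invalid, regenerating it (rather than re-running the scoring step on the broken file) should clear the error.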

wsygdtws commented 3 years ago

I do not understand why we should rewrite the char2num and num2char functions, or what kind of changes we should make to them. Please help me, thank you.

wangyl218 commented 3 years ago


@wangxupeng I am facing the same problem as you. Did you solve it and train on the Chinese dataset successfully? Hoping for your reply, thank you very much.

oszn commented 3 years ago

@wangyl218 Did you manage to solve this problem in the end, bro?

wangyl218 commented 3 years ago

My dataset consists entirely of curved Chinese text samples. loss_seq never came down; in testing, the text detection results are quite good, but the recognition results are poor.

josaphattirza commented 3 years ago

Would you mind sharing the steps for doing this? Thank you very much!

leehaining commented 2 years ago

Hi, I'm also training on Chinese datasets now, but I don't know how to rewrite the char2num and num2char functions or change the predictor to SeqMaskRCNNC4Predictor. Could you please tell me the detailed steps?

gtb1551050818 commented 2 years ago

Bro, could you share how you modified this model so that it can detect Chinese?