lvpengyuan / masktextspotter.caffe2

The code of "Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes"
Apache License 2.0
261 stars, 88 forks

How to test on ICDAR2015 training set #21

Open wjp0408 opened 5 years ago

wjp0408 commented 5 years ago

Hi, if I want to output detection results on the ICDAR2015 training set, what should I do? I changed "icdar2015_test" to "icdar2015_train" in the config YAML file, but I got this error: [image]

Why did this happen, and what should I do? Thanks :)

wjp0408 commented 5 years ago

I have solved this problem; it was caused by charboxes = []. Sorry to bother you.

DecentMakeover commented 5 years ago

@wjp0408 Hi, can you tell me how you solved this problem? I'm facing a similar issue.

Thanks in advance.

wjp0408 commented 5 years ago

Just modify the code at the end of the function load_gt_from_txt(self, gt_path, height, width) in lib/datasets/text_dataset.py like this:

        if len(boxes) > 0:
            if self.use_charann:
                # np.vstack fails on an empty list, so fall back to an empty (0, 10) array.
                if not len(charboxes):
                    charboxes = np.zeros((0, 10), dtype=np.float32)
                else:
                    charboxes = np.vstack(charboxes)
                return words, np.array(boxes), np.array(polygons), charboxes, np.array(seg_areas), segmentations
            else:
                # Character annotations are not used: return an empty (0, 10) placeholder.
                charbbs = np.zeros((0, 10), dtype=np.float32)
                return words, np.array(boxes), np.array(polygons), charbbs, np.array(seg_areas), segmentations
        else:
            # No word boxes at all: return empty arrays with the expected shapes.
            return [], np.zeros((0, 4), dtype=np.float32), np.zeros((0, 8), dtype=np.float32), np.zeros((0, 10), dtype=np.float32), np.zeros((0), dtype=np.float32), []
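
The core of this fix can be tried in isolation. The following is a minimal sketch, not the repository's code: the helper name stack_charboxes is hypothetical, and the only detail taken from the snippet above is the 10-column layout of a character box row. It shows why an empty charboxes list needs special handling: np.vstack raises a ValueError when given an empty list.

```python
import numpy as np


def stack_charboxes(charboxes, num_cols=10):
    """Stack per-character box rows into one 2-D array.

    Hypothetical helper for illustration. Falls back to an empty
    (0, num_cols) array when there are no character boxes, because
    np.vstack([]) raises ValueError.
    """
    if len(charboxes) == 0:
        return np.zeros((0, num_cols), dtype=np.float32)
    return np.vstack(charboxes).astype(np.float32)


# An image without character-level annotations yields an empty list.
empty = stack_charboxes([])

# Two made-up character rows, just to exercise the non-empty path.
two_rows = stack_charboxes([np.arange(10), np.arange(10, 20)])

print(empty.shape, two_rows.shape)
```

Returning an array with zero rows but the usual number of columns keeps downstream code that indexes columns working, although, as the replies below note, it does not by itself make training on data without character annotations meaningful.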

DecentMakeover commented 5 years ago

Hi @wjp0408, thank you so much for your time; this works. But since I want to train the network, this results in a NaN loss, and the training script then exits.

I'm guessing this is because the charboxes are empty. Is there a way around this?

Thanks again!
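
The thread does not confirm where the NaN comes from. One common mechanism, shown here purely as an assumption, is averaging a loss over zero elements. The sketch below is generic NumPy, not the repository's Caffe2 loss code, and the function name mean_loss is hypothetical.

```python
import warnings

import numpy as np


def mean_loss(per_item_losses):
    """Average per-item losses, returning 0.0 when there are no items.

    Hypothetical helper: it guards against the 0/0 that an unguarded
    mean over an empty array would produce.
    """
    losses = np.asarray(per_item_losses, dtype=np.float32)
    if losses.size == 0:
        return 0.0
    return float(losses.mean())


# The unguarded mean of an empty array is NaN (NumPy also emits a warning).
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    naive = float(np.mean(np.zeros(0, dtype=np.float32)))

guarded_empty = mean_loss([])
guarded_two = mean_loss([1.0, 3.0])
print(naive, guarded_empty, guarded_two)
```

Whether a guard like this is appropriate depends on the model: if a branch is meant to learn from character-level labels, silently contributing zero loss only hides the fact that those labels are missing.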

zhengjiawen commented 5 years ago

Hi @DecentMakeover, I also ran into the same problem of the loss being NaN. Have you found a way around it? Thanks!

zhoujianwen commented 5 years ago

@zhengjiawen I still cannot train the model; I keep getting errors. Have you solved it yet?

DecentMakeover commented 5 years ago

@zhengjiawen Ah, my best guess is that your data does not have character-level annotations. Can you check?

zhoujianwen commented 5 years ago

@DecentMakeover Have you ever run into errors like this when training the model?

[image]

zhengjiawen commented 5 years ago

> @zhengjiawen Ah, my best guess is that your data does not have character-level annotations. Can you check?

Thanks a lot, but I am training on the ICDAR2015 training set. The problem no longer occurs when I download the pre-trained model.

DecentMakeover commented 5 years ago

But ICDAR2015 does not have character-level annotations, right? Only ICDAR2013 does. How are you training it? How are the results?

zhengjiawen commented 5 years ago

> @zhengjiawen I still cannot train the model; I keep getting errors. Have you solved it yet?

You could try downloading the pre-trained model yourself instead of using the URL in the config file.

zhengjiawen commented 5 years ago

> But ICDAR2015 does not have character-level annotations, right? Only ICDAR2013 does. How are you training it? How are the results?

Yes, I only get the bounding-box coordinates and use CRNN for recognition. The results are not good. :)

DecentMakeover commented 5 years ago

Oh, okay. What do you think of textboxesplusplus?

zhengjiawen commented 5 years ago

Sorry, I haven't read the paper yet. But I think FOTS is efficient enough and performs better.

DecentMakeover commented 5 years ago

Do you know of a good implementation? I am currently looking at this one: https://github.com/Pay20Y/FOTS_TF

zhoujianwen commented 5 years ago

@zhengjiawen Thank you. Which pre-trained model do you use? Can you share it?

zhengjiawen commented 5 years ago

> Do you know of a good implementation? I am currently looking at this one: https://github.com/Pay20Y/FOTS_TF

I am also looking for a good open source implementation. :)

zhengjiawen commented 5 years ago

> @zhengjiawen Thank you. Which pre-trained model do you use?

You can download the pre-trained model directly: https://dl.fbaipublicfiles.com/detectron/ImageNetPretrained/MSRA/R-50.pkl
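
Downloading the file once and reusing the local copy can be sketched as follows. This is an illustrative helper, not part of the repository: the function name fetch_weights and the destination path are hypothetical, and only the URL comes from the comment above.

```python
import os
import urllib.request

# URL quoted in the comment above.
R50_URL = "https://dl.fbaipublicfiles.com/detectron/ImageNetPretrained/MSRA/R-50.pkl"


def fetch_weights(url, dest_path):
    """Download url to dest_path unless the file already exists.

    Writes to a temporary ".part" file first, so an interrupted
    download is never mistaken for a complete one.
    """
    if os.path.exists(dest_path):
        return dest_path
    dest_dir = os.path.dirname(dest_path)
    if dest_dir:
        os.makedirs(dest_dir, exist_ok=True)
    tmp_path = dest_path + ".part"
    urllib.request.urlretrieve(url, tmp_path)
    os.replace(tmp_path, dest_path)
    return dest_path
```

A call such as fetch_weights(R50_URL, "pretrained/R-50.pkl") would then give a local path to use in place of the URL in the config file. The thread does not name the config entry that holds the weights location, so that part is left to the reader.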

zhoujianwen commented 5 years ago

@zhengjiawen Thank you very much. I'll try.

zhoujianwen commented 5 years ago

@wjp0408 Hello, wjp0408! How did you solve the charboxes = [] problem so that training the model runs without errors?

zhoujianwen commented 5 years ago

> > @zhengjiawen Thank you. Which pre-trained model do you use?
>
> You can download the pre-trained model directly: https://dl.fbaipublicfiles.com/detectron/ImageNetPretrained/MSRA/R-50.pkl

[image]

I also get this error message. How do you solve it?

zhoujianwen commented 5 years ago

It's solved. Thanks.

sdzbft commented 5 years ago

> Do you know of a good implementation? I am currently looking at this one: https://github.com/Pay20Y/FOTS_TF

Hi, does this implementation work? Thanks in advance.

DecentMakeover commented 5 years ago

The results are not good.