JiaquanYe / TableMASTER-mmocr

2nd solution of ICDAR 2021 Competition on Scientific Literature Parsing, Task B.
Apache License 2.0
430 stars, 103 forks

Checkpoint file #5

Open victor-ab opened 3 years ago

victor-ab commented 3 years ago

Hi!

Thanks so much for strengthening the open-source community with this project!

Will you also share the config and the checkpoint file? Seems like it requires a LOT of computing power to train it.

delveintodetail commented 3 years ago

Thanks for your interest. We plan to release some pre-trained models in about two or three weeks, though we are not totally sure of this. We will notify you when we release the models.

For PSENet text line detection, training takes 4-8 hours on 8 V100 GPUs (16 GB). For the table structure model, it takes around 36-48 hours on 8 V100 GPUs (16 GB). For text line recognition, it takes around 36 hours on 8 V100 GPUs (16 GB).
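
For a rough sense of scale, the wall-clock figures above can be converted into GPU-hours. This is only a back-of-the-envelope sketch: the variable and function names are illustrative, and it assumes all 8 GPUs are busy for the whole run.

```python
# Wall-clock training times quoted in this thread, as (min_hours, max_hours).
NUM_GPUS = 8  # 8 V100 GPUs (16 GB), as stated above

wall_clock_hours = {
    "PSENet text line detection": (4, 8),
    "table structure model": (36, 48),
    "text line recognition": (36, 36),
}


def gpu_hours(hours_range, num_gpus=NUM_GPUS):
    """Convert a (min, max) wall-clock range into a (min, max) GPU-hour range."""
    low, high = hours_range
    return low * num_gpus, high * num_gpus


budget = {name: gpu_hours(r) for name, r in wall_clock_hours.items()}
total = (
    sum(low for low, _ in budget.values()),
    sum(high for _, high in budget.values()),
)

for name, (low, high) in budget.items():
    print(f"{name}: {low}-{high} V100 GPU-hours")
print(f"all three models: {total[0]}-{total[1]} V100 GPU-hours")
```

Multiplying the totals by an hourly V100 price gives an approximate cost of training everything from scratch.
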

victor-ab commented 3 years ago

Hi!

Yes, this computing power is beyond my budget, hahaha. Are you not sure about the release date, or not sure whether you are going to release it at all?

Anyway, thanks for the fast reply!

delveintodetail commented 3 years ago

Sure, we will release the pre-trained models. We are just not sure about the timing.

huyhoang17 commented 3 years ago

@delveintodetail Hi, thanks for the great repo! Do you still plan to release a pre-trained model? Thanks in advance.

delveintodetail commented 3 years ago

Yes, definitely. @JiaquanYe will update the repo with some pre-trained models soon. Sorry for the delay, which is due to limited GPU resources.

delveintodetail commented 3 years ago

@victor-ab @huyhoang17 We have released the pre-trained models for table structure reconstruction. Please check them out. We will release the text line recognition model soon.

victor-ab commented 3 years ago

@delveintodetail Can you please add specific details on the cfg in the table_inference.py? I find it difficult to identify where to use the pretrained pth file. Something like infer_single_image.py would help.

JiaquanYe commented 3 years ago

Here are the details of the keys of 'cfg' in table_inference.py:

cfg = {
    'pse_config': {str type, path of the text line detection task config},
    'master_config': {str type, path of the text line recognition task config},
    'structure_master_config': {str type, path of the table structure reconstruction task config},
    'pse_ckpt': {str type, checkpoint file path of the text line detection model},
    'master_ckpt': {str type, checkpoint file path of the text line recognition model},
    'structure_master_ckpt': {str type, checkpoint file path of the table structure reconstruction model},
    'end2end_result_folder': {str type, folder that stores the text line detection and recognition inference results},
    'structure_master_result_folder': {str type, folder that stores the table structure reconstruction inference results},
    'test_folder': {str type, test images folder},  # e.g. './smallVal10'
    'chunks_nums': chunk_nums
}

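
As a concrete illustration of the keys listed above, here is a minimal sketch of a filled-in cfg with a small sanity check. Every path is a hypothetical placeholder rather than a file shipped with the repo, the value and integer type of chunks_nums are assumptions, and check_cfg is an illustrative helper that is not part of table_inference.py.

```python
# Keys described above that are expected to hold string paths.
STR_KEYS = [
    "pse_config", "master_config", "structure_master_config",
    "pse_ckpt", "master_ckpt", "structure_master_ckpt",
    "end2end_result_folder", "structure_master_result_folder",
    "test_folder",
]

# All paths below are placeholders; replace them with your own files.
cfg = {
    "pse_config": "./path/to/psenet_config.py",
    "master_config": "./path/to/master_config.py",
    "structure_master_config": "./path/to/table_master_config.py",
    "pse_ckpt": "./path/to/pse.pth",
    "master_ckpt": "./path/to/master.pth",
    "structure_master_ckpt": "./path/to/table_master.pth",
    "end2end_result_folder": "./results/end2end",
    "structure_master_result_folder": "./results/structure",
    "test_folder": "./smallVal10",
    "chunks_nums": 1,  # assumption: an integer; the thread does not define it
}


def check_cfg(cfg):
    """Return a list of human-readable problems found in cfg (empty if none)."""
    problems = []
    for key in STR_KEYS:
        if key not in cfg:
            problems.append(f"missing key: {key}")
        elif not isinstance(cfg[key], str):
            problems.append(f"{key} should be a str path")
    if "chunks_nums" not in cfg:
        problems.append("missing key: chunks_nums")
    return problems


print(check_cfg(cfg) or "cfg looks complete")
```

The pretrained .pth files go into the three *_ckpt entries, each paired with the matching *_config entry.
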
dcdethan commented 3 years ago

Will the text line detection model also be released?

dcdethan commented 3 years ago

I found it in "TableMASTER-mmocr/configs/textdet/psenet/". Thanks!

victor-ab commented 2 years ago

@JiaquanYe

Where can I find the config file for the checkpoint available on Google Drive?

Is it this one: TableMASTER-mmocr/configs/textrecog/master/table_master_lmdb_ResnetExtract_Ranger_0930.py?

JiaquanYe commented 2 years ago

I trained this model with the config 'table_master_ResnetExtract_Ranger_0705.py'. The other config is fine too; just make sure its model settings are the same as in 'table_master_ResnetExtract_Ranger_0705.py'.

victor-ab commented 2 years ago

The end2end predict worked flawlessly.

I used table_master_ResnetExtract_Ranger_0705.py, but it seems that is not the right one:

>>> runner.init_structure_master();
Use load_from_local loader
The model and loaded state dict do not match exactly

size mismatch for backbone.conv2.weight: copying a param with shape torch.Size([64, 64, 3, 3]) from checkpoint, the shape in current model is torch.Size([128, 64, 3, 3]).
size mismatch for backbone.bn2.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
...
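
One way to narrow down a mismatch like this is to compare parameter shapes from the checkpoint and the freshly built model side by side. The sketch below works on plain name-to-shape dictionaries so that it runs without PyTorch; find_shape_mismatches is an illustrative helper, and the two example entries are copied from the log above. With PyTorch, such dictionaries could be built as {k: tuple(v.shape) for k, v in state_dict.items()}, on the assumption that the checkpoint's weights are reachable as a state dict.

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Map each shared parameter name to (checkpoint_shape, model_shape) when they differ."""
    mismatches = {}
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is None:
            continue  # parameter exists only in the checkpoint; not a shape mismatch
        if tuple(ckpt_shape) != tuple(model_shape):
            mismatches[name] = (tuple(ckpt_shape), tuple(model_shape))
    return mismatches


# Shapes taken from the size-mismatch messages in the log above.
checkpoint_shapes = {
    "backbone.conv2.weight": (64, 64, 3, 3),
    "backbone.bn2.weight": (64,),
}
model_shapes = {
    "backbone.conv2.weight": (128, 64, 3, 3),
    "backbone.bn2.weight": (128,),
}

mismatches = find_shape_mismatches(checkpoint_shapes, model_shapes)
for name, (ckpt, model) in mismatches.items():
    print(f"{name}: checkpoint {ckpt} vs model {model}")
```

Here the model built from the config has twice as many output channels in backbone.conv2 as the checkpoint, which points at the backbone settings in the config as the place to look.
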

Besides that, the structure_predict output doesn't seem right:

Single file in structure master prediction ...
{'text': '<eb9></eb9>,<eb9></eb9>,<eb9></eb9>, ... ,<SOS>,<eb9></eb9>,<SOS>, rowspan="8", rowspan="8", rowspan="8", rowspan="8", rowspan="8",<eb9></eb9>, ... ,<SOS>,<eb9></eb9>, ... ,</tr>, rowspan="8",<eb9></eb9>, ... ,<eb9></eb9>', 'score': 0.18746569614388986, 'bbox': array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       ...,
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])}

The bboxes are all zeros
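
A quick check for this kind of degenerate output is to measure how much of the predicted sequence is one repeated token and whether every box is zero. The sketch below is only illustrative: summarize_prediction is a hypothetical helper, and the sample input is a shortened stand-in for the 'text' and 'bbox' fields shown above, not the real output.

```python
from collections import Counter


def summarize_prediction(text, bboxes):
    """Summarize token repetition and zero boxes for one structure prediction."""
    tokens = [t.strip() for t in text.split(",") if t.strip()]
    counts = Counter(tokens)
    top_token, top_count = counts.most_common(1)[0] if counts else ("", 0)
    return {
        "num_tokens": len(tokens),
        "top_token": top_token,
        "top_token_share": top_count / len(tokens) if tokens else 0.0,
        # True only when there is at least one box and every coordinate is 0.
        "all_bboxes_zero": len(bboxes) > 0
        and all(all(v == 0 for v in box) for box in bboxes),
    }


# Shortened stand-in for the prediction printed above.
sample_text = '<eb9></eb9>,<eb9></eb9>,<SOS>, rowspan="8",<eb9></eb9>'
sample_bboxes = [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]

summary = summarize_prediction(sample_text, sample_bboxes)
print(summary)
```

A sequence dominated by a single token together with all-zero boxes is consistent with weights that were not loaded correctly, which fits the size-mismatch warnings earlier in this comment.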