DevashishPrasad / CascadeTabNet

This repository contains the code and implementation details of the CascadeTabNet paper "CascadeTabNet: An approach for end to end table detection and structure recognition from image-based documents"

MIT License


Not able to reproduce your demo result. #66

Closed: kkissmart closed this issue 3 years ago

kkissmart commented 3 years ago

I used cascade_mask_rcnn_hrnetv2p_w32_20e.py as the config and epoch_36.pth as the model checkpoint:

from mmdet.apis import inference_detector, init_detector, show_result_pyplot

# Choose to use a config and initialize the detector
config = 'configs/cascade_table/cascade_mask_rcnn_hrnetv2p_w32_20e.py'

# Setup a checkpoint file to load
checkpoint = '/home/model/epoch_36.pth'

# initialize the detector
model = init_detector(config, checkpoint, device='cuda:0')
img = "/home/code/CascadeTabNet/Demo/demo.png"
result = inference_detector(model, img)
show_result_pyplot(model, img, result, score_thr=0.85)

The result is a mess. It detects tons of tables, but none of them are correct.
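One way to put a number on "tons of tables" is to count the detections per class that pass the score threshold. The sketch below is only an illustration. It assumes the bounding-box part of the result is a list with one entry per class, each entry holding rows of `[x1, y1, x2, y2, score]`; that layout should be checked against the installed mmdetection version. The helper name `count_detections` and the sample boxes are hypothetical.

```python
def count_detections(bbox_result, score_thr=0.85):
    """Return, per class, how many boxes have a score >= score_thr.

    Assumption: bbox_result[i] holds rows of [x1, y1, x2, y2, score]
    for class i. This layout is not confirmed by the thread.
    """
    return [sum(1 for row in rows if row[4] >= score_thr) for rows in bbox_result]


# Hypothetical data: two classes with made-up boxes and scores.
sample = [
    [[10, 10, 200, 120, 0.97], [15, 300, 210, 420, 0.40]],
    [[12, 12, 198, 118, 0.91], [20, 20, 50, 50, 0.88]],
]
print(count_detections(sample))
```

For a mask model the value returned by `inference_detector` may be a pair of box results and mask results rather than the box list itself, in which case the box list would have to be extracted first. Treat that as an assumption to verify as well.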

DevashishPrasad commented 3 years ago

The dataset we used is small, and the pre-trained model is not production ready.

So there will be many scenarios where the model fails. If your table looks similar to the ones we trained on, there should not be any problem. If not, you need to fine-tune the model on your own data.

kkissmart commented 3 years ago

@DevashishPrasad Thanks for your fast response.

I am using your demo png, but I can't reproduce your result.

https://github.com/DevashishPrasad/CascadeTabNet/blob/e3b12122454d4321ff8eb908830528caba6cc48d/Config/cascade_mask_rcnn_hrnetv2p_w32_20e.py#L4

When I load this config, the mmdetection code reports that the 'num_stages' key is not recognized. Can you point me to where this variable is used in the Cascade R-CNN model in the mmdetection codebase?

(https://github.com/open-mmlab/mmdetection/blob/ae453fa92ffebcbd224b72f6d48e0b8699424450/mmdet/models/detectors/cascade_rcnn.py#L10)

So I removed num_stages=3. After that the model loads, but the result is awful. Am I missing anything?

Thanks a lot!
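An unrecognized config key usually means the class being built does not accept that argument. One possible reading, stated here as an assumption and not confirmed in this thread, is that the config was written for a different mmdetection version than the one installed. The sketch below shows a generic way to list such keys with `inspect.signature`. `ToyDetector` and `unexpected_keys` are hypothetical names and do not come from mmdetection.

```python
import inspect


def unexpected_keys(cls, cfg):
    """Return the config keys that cls.__init__ does not accept by name."""
    params = inspect.signature(cls.__init__).parameters
    # A **kwargs parameter means every key is accepted.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    # 'type' selects the class in registry-style configs, so it is skipped.
    return sorted(k for k in cfg if k != "type" and k not in params)


class ToyDetector:
    """Hypothetical stand-in for a detector class."""

    def __init__(self, backbone, neck=None, roi_head=None):
        self.backbone = backbone


cfg = {"type": "ToyDetector", "num_stages": 3, "backbone": {}, "neck": {}}
print(unexpected_keys(ToyDetector, cfg))
```

Deleting a rejected key lets the model build, but it does not show that the rest of the config matches what the installed code expects.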

kkissmart commented 3 years ago

This is what I ran and what I got. It is very far from what was claimed in the paper.

https://drive.google.com/file/d/1NvUFJUQGlsPqTitML-gw-Uqec6JQWbpE/view?usp=sharing

Please let me know if anything is missing.

DevashishPrasad commented 3 years ago

The Demo.png was made manually (using photo editing) just to illustrate the idea. It does not show actual results of the CascadeTabNet model.

You can find the actual results on more images in our CVPR paper.

But the results that you are getting on Demo.png are definitely poor, and CascadeTabNet should do far better. Most probably you are missing something.
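One thing that can be checked in a case like this is whether the checkpoint weights actually line up with the model that was built. The sketch below is a generic diagnostic, not something suggested by the maintainers. It compares two collections of parameter names and reports what is on one side only. The function name `diff_keys` and the sample names are hypothetical.

```python
def diff_keys(model_keys, checkpoint_keys):
    """Compare parameter names from a model and from a checkpoint."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    return {
        # Parameters the model has but the checkpoint cannot fill.
        "missing_in_checkpoint": sorted(model_keys - checkpoint_keys),
        # Weights in the checkpoint that the model has no slot for.
        "unused_from_checkpoint": sorted(checkpoint_keys - model_keys),
    }


# Hypothetical parameter names, for illustration only.
model_names = ["backbone.conv1.weight", "roi_head.fc.weight"]
ckpt_names = ["backbone.conv1.weight", "bbox_head.0.fc.weight"]
report = diff_keys(model_names, ckpt_names)
print(report)
```

With PyTorch, the two name lists would typically come from the keys of the model's `state_dict()` and from the weights stored in the loaded checkpoint file. If many names appear on only one side, large parts of the network may be running with untrained weights.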