DevashishPrasad / CascadeTabNet

This repository contains the code and implementation details of the CascadeTabNet paper "CascadeTabNet: An approach for end to end table detection and structure recognition from image-based documents"
MIT License

Testing with my scanned documents. #90

Open zokai opened 3 years ago

zokai commented 3 years ago

I would like to try CascadeTabNet with my own images. I used main.py and set the following paths:

image_path, xmlPath, config_fname

However, I don't know what to set for these variables:

checkpoint_path = "path to checkpoint directory"
epoch = 'epoch_file.name'

Does your GitHub repository come with a model pre-trained on your dataset, or do I have to train the model locally?

Thanks

cabudies commented 3 years ago

Hi Zokai,

I would suggest pointing checkpoint to "cascade_mask_rcnn_hrnetv2p_w32_20e.py" and epoch to "epoch_36.pth".

Kk-ships commented 3 years ago

I agree with cabudies' answer and would also recommend going through the [Model Zoo of CascadeTabNet](https://github.com/DevashishPrasad/CascadeTabNet#6-model-zoo). The model zoo lists the available checkpoint files along with the dataset each one was trained on.

epoch_36.pth

This model was trained on a highly specific dataset. I would recommend starting with a model trained on a more general dataset, such as

epoch_24.pth

and then trying different models to see which one suits your needs.
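To make the suggestions above concrete, here is a minimal sketch of how the two variables could fit together. It assumes the usual mmdetection convention that the `.py` file is the model config and the `.pth` file holds the trained weights, and that the weights file has already been downloaded from the Model Zoo. The helper names `resolve_checkpoint` and `load_model` are hypothetical and not part of the repository; `init_detector` is the mmdetection API for building a model from a config and a checkpoint.

```python
from pathlib import Path


def resolve_checkpoint(checkpoint_dir: str, epoch: str) -> Path:
    """Join the checkpoint directory and the epoch file name into one path."""
    path = Path(checkpoint_dir) / epoch
    # Weights downloaded from the Model Zoo are .pth files; the .py file is the config.
    if path.suffix != ".pth":
        raise ValueError(f"expected a .pth weights file, got: {path.name}")
    return path


def load_model(config_fname: str, checkpoint_dir: str, epoch: str, device: str = "cuda:0"):
    """Build the detector from a config file and a downloaded checkpoint."""
    checkpoint = resolve_checkpoint(checkpoint_dir, epoch)
    if not checkpoint.is_file():
        raise FileNotFoundError(f"checkpoint not found: {checkpoint}")
    # Imported lazily so the path helper above works without mmdet installed.
    from mmdet.apis import init_detector

    return init_detector(config_fname, str(checkpoint), device=device)


# Example call (all three paths are placeholders for your own locations):
# model = load_model("cascade_mask_rcnn_hrnetv2p_w32_20e.py", "checkpoints", "epoch_24.pth")
```

Switching to another Model Zoo checkpoint then only requires changing the `epoch` argument, which makes it easy to compare models on the same scanned pages.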

zokai commented 3 years ago

Thank you.

VincentJousse commented 3 years ago

Hi, did you succeed in detecting your table structure with the given checkpoint?

Kk-ships commented 3 years ago

Yes, I was able to extract the table structure using the same checkpoint.

VincentJousse commented 3 years ago

Was it a bordered or a borderless table? Did you use Colab? Did you tweak anything to get good results?

VincentJousse commented 3 years ago

Could you share your Python code?

Kk-ships commented 3 years ago

I used the Colab version. Here is the link to the Colab notebook. I think you will need to make some changes to use your own images. The code is pretty self-explanatory. https://colab.research.google.com/drive/16GzDZqfWCf3Kt6_EOk7FKZ7sHTYii-_w?usp=sharing