DevashishPrasad / CascadeTabNet

This repository contains the code and implementation details of the CascadeTabNet paper "CascadeTabNet: An approach for end to end table detection and structure recognition from image-based documents"
MIT License

Results are not reproducible on custom datasets - Issues related to mmdetection, CascadeTabNet training on custom dataset #164

Open inkarar opened 2 years ago

inkarar commented 2 years ago

We need better, more detailed step-by-step documentation on how to TRAIN CascadeTabNet on custom datasets in order to replicate the results reported for CascadeTabNet. There are many open issues about training this model with mmdetection, and the relative paths to configs and datasets in the code are a mess. The directory tree should have been documented, and file names and paths should have been clearly specified.

Here's what I did:

1. Annotated a custom dataset using labelme and converted the annotations to COCO JSON and VOC formats.
2. Trained mmdetection on the custom dataset following the blog post linked in the repo's README (I was unsure which architecture to use).
3. I'm unsure which config, epoch, and .pth files to use for inference/testing with the trained model.
4. I'm unsure how to combine the code from this repo with the model trained in mmdetection to get the desired results.

Here's what I'm asking:

1. Please include a README listing the exact steps to follow to replicate the results on custom datasets.
2. Please state which config/.pth file to use where, and how to use this repo: which files to execute, when, and where.
3. Please explain how to format the dataset: this repo needs COCO JSON, but mmdetection needs VOC.

The code is neat and documented, and the results achieved are commendable.

Please help me replicate the results on a custom dataset.

@DevashishPrasad @AyanGadpal @kshitijkapadni @ManishDV @francescoperessini @mhmd-azeez @akadirpamukcu @MrZilinXiao @mfproto @iiLaurens @NISH1001

NISH1001 commented 2 years ago

@inkarar The best way to approach training CascadeTabNet is to go through the mmdetection framework, treating each table as an object. Once you have your annotations in the correct training format, training and inference for table detection are pretty straightforward.
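To make the annotation format concrete, here is a minimal sketch of turning VOC-style corner boxes into a COCO-style JSON with a single `table` category. The function names, the sample file name, and the rectangular `segmentation` polygon are my own assumptions for illustration (they are not from this repo), and the sketch ignores the one-pixel offset some VOC tools use.

```python
import json


def voc_box_to_coco(xmin, ymin, xmax, ymax):
    """Convert VOC corner coordinates to a COCO [x, y, width, height] box."""
    return [xmin, ymin, xmax - xmin, ymax - ymin]


def build_coco(images, category_name="table"):
    """Build a COCO-style dict.

    `images` is a list of dicts with keys file_name, width, height and
    boxes, where each box is a (xmin, ymin, xmax, ymax) tuple.
    """
    coco = {
        "images": [],
        "annotations": [],
        "categories": [{"id": 1, "name": category_name}],
    }
    ann_id = 1
    for img_id, img in enumerate(images, start=1):
        coco["images"].append({
            "id": img_id,
            "file_name": img["file_name"],
            "width": img["width"],
            "height": img["height"],
        })
        for box in img["boxes"]:
            x, y, w, h = voc_box_to_coco(*box)
            coco["annotations"].append({
                "id": ann_id,
                "image_id": img_id,
                "category_id": 1,
                "bbox": [x, y, w, h],
                "area": w * h,
                "iscrowd": 0,
                # Assumption: a rectangle polygon as a stand-in mask,
                # in case a mask-based config expects this field.
                "segmentation": [[x, y, x + w, y, x + w, y + h, x, y + h]],
            })
            ann_id += 1
    return coco


# Hypothetical single-page example with one wide, short table.
example = build_coco([
    {"file_name": "page_001.png", "width": 1000, "height": 1400,
     "boxes": [(100, 200, 900, 260)]},
])
print(json.dumps(example["annotations"][0], indent=2))
```

Writing `example` out with `json.dump` gives one annotation file per split; check the field names against the dataset type your config actually uses before training.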

I haven't benchmarked it against the original paper (I didn't have time), but I did train it on my own custom dataset. It was reasonably good for detection (tables only, no cells). The vanilla configuration did struggle with tables of very small height (for instance, tables with a single row), so I had to change the anchor box scale in the config, and that fixed it. In fact, I used the exact same scale change to train on the header region, and it worked pretty well.
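To see why the anchor settings matter for very short tables, here is a small sketch that computes anchor sizes from a stride, scale, and ratio. It assumes the convention I understand mmdetection's anchor generator to follow (base size equal to the stride, ratio defined as height / width). The numbers are illustrative only; they are not the values I used.

```python
import math


def anchor_shape(stride, scale, ratio):
    """Return (width, height) in pixels of a single anchor.

    Assumes base size == stride and ratio == height / width.
    """
    h_ratio = math.sqrt(ratio)
    w_ratio = 1.0 / h_ratio
    width = stride * scale * w_ratio
    height = stride * scale * h_ratio
    return width, height


# Compare a default-looking setting with smaller scales and flatter ratios.
for scale, ratio in [(8, 0.5), (8, 0.25), (4, 0.125)]:
    w, h = anchor_shape(stride=4, scale=scale, ratio=ratio)
    print(f"scale={scale} ratio={ratio}: {w:.1f} x {h:.1f} px")
```

If the smallest anchor is much taller than a single-row table, few anchors will overlap it well enough to count as positives during training, which is consistent with the behaviour I saw.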

So, I recommend you try mmdet first for training/inference with its train detector. After that it's pretty straightforward.
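On the question of which .pth file to use: a minimal sketch, assuming your work directory contains checkpoints named `epoch_N.pth` (the naming I have seen mmdetection write by default). The helper name is hypothetical, and the latest epoch is not necessarily the best one, so validate before settling on it.

```python
import re
from pathlib import Path


def latest_epoch_checkpoint(work_dir):
    """Return the Path of the highest-numbered epoch_N.pth, or None."""
    pattern = re.compile(r"epoch_(\d+)\.pth$")
    best = None  # (epoch number, path) of the newest checkpoint seen so far
    for path in Path(work_dir).glob("epoch_*.pth"):
        match = pattern.match(path.name)
        if match is None:
            continue  # skip files such as epoch_backup.pth
        epoch = int(match.group(1))
        if best is None or epoch > best[0]:
            best = (epoch, path)
    return None if best is None else best[1]
```

Call it with your own work directory and pass the returned path, together with the same config file you trained with, to whatever inference entry point you use.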