DevashishPrasad / CascadeTabNet

This repository contains the code and implementation details of the CascadeTabNet paper "CascadeTabNet: An approach for end to end table detection and structure recognition from image-based documents"
MIT License
1.49k stars, 427 forks

Questions about the prediction of the model #48

Open StanleyGan opened 4 years ago

StanleyGan commented 4 years ago

The model's prediction is a list of 80 arrays. Which of them hold the cell bounding boxes and which hold the table bounding boxes? I am interested in extracting the vertices of the table bounding box.

kshitijkapadni commented 4 years ago

You can refer to main.py. I extract the model predictions in that file.

SdwHorizon commented 4 years ago

Loading the weights (checkpoint_file: epoch_36.pth, config_file: cascade_mask_rcnn_hrnetv2p_w32_20e.py) fails with the following error: size mismatch for rpn_head.rpn_cls.weight: copying a param with shape torch.Size([3, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([9, 256, 1, 1]). size mismatch for rpn_head.rpn_cls.bias: copying a param with shape torch.Size([3]) from checkpoint, the shape in current model is torch.Size([9]). Does anyone know why? Thanks!

stemgene commented 4 years ago

@StanleyGan I think each array with 5 numbers is the bounding box of a cell or a table, and the first 4 numbers are the coordinates of the box. You can either plot them on your test image or work out the location manually.
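
A minimal sketch of pulling box vertices out of such a prediction. It assumes each per-class array has shape (N, 5) with rows laid out as [x1, y1, x2, y2, score], and it uses a synthetic list of 80 arrays in place of real model output. The table class index (0), the 0.85 score threshold, and the helper names are assumptions for illustration only; main.py is the place to confirm which index holds tables and which holds cells. If the model returns a (boxes, masks) pair, the list of arrays would be the first element.

```python
import numpy as np


def boxes_above_threshold(class_boxes, score_thr=0.85):
    """Return integer [x1, y1, x2, y2] rows whose score exceeds score_thr.

    class_boxes is assumed to be an (N, 5) array of
    [x1, y1, x2, y2, score] rows (hypothetical helper, not from the repo).
    """
    class_boxes = np.asarray(class_boxes, dtype=float).reshape(-1, 5)
    keep = class_boxes[:, 4] > score_thr
    return class_boxes[keep, :4].astype(int)


def box_vertices(box):
    """Turn [x1, y1, x2, y2] into four corner points, clockwise from top-left."""
    x1, y1, x2, y2 = (int(v) for v in box)
    return [(x1, y1), (x2, y1), (x2, y2), (x1, y2)]


# Synthetic stand-in for the prediction: 80 per-class arrays, mostly empty.
bbox_results = [np.zeros((0, 5), dtype=np.float32) for _ in range(80)]
bbox_results[0] = np.array(
    [
        [10.2, 20.7, 300.9, 400.1, 0.97],  # confident detection
        [15.0, 25.0, 50.0, 60.0, 0.40],    # low score, filtered out
    ],
    dtype=np.float32,
)

TABLE_CLASS = 0  # assumption: check main.py for the actual class index

table_boxes = boxes_above_threshold(bbox_results[TABLE_CLASS])
table_vertices = [box_vertices(b) for b in table_boxes]
print(table_vertices)
```

Casting to int truncates the coordinates, which is usually fine for drawing the box on the test image; keep the float values if sub-pixel precision matters.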