namtuanly / MTL-TabNet

MTL-TabNet: Multi-task Learning based Model for Image-based Table Recognition
Apache License 2.0

AttributeError: 'AttnConvertor' object has no attribute 'num_classes_cell' #16

Open rudra0713 opened 6 months ago

rudra0713 commented 6 months ago

Hi,

I am trying to execute the demo/ocr_image_demo.py script, but I am facing this error:

/scratch/rrs99/MTL-TabNet/mtl_tabnet_venv_2/lib/python3.8/site-packages/mmdet/apis/inference.py:47: UserWarning: Class names are not saved in the checkpoint's meta data, use COCO classes by default.
  warnings.warn('Class names are not saved in the checkpoint\'s '
Traceback (most recent call last):
  File "/scratch/rrs99/MTL-TabNet/mtl_tabnet_venv_2/lib/python3.8/site-packages/mmcv/utils/registry.py", line 51, in build_from_cfg
    return obj_cls(**args)
  File "/scratch/rrs99/MTL-TabNet/mmocr/models/textrecog/recognizer/encode_decode_recognizer.py", line 52, in __init__
    decoder.update(num_classes_cell=self.label_convertor.num_classes_cell())
AttributeError: 'AttnConvertor' object has no attribute 'num_classes_cell'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "demo/ocr_image_demo.py", line 151, in <module>
    main()
  File "demo/ocr_image_demo.py", line 128, in main
    recog_model = init_detector(
  File "/scratch/rrs99/MTL-TabNet/mtl_tabnet_venv_2/lib/python3.8/site-packages/mmdet/apis/inference.py", line 39, in init_detector
    model = build_detector(config.model, test_cfg=config.get('test_cfg'))
  File "/scratch/rrs99/MTL-TabNet/mtl_tabnet_venv_2/lib/python3.8/site-packages/mmdet/models/builder.py", line 57, in build_detector
    return DETECTORS.build(
  File "/scratch/rrs99/MTL-TabNet/mtl_tabnet_venv_2/lib/python3.8/site-packages/mmcv/utils/registry.py", line 210, in build
    return self.build_func(*args, **kwargs, registry=self)
  File "/scratch/rrs99/MTL-TabNet/mtl_tabnet_venv_2/lib/python3.8/site-packages/mmcv/cnn/builder.py", line 26, in build_model_from_cfg
    return build_from_cfg(cfg, registry, default_args)
  File "/scratch/rrs99/MTL-TabNet/mtl_tabnet_venv_2/lib/python3.8/site-packages/mmcv/utils/registry.py", line 54, in build_from_cfg
    raise type(e)(f'{obj_cls.__name__}: {e}')
AttributeError: SARNet: 'AttnConvertor' object has no attribute 'num_classes_cell'

In my virtual environment, I have installed the following packages:

Python: 3.8.10
mmdet: 2.11.0
mmocr: 0.2.0
mmcv-full: 1.3.4
torch: 1.9.0

Can you kindly help me solve this? Thanks in advance.

namtuanly commented 6 months ago

Hi, thank you for your interest in our work. You ran a different script. To run the demo for recognizing a table image, please run the following script (you can change the input file and checkpoint file in demo.py): python ./table_recognition/demo/demo.py
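The traceback above can be read as a mismatch between the script and the model code: the recognizer's `__init__` in this repository calls `self.label_convertor.num_classes_cell()`, while the `AttnConvertor` built for the `SARNet` model that `demo/ocr_image_demo.py` loads has no such method. A minimal sketch of that failure mode follows, using stand-in classes. `PlainConvertor`, `MultiTaskConvertor`, `build_recognizer`, `missing_methods` and the vocabulary sizes are hypothetical illustrations, not the repository's code.

```python
class PlainConvertor:
    """Stand-in for a convertor that only exposes one vocabulary size."""

    def num_classes(self):
        return 40  # arbitrary illustrative value


class MultiTaskConvertor(PlainConvertor):
    """Stand-in for a convertor that also exposes a cell-content vocabulary size."""

    def num_classes_cell(self):
        return 100  # arbitrary illustrative value


def build_recognizer(convertor):
    """Mimic a constructor that asks the convertor for both vocabulary sizes."""
    decoder_cfg = {"num_classes": convertor.num_classes()}
    # Raises AttributeError when the convertor lacks num_classes_cell().
    decoder_cfg["num_classes_cell"] = convertor.num_classes_cell()
    return decoder_cfg


def missing_methods(convertor, required=("num_classes", "num_classes_cell")):
    """Return the names of required methods that the convertor does not provide."""
    return [name for name in required
            if not callable(getattr(convertor, name, None))]
```

Calling `missing_methods(convertor)` before building the model turns the nested registry traceback into a short list of what the chosen config's convertor lacks, which points at the config or script rather than the installation.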

rudra0713 commented 6 months ago

Hi @namtuanly, thank you so much for your response. I really appreciate it. However, I am facing some difficulties understanding the config change. According to the guideline, I need to make the following changes:

Modify the config and checkpoint path in ArgumentParser.

You can find config in folder /TableMASTER-mmocr/configs/ :

Textline detection (PSENet) config : psenet_r50_fpnf_600e_pubtabnet.py

Textline recognition (MASTER) config : master_lmdb_ResnetExtra_tableRec_dataset_dynamic_mmfp16.py

Table structure (TableMASTER) config : table_master_ResnetExtract_Ranger_0705.py

But in table_recognition/demo/demo.py, I can only set the following three arguments, and I have set them accordingly:

    parser.add_argument('--tablemaster_config', type=str,
                        default='/scratch/rrs99/MTL-TabNet/configs/textrecog/master/table_master_ResnetExtract_Ranger_0705.py',
                        help='tablemaster config file')
    parser.add_argument('--tablemaster_checkpoint', type=str,
                        default='/scratch/rrs99/MTL-TabNet/table_recognition/demo/PubTabNet/PubTabNet/epoch_19.pth',
                        help='tablemaster checkpoint file')
    parser.add_argument('--out_dir',
                        type=str, default='/scratch/rrs99/MTL-TabNet/table_recognition/demo/outputs/', help='Dir to save results')

I am unsure in which file(s) I should set the Textline detection path and the Textline recognition path. Also, when I ran the code with demo.py, I got an error:

FileNotFoundError: [Errno 2] No such file or directory: './tools/data/alphabet/structure_alphabet.txt'
Exception ignored in: <function _TemporaryFileCloser.__del__ at 0x2b3ba1f185e0>

Can you kindly guide me regarding how to resolve this?
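The `FileNotFoundError` above names a relative path (`./tools/data/alphabet/structure_alphabet.txt`), which Python resolves against the current working directory, so one likely cause is launching the script from a directory other than the repository root. The sketch below assumes the repository root is the directory that contains `tools/data/alphabet/structure_alphabet.txt`. `find_repo_root` and `run_from_repo_root` are hypothetical helpers, not part of MTL-TabNet.

```python
import os
from pathlib import Path


def find_repo_root(start, marker="tools/data/alphabet/structure_alphabet.txt"):
    """Walk upward from `start` until a directory containing `marker` is found."""
    start = Path(start).resolve()
    for candidate in [start, *start.parents]:
        if (candidate / marker).is_file():
            return candidate
    raise FileNotFoundError(f"{marker} not found in {start} or any parent directory")


def run_from_repo_root(start="."):
    """Change the working directory so that './tools/...' paths resolve."""
    root = find_repo_root(start)
    os.chdir(root)
    return root
```

Calling `run_from_repo_root()` at the top of a script, or simply running `cd` to the repository root before invoking `python ./table_recognition/demo/demo.py`, should make relative paths of this kind resolve, provided the alphabet file exists in the checkout.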

namtuanly commented 6 months ago

Hi @rudra0713 ,

Our model (MTL-TabNet) consists of three main components for the three sub-tasks of table recognition: cell detection, cell content recognition, and table structure recognition. So you don't need to add the config files for PSENet, MASTER, and TableMASTER. Just add the config of MTL-TabNet as follows:

parser.add_argument('--tablemaster_config', type=str,
                    default='./configs/textrecog/master/table_master_ResnetExtract_Ranger_0705_cell150_batch4.py',
                    help='tablemaster config file')

rudra0713 commented 6 months ago

Hi @namtuanly, thanks for your suggestion. With your suggested change, the code worked fine on the sample image. I also inspected the generated txt file with the corresponding HTML code, and it looked fine. However, when I tried it on one of my own images, the result surprised me. I am attaching my test image, the generated txt file, and the pred_bboxes image file. The pred_bboxes image shows many bounding boxes where there is no token, which I can understand because the bbox prediction model may not be 100% accurate. However, in the txt file, the captured tokens are very different from what is present in my test image. This makes me wonder whether there is an issue with the vocabulary file.

Attachments: page_107, page_107.txt, page_107_pred_bbox

namtuanly commented 6 months ago

Hi @rudra0713, our model works on cropped table images, not on whole document pages. For your image, you first need to detect the table and crop it before applying our model. You can use one of the table detection models listed at https://paperswithcode.com/task/table-detection to detect and extract the tables.
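The cropping step described above could look roughly like this. It is a minimal sketch that assumes a separate table detector has already returned a pixel bounding box `(left, top, right, bottom)` for the table, and that Pillow is installed. `padded_crop_box`, `crop_table` and the default padding of 10 pixels are hypothetical choices, not part of MTL-TabNet.

```python
def padded_crop_box(bbox, image_size, pad=10):
    """Expand (left, top, right, bottom) by `pad` pixels, clamped to the image."""
    left, top, right, bottom = bbox
    width, height = image_size
    box = (
        max(0, int(left) - pad),
        max(0, int(top) - pad),
        min(width, int(right) + pad),
        min(height, int(bottom) + pad),
    )
    if box[0] >= box[2] or box[1] >= box[3]:
        raise ValueError(f"empty crop box {box} for image size {image_size}")
    return box


def crop_table(page_path, bbox, out_path, pad=10):
    """Crop one detected table from a page image and save it (requires Pillow)."""
    from PIL import Image  # imported lazily so the box helper has no dependency

    with Image.open(page_path) as page:
        page.crop(padded_crop_box(bbox, page.size, pad)).save(out_path)
    return out_path
```

The saved crop, rather than the full page, would then be the input file set in demo.py. A small padding keeps the table border from being cut off when the detector's box is slightly tight.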