microsoft / table-transformer

Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS evaluation metric.
MIT License

Evaluation Script Freezes for Fine-tuned Table Structure Recognition Model #172

Open rusubbiz-muzkaq opened 3 months ago

rusubbiz-muzkaq commented 3 months ago

I have fine-tuned the Table Structure Recognition model for 20 epochs. Inference produces results, but the evaluation script never outputs any metrics and hangs indefinitely.

Command used to run eval script:

python main.py --mode eval \
               --data_type structure \
               --config_file structure_config.json \
               --data_root_dir /table_transformers_str/test_data \
               --model_load_path /str_model/model_20.pth \
               --table_words_dir /table_transformers_str/test_data/table_words_dir \
               --device gpu \
               --batch_size 16 \
               --debug \
               --debug_save_dir /table_transformers_str/eval_result \
               --test_max_size 8 \
               --metrics_save_filepath /table_transformers_str/eval_result

The "test_data" folder has two subdirectories, "images" and "test". The "test" subdirectory stores the XML annotation files that correspond to the images.
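Since the layout above pairs each XML file in "test" with an image in "images", a quick consistency check can rule out missing files before running evaluation. This is a minimal sketch, not part of the repository: the function name `find_unmatched_annotations` and the list of image extensions are assumptions, and it only compares file stems.

```python
from pathlib import Path

# Assumption: images use one of these extensions; adjust to your dataset.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def find_unmatched_annotations(data_root, split="test"):
    """Compare file stems in <data_root>/images and <data_root>/<split>.

    Returns two sorted lists:
      - XML stems that have no matching image
      - image stems that have no matching XML
    """
    root = Path(data_root)
    image_stems = {
        p.stem
        for p in (root / "images").iterdir()
        if p.suffix.lower() in IMAGE_EXTENSIONS
    }
    xml_stems = {p.stem for p in (root / split).glob("*.xml")}
    xml_without_image = sorted(xml_stems - image_stems)
    image_without_xml = sorted(image_stems - xml_stems)
    return xml_without_image, image_without_xml
```

Calling `find_unmatched_annotations("/table_transformers_str/test_data")` would return the two lists for the folder used in the command above. Empty lists mean every annotation has an image with the same stem.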

Command result:

(transformers) user@RUSUBBIZ:/table-transformer/src$ python main.py --mode eval \
               --data_type structure \
               --config_file structure_config.json \
               --data_root_dir /table_transformers_str/test_data \
               --model_load_path /str_model/model_20.pth \
               --table_words_dir /table_transformers_str/test_data/table_words_dir \
               --device gpu \
               --batch_size 16 \
               --debug \
               --debug_save_dir /table_transformers_str/eval_result \
               --test_max_size 8 \
               --metrics_save_filepath /table_transformers_str/eval_result

{'lr': 5e-05, 'lr_backbone': 1e-05, 'batch_size': 8, 'weight_decay': 0.0001, 'epochs': 1, 'lr_drop': 1, 'lr_gamma': 0.9, 'clip_max_norm': 0.1, 'backbone': 'resnet18', 'num_classes': 6, 'dilation': False, 'position_embedding': 'sine', 'emphasized_weights': {}, 'enc_layers': 6, 'dec_layers': 6, 'dim_feedforward': 2048, 'hidden_dim': 256, 'dropout': 0.1, 'nheads': 8, 'num_queries': 125, 'pre_norm': True, 'masks': False, 'aux_loss': False, 'mask_loss_coef': 1, 'dice_loss_coef': 1, 'ce_loss_coef': 1, 'bbox_loss_coef': 5, 'giou_loss_coef': 2, 'eos_coef': 0.4, 'set_cost_class': 1, 'set_cost_bbox': 5, 'set_cost_giou': 2, 'device': 'cuda', 'seed': 42, 'start_epoch': 0, 'num_workers': 1, 'data_root_dir': '/table_transformers_str/test_data', 'config_file': 'structure_config.json', 'data_type': 'structure', 'model_load_path': '/str_model/model_20.pth', 'load_weights_only': False, 'model_save_dir': None, 'metrics_save_filepath': '/table_transformers_str/eval_result', 'debug_save_dir': '/table_transformers_str/eval_result', 'table_words_dir': '/table_transformers_str/test_data/table_words_dir', 'mode': 'eval', 'debug': True, 'checkpoint_freq': 1, 'train_max_size': None, 'val_max_size': None, 'test_max_size': 8, 'eval_pool_size': 1, 'eval_step': 1, '__module__': '__main__', '__dict__': <attribute '__dict__' of 'Args' objects>, '__weakref__': <attribute '__weakref__' of 'Args' objects>, '__doc__': None}
----------------------------------------------------------------------------------------------------
Running evaluation/inference in DEBUG mode, processing will take longer. Saving output to: /table_transformers_str/eval_result.
loading model
/data/opt/miniconda/envs/transformers/lib/python3.9/site-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
  warnings.warn(
/data/opt/miniconda/envs/transformers/lib/python3.9/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=ResNet18_Weights.IMAGENET1K_V1`. You can also use `weights=ResNet18_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)
loading model from checkpoint
loading data
creating index...
index created!
/data/opt/miniconda/envs/transformers/lib/python3.9/site-packages/torch/nn/modules/conv.py:456: UserWarning: Applied workaround for CuDNN issue, install nvrtc.so (Triggered internally at ../aten/src/ATen/native/cudnn/Conv_v8.cpp:80.)
  return F.conv2d(input, weight, bias, self.stride,

***** After this, the script freezes here indefinitely.
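One way to see where a frozen Python process is stuck is the standard-library `faulthandler` module, which can print the traceback of every thread after a timeout. The sketch below is an assumption about how one might instrument the script, not something the repository provides: the helper names and the 300-second default are hypothetical, and placing the call near the top of `main.py` is only a suggestion.

```python
import faulthandler
import sys


def enable_hang_diagnostics(timeout_seconds=300, stream=None):
    """Dump all thread tracebacks every `timeout_seconds` while the process runs.

    `stream` must be a file object with a real file descriptor
    (defaults to sys.stderr).
    """
    if stream is None:
        stream = sys.stderr
    # Also print tracebacks on fatal signals such as SIGSEGV.
    faulthandler.enable(file=stream)
    # Watchdog thread: repeats the dump until cancelled.
    faulthandler.dump_traceback_later(timeout_seconds, repeat=True, file=stream)


def disable_hang_diagnostics():
    """Stop the periodic traceback dumps."""
    faulthandler.cancel_dump_traceback_later()
```

If the dumps keep showing the same stack frame, that frame is where the evaluation is blocked. That helps tell apart a slow step from a real deadlock.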

I expected the eval script to report the metrics shown below once it ran, but it produces none.

[Image: Screenshot 2024-03-14 at 9 54 36 PM]

Only these visualizations are saved in the folder "/table_transformers_str/eval_result": 000101000909008U1UB2023__0_bboxes and 000101000909008U1UB2023__0_cells.

If anyone has faced a similar issue during evaluation, please share how to get evaluation metrics for a fine-tuned Table Structure Recognition model. Thanks in advance.

ali4friends71 commented 2 months ago

Hi @rusubbiz-muzkaq, I also want to fine-tune the table structure recognition model on my own dataset. Can you please tell me how I should do it? And can you please share the code for it?

I'm fine-tuning the model but not getting proper results after fine-tuning.

Thanks in advance.