Question on fine-tuning TATR with a proprietary dataset

Hi!

I am trying to fine-tune the TATR model on a proprietary dataset. I am converting the dataset to the same format as FinTabNet and then using the script in this repository (scripts/process_fintabnet.py) to transform it into the Pascal VOC format required by TATR.
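For context, a minimal sketch of what one Pascal VOC annotation looks like when written with the standard library. The helper name `make_voc_annotation`, the example file name, and the class label are illustrative assumptions; the exact fields and labels that scripts/process_fintabnet.py emits may differ.

```python
import xml.etree.ElementTree as ET


def make_voc_annotation(filename, width, height, objects):
    """Build a Pascal VOC style annotation as an XML string.

    `objects` is an iterable of (label, (xmin, ymin, xmax, ymax)) tuples.
    This is a hypothetical helper, not part of the table-transformer repo.
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename

    # Image size block: width, height, and number of channels.
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    ET.SubElement(size, "depth").text = "3"

    # One <object> element per labelled bounding box.
    for label, coords in objects:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = label
        box = ET.SubElement(obj, "bndbox")
        for tag, value in zip(("xmin", "ymin", "xmax", "ymax"), coords):
            ET.SubElement(box, tag).text = str(value)

    return ET.tostring(root, encoding="unicode")


# Example: one table image with a single labelled row (label is an assumption).
xml_text = make_voc_annotation(
    "example_table_0.jpg", 800, 400, [("table row", (10, 20, 790, 60))]
)
```

Comparing a file produced this way against one produced from FinTabNet itself is a quick check that the converted proprietary data has the same structure.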

I then train it using the main.py file in this repo, with just one change: I load the TATR table detection model (microsoft/table-transformer-detection, revision="no_timm") and the TATR table structure recognition model (microsoft/table-transformer-structure-recognition-v1.1-all) from Hugging Face, instead of the DETR model that is built in this repository.
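A minimal sketch of the model swap described above, assuming the `TableTransformerForObjectDetection` class from transformers. The helper names `load_hf_tatr` and `to_detr_dict` are hypothetical. The adapter reflects one known difference that may or may not be related to the error: the Hugging Face model returns an output object with `logits` and `pred_boxes` attributes, whereas DETR-style training code typically reads a dict with `pred_logits` and `pred_boxes` keys.

```python
DETECTION_CHECKPOINT = "microsoft/table-transformer-detection"
STRUCTURE_CHECKPOINT = "microsoft/table-transformer-structure-recognition-v1.1-all"


def load_hf_tatr(checkpoint, revision=None):
    """Load a TATR checkpoint from the Hugging Face Hub (hypothetical helper)."""
    # Imported lazily so the adapter below can be used without transformers installed.
    from transformers import TableTransformerForObjectDetection

    kwargs = {}
    if revision is not None:
        kwargs["revision"] = revision
    return TableTransformerForObjectDetection.from_pretrained(checkpoint, **kwargs)


def to_detr_dict(hf_outputs):
    """Map Hugging Face output attributes to DETR-style dict keys.

    Assumption: the training loop and loss expect 'pred_logits' and 'pred_boxes'.
    """
    return {
        "pred_logits": hf_outputs.logits,
        "pred_boxes": hf_outputs.pred_boxes,
    }
```

Usage would look like `model = load_hf_tatr(DETECTION_CHECKPOINT, revision="no_timm")` or `model = load_hf_tatr(STRUCTURE_CHECKPOINT)`, with `to_detr_dict(model(pixel_values=...))` passed on to the loss. Other differences, such as the input format the model expects, are not covered by this sketch.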

When I train with the DETR model that is built in this repository, training runs without any issue.

The command I run to train is: python3 main.py --data_root_dir <data directory> --config_file structure_config.json

I am getting this error: [screenshot: tatr training error]

Details on the runtime environment:

CUDA version: 12.2
NVIDIA driver version: 535.154.05
torch version: 1.13.1
torchaudio version: 0.13.1
torchvision version: 0.14.1
transformers version: 4.38.0.dev0
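
For anyone trying to reproduce this, a small sketch that collects the Python package versions listed above using only the standard library. The function name `collect_versions` is hypothetical; the CUDA and driver versions are not covered here and would come from tools such as nvidia-smi.

```python
from importlib import metadata


def collect_versions(packages=("torch", "torchaudio", "torchvision", "transformers")):
    """Return a dict mapping each package name to its installed version string."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report missing packages explicitly instead of raising.
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    for name, version in collect_versions().items():
        print(f"{name}: {version}")
```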

It would be great if anybody could help me with this!
Thanks, Srivatsan.
