microsoft / table-transformer

Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS evaluation metric.
MIT License
2.02k stars, 233 forks

Stuck trying to run main.py. All help gratefully accepted... #125

Open isarker opened 1 year ago

isarker commented 1 year ago

Hello all:

I have been trying to use the Microsoft Table Transformer (as it exists on GitHub) to detect and extract tables and cells from TIFF files, along with the text inside the cells. For that, we have used a machine with the following configuration:

  1. HP Laptop running WSL2
  2. Docker

Using Docker (via a Dockerfile, etc.), we have created a container in which we have set up the Table Transformer. When we run the inference.py program, cells, spanning cells, and tables are detected, but many of them are erroneous. I am sharing an input file and the detected cells here.

Aluminium armoured cable - AMNS SMP

Then we explored training the model on the types of tables we expect to encounter. For that, we ran main.py with the following command:

python main.py --data_type detection --config_file detection_config.json --data_root_dir "/app/Programmer_Content/Input Image Files" --device cpu

We are encountering the following error.


{'lr': 5e-05, 'lr_backbone': 1e-05, 'batch_size': 2, 'weight_decay': 0.0001, 'epochs': 20, 'lr_drop': 1, 'lr_gamma': 0.9, 'clip_max_norm': 0.1, 'backbone': 'resnet18', 'num_classes': 2, 'dilation': False, 'position_embedding': 'sine', 'emphasized_weights': {}, 'enc_layers': 6, 'dec_layers': 6, 'dim_feedforward': 2048, 'hidden_dim': 256, 'dropout': 0.1, 'nheads': 8, 'num_queries': 15, 'pre_norm': True, 'masks': False, 'aux_loss': False, 'mask_loss_coef': 1, 'dice_loss_coef': 1, 'ce_loss_coef': 1, 'bbox_loss_coef': 5, 'giou_loss_coef': 2, 'eos_coef': 0.4, 'set_cost_class': 1, 'set_cost_bbox': 5, 'set_cost_giou': 2, 'device': 'cpu', 'seed': 42, 'start_epoch': 0, 'num_workers': 1, 'data_root_dir': '/app/Programmer_Content/Input Image Files', 'config_file': 'detection_config.json', 'data_type': 'detection', 'model_load_path': None, 'load_weights_only': False, 'model_save_dir': None, 'metrics_save_filepath': '', 'debug_save_dir': 'debug', 'table_words_dir': None, 'mode': 'train', 'debug': False, 'checkpoint_freq': 1, 'train_max_size': None, 'val_max_size': None, 'test_max_size': None, 'eval_pool_size': 1, 'eval_step': 1, '__module__': '__main__', '__dict__': <attribute '__dict__' of 'Args' objects>, '__weakref__': <attribute '__weakref__' of 'Args' objects>, '__doc__': None}
----------------------------------------------------------------------------------------------------
loading model
/root/miniconda3/envs/tables-detr/lib/python3.10/site-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
  warnings.warn(
/root/miniconda3/envs/tables-detr/lib/python3.10/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=ResNet18_Weights.IMAGENET1K_V1`. You can also use `weights=ResNet18_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)
loading data
loading data
creating index...
index created!
Traceback (most recent call last):
  File "/app/Programmer_Content/table-transformer/src/main.py", line 375, in <module>
    main()
  File "/app/Programmer_Content/table-transformer/src/main.py", line 368, in main
    train(args, model, criterion, postprocessors, device)
  File "/app/Programmer_Content/table-transformer/src/main.py", line 214, in train
    data_loader_train, data_loader_val, dataset_val, train_len = get_data(args)
  File "/app/Programmer_Content/table-transformer/src/main.py", line 130, in get_data
    sampler_train = torch.utils.data.RandomSampler(dataset_train)
  File "/root/miniconda3/envs/tables-detr/lib/python3.10/site-packages/torch/utils/data/sampler.py", line 107, in __init__
    raise ValueError("num_samples should be a positive integer "
ValueError: num_samples should be a positive integer value, but got num_samples=0

We have run out of ideas for how to get the main.py program to work. The following is our directory structure:

/app/Programmer_Content/table-transformer/src (This is where we run the main.py program)
/app/Programmer_Content/Input Image Files/images (Has 10 JPG files)
/app/Programmer_Content/Input Image Files/train (Directory is empty)
/app/Programmer_Content/Input Image Files/val (Directory is empty)
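As a quick sanity check, the number of annotation files main.py will find in each split can be counted with a few lines of Python (the helper name is just illustrative; the path is the data root used above):

```python
import glob
import os

def count_annotations(data_root):
    """Count the XML annotation files in each detection split folder."""
    return {split: len(glob.glob(os.path.join(data_root, split, "*.xml")))
            for split in ("train", "val")}

# With empty train/ and val/ folders this returns {'train': 0, 'val': 0},
# which is exactly the condition that leads to num_samples=0 above.
print(count_annotations("/app/Programmer_Content/Input Image Files"))
```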

If someone could guide us on how either to get inference.py to recognize tables more accurately or to get main.py to train using the directory structure mentioned above, we would be grateful.

Also, as a suggestion: if the documentation were simpler and richer, it would be helpful to a vastly larger number of people. The current documentation (for example, steps to resolve the errors one encounters) is possibly inadequate for people like me (four weeks ago, I was a complete beginner at Python programming, AI, GitHub, etc. :) )

bsmock commented 11 months ago

Hi,

First, let me say I understand the request for better documentation for a broader audience. This repository is intended mostly for other ML researchers, to allow them to reproduce our research. We rely on our research papers as a primary source of documentation and assume our users will have read them.

The error you're seeing in main.py occurs because the train folder is empty. That folder needs to contain annotation files in XML format.
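For illustration, a minimal annotation file could be generated as below. This is a sketch assuming the PASCAL VOC-style XML layout used by the PubTables-1M detection data; the file names and box coordinates are made up:

```python
import xml.etree.ElementTree as ET

def write_voc_annotation(path, image_name, width, height, boxes):
    """Write a minimal PASCAL VOC-style annotation file.

    `boxes` is a list of (label, xmin, ymin, xmax, ymax) tuples,
    e.g. [("table", 210, 400, 2340, 1580)].
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = image_name
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    ET.SubElement(size, "depth").text = "3"
    for label, xmin, ymin, xmax, ymax in boxes:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = label
        bndbox = ET.SubElement(obj, "bndbox")
        for tag, value in zip(("xmin", "ymin", "xmax", "ymax"),
                              (xmin, ymin, xmax, ymax)):
            ET.SubElement(bndbox, tag).text = str(value)
    ET.ElementTree(root).write(path)

# One hypothetical "table" box on a 2550 x 3300 page image:
write_voc_annotation("page_0001.xml", "page_0001.jpg", 2550, 3300,
                     [("table", 210, 400, 2340, 1580)])
```

Each image in images/ would then get a matching .xml file placed in train/ or val/.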

The performance you're seeing at inference is due to two things.

  1. The structure recognition model is intended to operate on cropped table images. If you include the surrounding page context, it can confuse the model, because it wasn't trained on images like that. This is clear from our papers, but I can definitely understand the confusion if you're new to this problem.
  2. The model trained only on PubTables-1M works well on scientific tables. The top half of your example looks like a standard table, with a header the model might recognize. But the model hasn't seen tables with the top half and bottom half together, as in your example. It can work on tables like yours, but it would probably need a good deal of training on additional data. It helps to be familiar with the PubTables-1M dataset to understand the kinds of tables the current model will be able to handle.
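The cropping step mentioned in point 1 can be sketched as follows. This assumes the detector's boxes come as (xmin, ymin, xmax, ymax) pixel coordinates and represents the page as a NumPy array; the padding value is arbitrary:

```python
import numpy as np

def crop_table(page, bbox, padding=10):
    """Crop a detected table region out of a full page image.

    `page` is an H x W x C pixel array; `bbox` is (xmin, ymin, xmax, ymax)
    in pixels. A small margin is kept so cell borders aren't clipped.
    """
    h, w = page.shape[:2]
    xmin, ymin, xmax, ymax = bbox
    left, top = max(0, xmin - padding), max(0, ymin - padding)
    right, bottom = min(w, xmax + padding), min(h, ymax + padding)
    return page[top:bottom, left:right]
```

The cropped region (not the whole page) is what should be passed to the structure recognition model.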

Hope this is helpful.

Best, Brandon