matchalambada opened 2 years ago
Is there any update on that?
How can we extract data in rows/column format from the table image from the trained model?
In the current version of the code, you can find the function that takes the model output and processes it into a table representation here: https://github.com/microsoft/table-transformer/blob/3e1dd0c3cad7956c790765b491ec86817e94ce43/src/grits.py#L727
@bsmock
Hello, I want to know: if I use the function objects_to_cells, how can I get the page_tokens for a new input image?
Right now the code is written to be used with the PubTables-1M dataset or any dataset in the same format. For each table image in PubTables-1M, there is also a JSON file with a list of words in the image, which is read in as page_tokens. So the input image and the list of words (page_tokens) are what you need for inference.

You can have a look at the dataset to see examples of the format for page_tokens. Basically, page_tokens needs to be a list of dicts, where each dict corresponds to a word or token and looks like this:
{"text": "Table", "bbox": [xmin, ymin, xmax, ymax], "flags": 0, "block_num": 0, "line_num": 0, "span_num": 0}
At a minimum you'll need to fill in the "text", "bbox", and "span_num" fields, where "span_num" is an integer that puts the words in some order. When the code returns the text for each cell as a string, the words in the text string will be sorted by "block_num", then "line_num", then "span_num". So you can leave "flags", "block_num", and "line_num" as 0, as long as you put a unique integer for each word in "span_num".
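The steps above can be sketched in a few lines. This is a minimal, hypothetical helper (not part of the repo); it assumes you already have words and pixel-coordinate boxes from some OCR source, listed in reading order:

```python
def build_page_tokens(ocr_words):
    """Build a page_tokens list from (text, [xmin, ymin, xmax, ymax]) pairs.

    ocr_words is assumed to be in reading order; boxes are in image pixels.
    """
    page_tokens = []
    for span_num, (text, bbox) in enumerate(ocr_words):
        page_tokens.append({
            "text": text,
            "bbox": bbox,
            "flags": 0,      # unused for a minimal setup, left as 0
            "block_num": 0,  # unused for a minimal setup, left as 0
            "line_num": 0,   # unused for a minimal setup, left as 0
            "span_num": span_num,  # unique integer encoding reading order
        })
    return page_tokens

# Hypothetical two-word example:
tokens = build_page_tokens([("Table", [10, 5, 52, 18]), ("1", [56, 5, 63, 18])])
```

Any OCR engine works as the source of `ocr_words`, as long as its boxes are converted to the `[xmin, ymin, xmax, ymax]` pixel format described above.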
@bsmock , Can you please add at least one example image with all the required data structures to make a working inference example? It would help to understand the format without downloading 110Gb of data. Thank you!
You can find some samples here: https://drive.google.com/drive/folders/0B5h08T2mGP3ffnZLbTZ0WVNRT3Zjdjl2eC11aW0tOFVCaU5Mb2c2Q0dmc21lNWo1Y3BuT3c?resourcekey=0-bphHgPyZKg0yT5V8F7BWjw&usp=sharing
Thank you, @suonbo, but in that location I can only see .jpg images (and they are cropped tables, not whole pages). I am looking for an example with the data required by the inference command:
python main.py --mode eval --data_type structure --config_file structure_config.json --data_root_dir /path/to/pascal_voc_structure_data --model_load_path /path/to/structure_model --table_words_dir /path/to/json_table_words_data
Specifically, I need the config file (not in the repo!), the Pascal VOC structure data, the table_words_dir (what goes there?), the JSON table words data ...
To anyone interested: I uploaded an example of the table structure recognition files here. It contains the annotation (Pascal VOC), the words (JSON), and the table image (.jpg).
Has anyone figured out how to run table detection alone?
NielsRogge made a notebook with examples
Can you share a tutorial where the table is converted to CSV or HTML?
Hello, thank you for providing a simple case study. I encountered an issue while running the Jupyter notebook: the microsoft/table-transformer-detection configuration depends on a resnet18 backbone, but downloading it via the third-party Python library timm failed. Is there a way to make table-transformer-detection load a local resnet18 checkpoint?
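One workaround worth trying (this is an assumption on my part, not something confirmed by the maintainers): timm typically fetches pretrained backbone weights through torch.hub, which caches them under $TORCH_HOME/hub/checkpoints. Placing a pre-downloaded checkpoint there and pointing TORCH_HOME at that directory before the model is created should let the cached file be found without a network download:

```python
import os

# Hypothetical local cache directory; the checkpoint filename inside
# hub/checkpoints/ must match the one torch.hub would have downloaded.
os.environ["TORCH_HOME"] = "/path/to/local/torch_cache"

# With this set, weight lookups go through
#   /path/to/local/torch_cache/hub/checkpoints/
# and skip the download if the file is already present.
```

Set the variable before creating the model (or export it in the shell before launching Jupyter), since the cache directory is resolved when the weights are requested.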
Hi,
See #158 with updated notebooks and demos
Hi authors, I would like to visualize the table detection result for a specific image. Which output in the code should I take, and how should I modify it, to get the coordinates of the predicted bounding boxes so I can draw them on the input image?
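Since Table Transformer is DETR-based, the model's raw "pred_boxes" are normalized [center_x, center_y, width, height]. A common post-processing step (a sketch of the standard DETR conversion, not the repo's exact code) rescales them to pixel [xmin, ymin, xmax, ymax] so they can be drawn on the image:

```python
def rescale_bboxes(boxes, img_w, img_h):
    """Convert normalized [cx, cy, w, h] boxes to pixel [xmin, ymin, xmax, ymax]."""
    out = []
    for cx, cy, w, h in boxes:
        out.append([
            (cx - w / 2) * img_w,   # xmin
            (cy - h / 2) * img_h,   # ymin
            (cx + w / 2) * img_w,   # xmax
            (cy + h / 2) * img_h,   # ymax
        ])
    return out

# Hypothetical example: one box centered in a 1000x800 image.
boxes = rescale_bboxes([[0.5, 0.5, 0.4, 0.2]], 1000, 800)
# boxes[0] == [300.0, 320.0, 700.0, 480.0]
```

After the conversion, the pixel boxes can be drawn with any imaging library (e.g. PIL's ImageDraw.rectangle). Keep only predictions whose class score exceeds a threshold before drawing.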