SpursGoZmy / Table-LLaVA

Dataset and Code for our ACL 2024 paper: "Multimodal Table Understanding". We propose the first large-scale Multimodal IFT and Pre-Train Dataset for table understanding and develop a generalist tabular MLLM named Table-LLaVA.
Apache License 2.0

Request for Sharing the Rendering Code #10

Open Uason-Chen opened 2 months ago

Uason-Chen commented 2 months ago

Hello,

Thanks for sharing the code and dataset for TableLLaVA. I greatly appreciate your time and effort in this project.

However, I noticed that the rendering code is not provided in the repository. I understand that there could be various reasons for this, but if it's possible, could you please share the rendering code? I believe it would be incredibly beneficial for me and others who are interested in this project, and it would provide us with a deeper understanding of your work.

SpursGoZmy commented 1 month ago

The rendered table images come in 3 styles, and each style uses different rendering code (minimal sketches for each style follow the list):

  1. Markdown style: read the Markdown table data into a DataFrame with the pandas Python package, then use the dataframe_image package to convert the DataFrame into an image.
  2. HTML style: take the HTML code of the original table (or build HTML from the table data) and use the html2image package to convert the HTML into a screenshot image. You then need a script to crop the extra white space around the table.
  3. Excel style: write the table data into an Excel file, then use the xlwings package to open the file and export the table as an image.
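A minimal sketch of the Markdown style, assuming pandas and dataframe_image are installed; the parsing helper, example table, and file name are illustrative assumptions, not the authors' exact script:

```python
import pandas as pd
import dataframe_image as dfi

MARKDOWN_TABLE = """\
| Name  | Score |
|-------|-------|
| Alice | 90    |
| Bob   | 85    |
"""

def markdown_to_df(md: str) -> pd.DataFrame:
    """Parse a simple pipe-style Markdown table into a DataFrame."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in md.strip().splitlines()
    ]
    header, body = rows[0], rows[2:]  # rows[1] is the |---|---| separator row
    return pd.DataFrame(body, columns=header)

df = markdown_to_df(MARKDOWN_TABLE)
# Render the DataFrame to a PNG; the matplotlib backend avoids the headless
# Chrome dependency that dataframe_image uses by default.
dfi.export(df, "table_markdown.png", table_conversion="matplotlib")
```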
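A minimal sketch of the HTML style, assuming the html2image and Pillow packages; the HTML snippet, file names, and the white-border cropping logic are assumptions for illustration:

```python
from html2image import Html2Image
from PIL import Image, ImageChops

HTML_TABLE = """
<table border="1" cellpadding="4">
  <tr><th>Name</th><th>Score</th></tr>
  <tr><td>Alice</td><td>90</td></tr>
  <tr><td>Bob</td><td>85</td></tr>
</table>
"""

# Take a screenshot of the HTML with a headless browser.
hti = Html2Image(output_path=".")
hti.screenshot(html_str=HTML_TABLE, save_as="table_html.png", size=(800, 600))

def trim_white_border(path: str) -> None:
    """Crop the extra white space around the rendered table."""
    img = Image.open(path).convert("RGB")
    background = Image.new("RGB", img.size, (255, 255, 255))
    bbox = ImageChops.difference(img, background).getbbox()
    if bbox:
        img.crop(bbox).save(path)

trim_white_border("table_html.png")
```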
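A minimal sketch of the Excel style, assuming a machine with Excel installed (xlwings drives the real Excel application) and a recent xlwings version that provides Range.to_png; the file names and example data are assumptions:

```python
import pandas as pd
import xlwings as xw

# Step 1: write the table data into an Excel file.
df = pd.DataFrame({"Name": ["Alice", "Bob"], "Score": [90, 85]})
df.to_excel("table.xlsx", index=False)

# Step 2: open the file with xlwings and export the table range as a PNG.
app = xw.App(visible=False)
try:
    book = app.books.open("table.xlsx")
    sheet = book.sheets[0]
    sheet.range("A1").expand().to_png("table_excel.png")
    book.close()
finally:
    app.quit()
```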

Data augmentation: while rendering the table images, augmentations such as changing the font type or cell color can be applied (a sketch follows below).
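A hypothetical sketch of such augmentation for the HTML style: randomize the font family and cell background color via a CSS string, which can be passed to Html2Image.screenshot through its css_str argument (the specific fonts and colors are made up):

```python
import random

FONTS = ["Arial", "Times New Roman", "Courier New", "Verdana"]
CELL_COLORS = ["#ffffff", "#f6f8fa", "#fff8e1", "#e8f5e9"]

def random_table_css() -> str:
    """Build a randomized CSS style for the table before taking the screenshot."""
    return (
        f"table {{ border-collapse: collapse; "
        f"font-family: '{random.choice(FONTS)}'; }}\n"
        f"th, td {{ border: 1px solid #444; padding: 4px 8px; "
        f"background-color: {random.choice(CELL_COLORS)}; }}"
    )

# Usage with the HTML-style renderer sketched above:
# hti.screenshot(html_str=HTML_TABLE, css_str=random_table_css(),
#                save_as="table_aug.png", size=(800, 600))
```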

The difficulty of rendering tables into images differs across datasets, and it indeed requires quite a lot of careful work. We will try to clean up our rendering code for open-sourcing, but we cannot promise a release date soon. Thanks for your understanding.