poloclub / unitable

UniTable: Towards a Unified Table Foundation Model
https://arxiv.org/abs/2403.04822
MIT License

The structure recognition of UniTable that I trained is supposed to output <td> [] </td> after inference, but it only shows <td> </td> #30

Closed YongZ-Lee closed 1 month ago

YongZ-Lee commented 2 months ago

Thank you very much for your work. I trained both the base and large versions of UniTable on the complete PubTabNet dataset and then tested my own model weights in the inference framework. However, wherever the structure output should contain '<td>[]</td>', my model only produces '<td></td>'. Here is an example inference result:

['<thead>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '</thead>', '<tbody>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '</tbody>']

PMC3907710_006_00

But the actual required result should be as follows:

['<thead>', '<tr>', '<td>[]</td>', '<td>[]</td>', '<td>[]</td>', '<td>[]</td>', '<td>[]</td>', '</tr>', '</thead>', '<tbody>', '<tr>', '<td>[]</td>', '<td>[]</td>', '<td>[]</td>', '<td>[]</td>', '<td>[]</td>', '</tr>', '<tr>', '<td>[]</td>', '<td>[]</td>', '<td>[]</td>', '<td>[]</td>', '<td>[]</td>', '</tr>', '<tr>', '<td>[]</td>', '<td>[]</td>', '<td>[]</td>', '<td>[]</td>', '<td>[]</td>', '</tr>', '</tbody>']
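A quick way to quantify this mismatch is to count how many cell tokens in a predicted sequence carry the `[]` placeholder. The snippet below is only a minimal sketch: `count_cell_tokens` is a hypothetical helper, not part of UniTable, and it assumes that every cell token ends in either `[]</td>` (non-empty) or `></td>` (empty).

```python
def count_cell_tokens(tokens):
    """Count empty vs. non-empty cell tokens in a structure token list.

    Assumption: a non-empty cell token ends with '[]</td>' and an empty
    cell token ends with '></td>' (e.g. '<td>[]</td>' vs. '<td></td>').
    """
    non_empty = sum(1 for t in tokens if t.endswith("[]</td>"))
    empty = sum(1 for t in tokens if t.endswith("></td>"))
    return {"empty": empty, "non_empty": non_empty}


# One header row and three body rows of five cells each, as in the example above.
row_pred = ["<tr>"] + ["<td></td>"] * 5 + ["</tr>"]
row_want = ["<tr>"] + ["<td>[]</td>"] * 5 + ["</tr>"]
predicted = ["<thead>"] + row_pred + ["</thead>", "<tbody>"] + row_pred * 3 + ["</tbody>"]
expected = ["<thead>"] + row_want + ["</thead>", "<tbody>"] + row_want * 3 + ["</tbody>"]

print(count_cell_tokens(predicted))
print(count_cell_tokens(expected))
```

If the prediction reports zero non-empty cells across many images, the model has most likely never seen the `[]` form during training, which points at the labels rather than the inference settings.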

The following are the training settings I used:

TRAIN_pub_html := $(VOCAB_HTML) \
    $(PUBTABNET) $(LABEL_HTML) $(AUG_RESIZE_NORM) \
    $(TRAINER_TABLE) $(I448) $(SEQ512) \
    $(EPOCH24) $(OPT_ADAMW) $(OPT_WD5e2) $(LR_8e5)

EXP_ssp_2m_pub_html_base := $(TRAIN_pub_html) $(ARCH_BASE) \
    $(WEIGHTS_mtim_2m_base) $(LOCK_MTIM_4) $(BATCH32) $(LR_cosine216k_warm27k)

Is there an issue with my configuration file, or are my inference settings wrong? My inference configuration is below; I have checked that it matches the training settings.

# UniTable base model
d_model = 512
patch_size = 16
nhead = 8
dropout = 0.2
encoder = Encoder(
    d_model=d_model,
    nhead=nhead,
    dropout=dropout,
    activation="gelu",
    norm_first=True,
    nlayer=4,
    ff_ratio=4,
)

wtgwuhoo commented 2 months ago

I have the same problem.

wtgwuhoo commented 2 months ago

In the jsonl annotation file, square brackets '[]' need to be added, like this: "[', ']". This format appears in the 'mini_pubtabnet_examples.jsonl' file, but the annotation file distributed with PubTabNet does not include it. Should I add the brackets myself, or is there somewhere I can download an annotation file that already has them? If I have to add them myself, how do I distinguish empty cells from non-empty ones?

ShengYun-Peng commented 1 month ago

Thank you both for the question. UniTable marks an empty cell as <td></td>, and a non-empty cell as <td>[]</td>. Check the vocab here for all special tokens.
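For anyone preparing labels themselves, the convention above can be applied roughly as follows. This is a minimal sketch under stated assumptions, not the repository's own preprocessing: `merge_cell_markers` is a hypothetical helper; it assumes PubTabNet-style annotations where the structure tokens keep `<td>` / `</td>` (or `<td`, span attributes, `>`, `</td>`) as separate tokens and each entry in `cells` has a `tokens` list that is empty for an empty cell. The merged forms for spanning cells (`>[]</td>` and `></td>`) are also an assumption and should be checked against the vocab.

```python
def merge_cell_markers(structure_tokens, cells):
    """Merge cell open/close tokens and mark non-empty cells with '[]'.

    structure_tokens: list of HTML structure tokens (assumed PubTabNet style).
    cells: list of dicts, one per cell in reading order; a cell is treated
           as non-empty when its 'tokens' list is non-empty.
    """
    merged = []
    cell_idx = 0
    i = 0
    n = len(structure_tokens)
    while i < n:
        tok = structure_tokens[i]
        nxt = structure_tokens[i + 1] if i + 1 < n else None
        # A cell closes where '<td>' or '>' is directly followed by '</td>'.
        if tok in ("<td>", ">") and nxt == "</td>":
            if cell_idx >= len(cells):
                raise ValueError("more cells in structure than in 'cells'")
            filler = "[]" if cells[cell_idx].get("tokens") else ""
            merged.append(f"{tok}{filler}</td>")
            cell_idx += 1
            i += 2
        else:
            merged.append(tok)
            i += 1
    if cell_idx != len(cells):
        raise ValueError("fewer cells in structure than in 'cells'")
    return merged


# Illustrative annotation: one filled cell and one empty spanning cell.
structure = ["<thead>", "<tr>", "<td>", "</td>",
             "<td", ' colspan="2"', ">", "</td>", "</tr>", "</thead>"]
cells = [{"tokens": ["A"], "bbox": [0, 0, 10, 10]}, {"tokens": []}]
print(merge_cell_markers(structure, cells))
```

Labels built this way contain `<td>[]</td>` only for cells that actually have content, so a model trained on them can learn to emit both forms.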

ShengYun-Peng commented 1 month ago

Feel free to reopen if you still have questions.