The difference of results on the SER of FUNSD between the Table 2 and the Table 6 in the paper.

jpWang / LiLT

Official PyTorch implementation of LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding (ACL 2022)

The difference of results on the SER of FUNSD between the Table 2 and the Table 6 in the paper. #15

Closed WenjinW closed 1 year ago

WenjinW commented 2 years ago

Hi! I like the idea of decoupling text and layout information to leverage existing pre-trained language models. One thing confused me while reading the paper.

Table 2 shows that the F1 of LiLT[InfoXLM] on the SER task of FUNSD is 0.8586.
However, in Table 6, the F1 is 0.8415.

Why do the two tables report different results for the same model and task?

Thanks for your reply.

leitouran commented 2 years ago

Hi, I had the same confusion, but I think the results you mention for Table 2 are for the multilingual model, while the ones in Table 3 are English-only.

jpWang commented 2 years ago

Hi @WenjinW @leitouran, Table 2 follows the fine-tuning style of https://github.com/microsoft/unilm/blob/master/layoutlmft/examples/run_funsd.py, while Table 3 follows the fine-tuning style of https://github.com/microsoft/unilm/blob/master/layoutlmft/examples/run_xfun_ser.py.