Open themez opened 7 months ago
Location in document: S4.SS3.3.6
Selected HTML:
To better illustrate the effectiveness of our framework in generating executable action sequences, we compare the performance of CoT, Reflexion, and our framework when each is given the golden label of the instruction. Because every framework receives the same extraction targets, we can isolate the actual effect of each framework on generating action sequences.
Table 4.3 shows the experimental results, from which we make the following observations: 1) Our proposed progressive understanding framework still effectively enhances the model’s performance under this setting; 2) LLMs still struggle to accurately understand web page contents written in semi-structured markup languages, as illustrated by the performance gap between Table 3.1 and Table 4.3; 3) Compared to closed-source LLMs, open-source LLMs are unable to achieve sustained performance improvement even when provided with golden labels. This phenomenon demonstrates that the bottleneck for these models lies not in understanding the webpage content but in understanding the webpage’s hierarchical structure itself.
Hello @themez, thanks for the issue report! We are reviewing your report and will address it as soon as possible.
Description
The table description floats over the table.
(Optional:) Please add any files, screenshots, or other information here.
No response
(Required) What is this issue most closely related to? Select one.
Choose One
Internal issue ID
b21e9f52-30d2-4cb0-bb7c-74b276ac1cb9
Paper URL
https://arxiv.org/html/2404.12753v1
Browser
Chrome/124.0.0.0
Device Type
macbook M1