arXiv / html_feedback

Supports a student project developing a UI for feedback on arXiv articles rendered as html.
MIT License

Error image content #1505

Open ycxioooong opened 1 week ago

ycxioooong commented 1 week ago

Description

The numbers in this image are wrong compared with the PDF version. The image delivers misinformation.

(Optional:) Please add any files, screenshots, or other information here.

No response

(Required) What is this issue most closely related to? Select one.

Choose One

Internal issue ID

41724592-b5f9-40d8-934f-358ad355fe36

Paper URL

https://arxiv.org/html/2405.18719v1

Browser

Chrome/125.0.0.0

Device Type

Desktop

html-feedback-bot[bot] commented 1 week ago

Location in document: ltx_caption

Selected HTML:

Context: 0 0 0 0 0 0 0 0 0 0 0 0 (current token at i)

Relative PE
Position: 11 10 9 8 7 6 5 4 3 2 1 0, where p_{ij} = i - j
Attention: gradual decay

CoPE
Gates: 0 0 0 0 0 0 0 0 0 0 0 0, where g_{ij} = \sigma(\mathbf{q}_i^{\top} \mathbf{k}_j)
Position: 2 2 2 2 2 2 2 2 2 2 2 2, where p_{ij} = \sum_{k=j}^{i} g_{ik}
Attention: attend to position 0
Figure 1: Contextual Position Encoding (CoPE). Standard position encoding methods such as Relative PE are based on token positions. In contrast, CoPE computes gate values conditioned on the context first, then uses that to assign positions to tokens using a cumulative sum. This allows positions to be contextualized, and represent the count of different units like words, verbs or sentences. CoPE operates on each attention head and so can attend to different position types on each. In this example, attending to the last sentence using Relative PE is challenging, and the best it can do is a decaying attention (“recency bias”). CoPE can count the sentence endings and simply attend to position “0”.
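
For anyone checking the figure values by hand, the two position formulas in the caption can be sketched in a few lines of Python. This is only an illustrative sketch, not the paper's implementation: the function names, the one-dimensional query and keys, and the toy values are all assumptions chosen so that the gates come out close to 0 or 1.

```python
import math


def sigmoid(x):
    """Logistic function used for the gates."""
    return 1.0 / (1.0 + math.exp(-x))


def relative_positions(i):
    """Relative PE: p_ij = i - j for every context token j <= i."""
    return [i - j for j in range(i + 1)]


def cope_positions(q_i, keys):
    """CoPE sketch for one query (hypothetical helper, not from the paper's code).

    g_ij = sigmoid(q_i . k_j)
    p_ij = sum of g_ik for k = j..i (a cumulative sum taken from the current token backwards)

    `keys` holds k_0..k_i, so the current token is the last entry.
    """
    gates = [sigmoid(sum(a * b for a, b in zip(q_i, k))) for k in keys]
    positions = []
    running = 0.0
    for g in reversed(gates):  # walk from the current token back to j = 0
        running += g
        positions.append(running)
    positions.reverse()
    return gates, positions


# Toy example (assumed values): only token 1 opens its gate.
gates, positions = cope_positions([1.0], [[-50.0], [50.0], [-50.0], [-50.0]])
print([round(g) for g in gates])      # [0, 1, 0, 0]
print([round(p) for p in positions])  # [1, 1, 0, 0]
print(relative_positions(11))         # [11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
```

With gates this sharp, every token after the last open gate lands on position 0, which is the behaviour the caption describes as attending to position "0".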
github-actions[bot] commented 1 week ago

Hello @ycxioooong, thanks for the issue report! We are reviewing your report and will address it as soon as possible.