Hello, in the original paper, the authors write in a footnote on page 3: "We provide e(ci, ri) as a prefix instead of inserting it at position i because M is not yet finetuned on any examples containing API calls, so inserting it in the middle of x would interrupt the flow and not align with patterns in the pretraining corpus, thus hurting perplexity."
However, I found that you seem to insert the API call into the original sentence when calculating the loss.
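To make the difference concrete, here is a minimal sketch of the two ways of building the context, as I understand them. The function names, the example sentence, and the bracketed API-call tokens are made up for illustration; they are not taken from the paper's code or from this repository.

```python
# Hypothetical illustration only: token lists stand in for real tokenizer output.

def prefix_context(tokens, api_tokens, j):
    """Context for predicting tokens[j] when the API call is prepended
    to the whole sequence (what the footnote describes)."""
    return api_tokens + tokens[:j]


def inline_context(tokens, api_tokens, i, j):
    """Context for predicting tokens[j] (with j >= i) when the API call
    is inserted at position i inside the sentence."""
    return tokens[:i] + api_tokens + tokens[i:j]


tokens = ["The", "answer", "is", "4", "."]
api_tokens = ["[Calculator(2+2)", "->", "4]"]  # stands in for e(ci, ri)
i = 3  # position of the candidate API call

for j in range(i, len(tokens)):
    print("target:", tokens[j])
    print("  prefix:", prefix_context(tokens, api_tokens, j))
    print("  inline:", inline_context(tokens, api_tokens, i, j))
```

In both cases the model sees the API call and its result before predicting the tokens from position i onward. The only difference is where the call sits relative to the first i tokens, which is exactly what the footnote is about.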
What am I missing?