FlagOpen / FlagEmbedding

Retrieval and Retrieval-augmented LLMs
MIT License

activation beacon Needle In A Haystack test failed #422

Open gradetwo opened 8 months ago

gradetwo commented 8 months ago

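Only needles inserted in the last 4k of the context pass; needles inserted anywhere earlier all fail. Tested input lengths: 16k, 32k, 128k.

Needle In A Haystack: below is the test output at a length of 29.5k.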

`Input Length: 29605 Position: 0k Prediction: There are many great things to do in San Francisco! Here are some suggestions:
    1. Visit Alcatraz Island: Take a ferry to the former prison and explore the island

Input Length: 29605 Position: 1k Prediction: The best thing to do in San Francisco is to visit the Golden Gate Bridge. The Golden Gate Bridge is a famous suspension bridge that connects San Francisco to Marin County. It is an

Input Length: 29606 Position: 2k Prediction: The best thing to do in San Francisco is to visit the Golden Gate Park. It is a beautiful park with many attractions, including the Japanese Tea Garden, the California Academy of Sciences,

Input Length: 29605 Position: 3k Prediction: There are many great things to do in San Francisco! Here are some suggestions:
    1. Visit Alcatraz Island: Take a ferry to the former prison and explore the island

Input Length: 29604 Position: 4k Prediction: There are many great things to do in San Francisco! Here are some suggestions:
    1. Visit Alcatraz Island: Take a ferry to the former prison and explore the island

Input Length: 29607 Position: 5k Prediction: The best thing to do in San Francisco is to visit the Golden Gate Bridge, Alcatraz Island, and Fisherman's Wharf.
    The Golden Gate Bridge is an iconic

Input Length: 29606 Position: 6k Prediction: There are many great things to do in San Francisco! Here are some suggestions:
    1. Visit Alcatraz Island: Take a ferry to the former prison and explore the island

Input Length: 29606 Position: 7k Prediction: There are many great things to do in San Francisco! Here are some suggestions:
    1. Visit Alcatraz Island: Take a ferry to the former prison and explore the island

Input Length: 29605 Position: 8k Prediction: There are many great things to do in San Francisco! Here are some suggestions:
    1. Visit Alcatraz Island: Take a ferry to the former prison and explore the island

Input Length: 29605 Position: 9k Prediction: There are many great things to do in San Francisco! Here are some suggestions:
    1. Visit Alcatraz Island: Take a ferry to the former prison and explore the island

Input Length: 29605 Position: 10k Prediction: There are many great things to do in San Francisco! Here are some suggestions:
    1. Visit Alcatraz Island: Take a ferry to the former prison and explore the island

Input Length: 29605 Position: 11k Prediction: There are many great things to do in San Francisco! Here are some suggestions:
    1. Visit Alcatraz Island: Take a ferry to the former prison and explore the island

Input Length: 29606 Position: 12k Prediction: There are many great things to do in San Francisco! Here are some suggestions:
    1. Visit Alcatraz Island: Take a ferry to the former prison and explore the island

Input Length: 29606 Position: 13k Prediction: The best thing to do in San Francisco is to visit the Golden Gate Bridge, Alcatraz Island, and Fisherman's Wharf.
    The Golden Gate Bridge is an iconic

Input Length: 29607 Position: 14k Prediction: The best thing to do in San Francisco is to visit the Golden Gate Bridge, Alcatraz Island, and Fisherman's Wharf.
    The Golden Gate Bridge is an iconic

Input Length: 29605 Position: 15k Prediction: There are many great things to do in San Francisco! Here are some suggestions:
    1. Visit Alcatraz Island: Take a ferry to the former prison and explore the island

Input Length: 29605 Position: 16k Prediction: There are many great things to do in San Francisco! Here are some suggestions:
    1. Visit Alcatraz Island: Take a ferry to the former prison and explore the island

Input Length: 29605 Position: 17k Prediction: There are many great things to do in San Francisco! Here are some suggestions:
    1. Visit Alcatraz Island: Take a ferry to the former prison and explore the island

Input Length: 29607 Position: 18k Prediction: The best thing to do in San Francisco is to visit the Golden Gate Bridge, Alcatraz Island, and Fisherman's Wharf.
    The Golden Gate Bridge is an iconic

Input Length: 29606 Position: 19k Prediction: The best thing to do in San Francisco is to visit the Golden Gate Bridge, Alcatraz Island, and Fisherman's Wharf.
    The Golden Gate Bridge is an iconic

Input Length: 29606 Position: 20k Prediction: There are many great things to do in San Francisco! Here are some suggestions:
    1. Visit Alcatraz Island: Take a ferry to the former prison and explore the island

Input Length: 29607 Position: 21k Prediction: The best thing to do in San Francisco is to visit the Golden Gate Bridge, Alcatraz Island, and Fisherman's Wharf.
    The Golden Gate Bridge is an iconic

Input Length: 29606 Position: 22k Prediction: There are many great things to do in San Francisco! Here are some suggestions:
    1. Visit Alcatraz Island: Take a ferry to the former prison and explore the island`

namespace-Pt commented 8 months ago

Hi, could you share your test script? The original git repository only has scripts for OpenAI and Anthropic.

gradetwo commented 8 months ago

@namespace-Pt Here is a temporary, hand-written test script: test_needle.py

namespace-Pt commented 8 months ago

Could you also share data/book/dinosaurs.txt?

gradetwo commented 8 months ago

> Could you also share data/book/dinosaurs.txt?

That book is copyrighted. Just pick any reasonably large English book for the test.
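
For anyone reproducing this with their own book, here is a minimal sketch of how a needle can be placed at different depths in a haystack. It is not the linked test_needle.py. The function names are hypothetical, whitespace-separated words stand in for tokens, and the needle and keywords are assumed to follow the standard Needle In A Haystack setup.

```python
# Hypothetical helpers for illustration; not taken from test_needle.py.
# Words are used as a rough proxy for tokens.

def insert_needle(haystack_words, needle, position, context_len):
    """Truncate the haystack to `context_len` words and insert `needle` at word offset `position`."""
    body = haystack_words[:context_len]
    position = max(0, min(position, len(body)))
    return " ".join(body[:position] + [needle] + body[position:])


def build_cases(haystack_text, needle, context_len, step):
    """Return (position, context) pairs with the needle placed every `step` words."""
    words = haystack_text.split()
    return [
        (pos, insert_needle(words, needle, pos, context_len))
        for pos in range(0, context_len, step)
    ]


def passed(prediction, keywords):
    """Count a prediction as a pass only if it mentions every keyword (case-insensitive)."""
    pred = prediction.lower()
    return all(k.lower() in pred for k in keywords)


if __name__ == "__main__":
    # Assumed needle from the standard test; replace the haystack with any long English text.
    needle = ("The best thing to do in San Francisco is eat a sandwich "
              "and sit in Dolores Park on a sunny day.")
    book = "filler " * 30000
    for pos, context in build_cases(book, needle, context_len=29000, step=1000):
        # `context` plus the question would be sent to the model here.
        print(pos, len(context.split()))
```

With a real tokenizer, the word offsets would be replaced by token offsets so that the reported input lengths match the model's view of the context.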

gradetwo commented 8 months ago

I tested several Llama prompt formats, and the behavior is the same.

The prompts were constructed with unitxt.

namespace-Pt commented 8 months ago

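Hi, I ran the test following your script:

  1. Activation Beacon recovers the needle most of the time at 8K, but fails almost every time at 32K. A longer context requires a higher compression ratio (8K -> 2, 32K -> 8), and compression loses information, so performance on the needle task degrades badly as the context grows. This is consistent with our findings on topic retrieval and passkey retrieval. It is worth noting, however, that when the model fails it outputs exactly the same result. We had not noticed this before and will look into it further.

  2. Activation Beacon can be combined with retrieval to improve its performance on this kind of high-precision memorization task. It currently supports simple BM25 retrieval: BM25 picks 3 intervals, whose content is compressed at a lower ratio (2), while the rest is compressed at a higher ratio (128). This improves Activation Beacon's performance on needle in a haystack, giving a success rate of roughly 80%. I used the first book in the PG19 test set. The code is here; please give it a try.

  3. Activation Beacon is only our initial attempt at long context, which validated that compression is feasible. We will go on to combine it with more sophisticated retrieval mechanisms to form a systematic long-context solution. Please stay tuned and keep the feedback coming.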
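
To make points 1 and 2 concrete, here is a rough sketch of the two ideas: the compression ratio a given context length calls for, and BM25-based selection of intervals that get the low ratio. It is an illustration only, not the Activation Beacon code linked above. The 4096-token window and power-of-two ratios are assumptions chosen because they reproduce the 8K -> 2 and 32K -> 8 figures. The function names, the interval length, and the BM25 parameters (`k1`, `b`) are hypothetical.

```python
import math
from collections import Counter


def min_ratio(context_len, window=4096):
    """Smallest power-of-two ratio that squeezes `context_len` tokens into `window` (assumed values)."""
    need = math.ceil(context_len / window)
    ratio = 1
    while ratio < need:
        ratio *= 2
    return ratio


def bm25_scores(query_tokens, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query with the usual Okapi BM25 formula."""
    if not docs:
        return []
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    df = Counter(t for d in docs for t in set(d))  # document frequency per term
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for t in query_tokens:
            if t not in tf:
                continue
            idf = math.log((n - df[t] + 0.5) / (df[t] + 0.5) + 1.0)
            norm = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[t] * (k1 + 1) / norm
        scores.append(score)
    return scores


def assign_ratios(context_tokens, query_tokens, interval_len=1024, top_k=3, low=2, high=128):
    """Split the context into intervals; the top_k BM25 intervals get `low`, the rest get `high`."""
    intervals = [
        context_tokens[i:i + interval_len]
        for i in range(0, len(context_tokens), interval_len)
    ]
    scores = bm25_scores(query_tokens, intervals)
    top = sorted(range(len(intervals)), key=lambda i: scores[i], reverse=True)[:top_k]
    keep = set(top)
    return [low if i in keep else high for i in range(len(intervals))]


if __name__ == "__main__":
    print(min_ratio(8192), min_ratio(32768))
    context = ["filler"] * 4000
    context[2500:2500] = ["sandwich", "dolores"]
    print(assign_ratios(context, ["sandwich", "dolores"], interval_len=1000))
```

Note that this sketch always selects `top_k` intervals, even when some of them score zero against the query; a real implementation might skip those.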