-
https://www.microsoft.com/en-us/research/blog/graphrag-new-tool-for-complex-data-discovery-now-on-github/
Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems
https://arxiv.org/…
-
### Motivation.
Nowadays, many new applications, including multi-turn conversations, multi-modality, and multi-agent systems, require a significant amount of KV cache. Such applications generally have a shared…
-
### Describe the issue
Hello,
Vertical lines in attention correspond to "heavy hitters", tokens that are attended to every time.
I don't really get what the intuition behind the off-diagonal line…
-
**Describe the bug**
Google Scholar cannot recognize the publication date of the papers on OpenReview. As a result, some papers on OpenReview will not be indexed by Google Scholar at all, and other p…
-
Is the evaluation metric public?
Please share how the evaluation metric is computed.
-
Hi Yongcheng,
I hope this message finds you well.
I have a question regarding the KL divergence presented in Figure 3(a) of your ICML 2024 paper on Token-level Direct Preference Optimization. …
-
### News
- HyperCLOVA X released (8.24)
  - Naver Cloud introduction page: https://www.ncloud.com/solution/featured/hyperclovax
  - DAN23 video replay: https://tv.naver.com/v/39568301
- [ChatGPT-3.5 Tuning and Enterprise](h…
-
Even with "is_poison"=false, the accuracy is only about 10%. When "is_poison"=true and batch_size=264, I get the results as follows:
When there are adversaries, the backdoor accuracy is about 100%, ac…
-
The prepare program of mlebench requires me to accept the competition rules of 'detecting-insults-in-social-commentary' on Kaggle.
However, the late submission button on Kaggle of 'detecting-insults…
-
Inspired by a recent back and forth with @gau-nernst, we should add some quantized training recipes in AO for small models (600M param range).
Character.ai recently shared that they're working on qua…