-
### Feature request
https://arxiv.org/pdf/2401.01325.pdf
Abstract
This work elicits LLMs’ inherent ability to handle
long contexts without fine-tuning. The limited
length of the training sequen…
-
https://github.com/datamllab/LongLM
-
They have implemented the Self-Extend patch for Llama and Mistral. Would it be possible to port the same into litGPT?
https://github.com/datamllab/LongLM/tree/master
-
- [ ] [HongyeJ on X: "Despite the mixed feelings about Google's latest Gemma model, we're big fans! @GoogleAI Why? Coz we found it pairs incredibly well with our SelfExtend 🤣🤣🤣 - like, perfectly! With…
-
Hi, thank you for the great work!
I'm wondering if you have plans to provide this for the Solar models as well?
-
Hi, while testing the LongLM model, the output on a fill-in-the-blank task is correct. Given:
```
两个多小时后,石峰把四百五十块灵石,花了一个干干净净,而狼城各大药材店中,丹的药材,也被石峰搜罗一空。回到住处,没有片刻停歇,人极境界用到的回气丹,这种丹药,对于石峰来说,没有一炉炉丹药不断出炉,石峰和小黑的脸了花。
```
the correct output is obtained:
```
▁炼石峰和小黑药都得到了什么用处。色都变了,脸上都绽放
```
However, …
-
Hi team, I checked LocalLLaMA and found that Gemma works well with the Self-Extend method. It would be awesome if this technique could be added to gemma.cpp.
References:
- [locallama](http…
-
In the paper [LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning](https://arxiv.org/pdf/2401.01325.pdf), the authors describe a method to extend the context-window of _any rope-based_ mod…
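As I understand the paper, the core trick is a two-regime position mapping: tokens within a neighbor window keep their exact relative positions for RoPE, while more distant tokens fall back to grouped (floor-divided) positions, shifted so the two regimes join smoothly at the window boundary. A minimal sketch of that mapping (function and parameter names are illustrative, not from the LongLM codebase):

```python
def self_extend_rel_pos(q_pos: int, k_pos: int,
                        group_size: int = 4,
                        neighbor_window: int = 512) -> int:
    """Map a query/key position pair to the relative position fed to RoPE.

    Nearby tokens (distance <= neighbor_window) keep their true relative
    position; distant tokens use floor-divided "grouped" positions, shifted
    so the mapping is continuous at the window boundary.
    """
    d = q_pos - k_pos
    if d <= neighbor_window:
        return d
    grouped = q_pos // group_size - k_pos // group_size
    shift = neighbor_window - neighbor_window // group_size
    return grouped + shift

# Nearby pair: exact relative position is preserved.
print(self_extend_rel_pos(600, 500))   # 100
# Distant pair: grouped position, shifted past the neighbor window.
print(self_extend_rel_pos(2000, 0))    # 884
```

Because distant relative positions grow only at rate 1/group_size, a model trained on a fixed context length never sees out-of-distribution positions, which is why no fine-tuning is needed.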
-
When running the gen.sh script an error is raised. The model used is longlm-small, and it points to this command:
```
gen = model.generate(input_ids, do_sample=True, max_length=512, top_k=40, temperature=0.7, decoder_start_token_id=1)
```
If this line is replaced with
`gen = model.generat…