-
### Please check that this issue hasn't been reported before.
- [X] I searched previous [Bug Reports](https://github.com/axolotl-ai-cloud/axolotl/labels/bug) and didn't find any similar reports.
### Exp…
-
# LLMs from an industry perspective
The Road to AGI: Essential Techniques of Large Language Models (LLMs)
# Which large models are out there
https://zhuanlan.zhihu.com/p/611403556
# Model architecture
Why are today's LLMs all decoder-only architectures?
From a low-rank perspective
# How to train
[Ladder Side-Tuning: a "ladder over the wall" for pretrained models](https://kexue.f…
-
### Type
new chapter
### Chapter/Page
Something else
### Description
Training models or running inference is fairly easy when we have a smaller number of parameters. But when the scale of…
-
Hi!
Let's bring the documentation to the entire Korean-speaking community 🌏 (currently 9 out of 77 complete).
Would you like to translate? Please follow the 🤗 [TRANSLATING guide](https://github.com…
-
The deaf person will also sometimes try to speak. We need a model to judge whether the transcription quality is poor, and if it falls below a particular threshold we should not show the transcription.
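The gating described above could be sketched as follows. This is a minimal illustration, not an existing API: `should_show`, `CONFIDENCE_THRESHOLD`, and the idea of a single scalar confidence score are all assumptions; a real system might instead use the recognizer's per-utterance log-probability or a dedicated quality model.

```python
# Hypothetical sketch: gate whether a transcription is shown based on a
# confidence score from the ASR system. All names here are made up.
CONFIDENCE_THRESHOLD = 0.6  # assumed tunable cutoff, not a known default

def should_show(transcription: str, confidence: float) -> bool:
    """Hide empty transcriptions and those below the quality threshold."""
    return bool(transcription.strip()) and confidence >= CONFIDENCE_THRESHOLD
```

The threshold itself would need to be tuned against human judgments of transcription quality.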
-
Hi @TimDettmers,
The paper shows that you quantize the weights to 2/4 bits using the NF format. I wonder how you handle the input activations (denoted as x). Is x also quantized to 2/4 bits?
If…
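For context on what such a scheme typically looks like, here is a toy sketch of blockwise absmax quantization against a small normalized codebook. This is not the paper's code: the 4-level codebook below is made up for illustration (the real NF4 codebook has 16 values derived from the quantiles of a standard normal), and in QLoRA-style schemes the activations stay in full precision while weights are dequantized on the fly for the matmul.

```python
# Hypothetical sketch: blockwise absmax quantization of weights to a small
# normalized codebook; activations x stay in full precision.
import numpy as np

# Toy 2-bit (4-level) codebook of normalized values in [-1, 1].
CODEBOOK = np.array([-1.0, -0.33, 0.33, 1.0])

def quantize_block(w: np.ndarray):
    """Store one block of weights as an absmax scale plus codebook indices."""
    scale = float(np.max(np.abs(w)))
    scale = scale if scale > 0 else 1.0
    # Map each normalized weight to its nearest codebook entry.
    idx = np.argmin(np.abs(w[:, None] / scale - CODEBOOK[None, :]), axis=1)
    return scale, idx

def dequantize_block(scale: float, idx: np.ndarray) -> np.ndarray:
    return CODEBOOK[idx] * scale

w = np.array([0.9, -0.5, 0.1, -0.05])
scale, idx = quantize_block(w)
w_hat = dequantize_block(scale, idx)

# Full-precision activations multiply the dequantized weights.
x = np.array([1.0, 2.0, -1.0, 0.5])
y = float(x @ w_hat)
```

Whether the paper in question also quantizes x is exactly what the comment above asks; the sketch only shows the common weight-only arrangement.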
-
Using ChatGLM as the base model, how do I continue pretraining on my own corpus (not fine-tuning)? How should this be done?
-
Ollama - local models on your machine
https://youtu.be/Ox8hhpgrUi0?si=LxpAd1n29InncB78
Open-weight models
- Llama3
- Mistral 7B v0.3
Use cases:
- interactive vs non-interactive
- local RAG…
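A minimal way to drive a locally running Ollama server programmatically is its REST API. The sketch below only builds the request; the `/api/generate` endpoint, default port 11434, and the `model`/`prompt`/`stream` fields follow Ollama's documented REST API, while the model name and prompt are just examples.

```python
# Hypothetical sketch: building a request for a local Ollama server's
# /api/generate endpoint.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model: str, prompt: str) -> urllib.request.Request:
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

req = build_request("llama3", "What is retrieval-augmented generation?")
# urllib.request.urlopen(req) would send this to a running `ollama serve`.
```

With `stream=False` the server returns one JSON object containing the full response instead of a token stream.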
-
# URL
- https://arxiv.org/abs/2310.16789
# Affiliations
- Weijia Shi, N/A
- Anirudh Ajith, N/A
- Mengzhou Xia, N/A
- Yangsibo Huang, N/A
- Daogao Liu, N/A
- Terra Blevins, N/A
- Danqi Ch…
-
Thanks for the excellent work! Could you please release the training code when you have a chance?