-
### Feature request
Hi GLM-4 Team,
Thanks for your great work and the powerful GLM-4 models. I'm currently conducting research on efficient long-context LLM inference and am trying to imp…
-
### System Info
torch==2.2.2
transformers==4.39.3
Platform: RTX 4090 or RTX A6000 rented on vast.ai
### Who can help?
@ArthurZucker @gan
### Information
- [ ] The official example sc…
-
Hi, I'm trying to train a model from scratch because I want it to generate text in another language (Swedish).
My training data is a large collection of novels, about 22,000, all in a single .txt…
-
In the [Model Zoo](https://github.com/facebookresearch/vissl/blob/main/MODEL_ZOO.md), an accuracy of 83.38 is reported. However, the experiment configuration is not shared in the json file. I try to r…
-
### Ticket Contents
Belongg is developing BelonggAI, a tool that will help development practitioners, researchers, funders, etc., analyze their proposals, program documents, policy documents, etc., to …
-
I did a fresh installation and ran into the same issue. I've already spent three days trying to resolve it. So far, none of the methods I've tried have worked, and I also feel like the spe…
-
Hey,
I used unsloth for faster fine-tuning of gemma2 9B, with the default configuration suggested by unsloth.
Here's the public Colab for it: https://colab.research.google.com/drive/1vIrqH5…
-
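During training, I found that enabling different attention implementations has a fairly large effect on the loss.
While debugging, I found that for the same q, k, v inputs, the outputs of flash attn and core attn differ considerably,
and for tokens where attn_mask is false, flash attn outputs an all-zero vector while core attn outputs a normal vector.
Leaving these zero vectors aside, the flash attn and core attn ou…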
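
One way to picture the mismatch described above is the toy sketch below. It is only an illustration under stated assumptions, not a diagnosis of the reported setup: `dense_masked_attention` and `unpad_attention` are hypothetical names, not functions from any attention library. The first applies an additive mask over keys and still runs every query row through softmax; the second drops padded tokens, attends over the valid ones, and writes zeros back at padded positions.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attend(q, keys, values, key_bias):
    # Scaled dot-product attention for a single query vector.
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, k)) + bias
              for k, bias in zip(keys, key_bias)]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]

def dense_masked_attention(q, k, v, mask, neg=-1e9):
    # Hypothetical "dense" style: masked keys get a large negative bias,
    # but padded query rows are still computed like any other row.
    bias = [0.0 if keep else neg for keep in mask]
    return [attend(qi, k, v, bias) for qi in q]

def unpad_attention(q, k, v, mask):
    # Hypothetical "unpad then re-pad" style: padded tokens are removed
    # before attention and their output rows are filled with zeros.
    keep = [i for i, m in enumerate(mask) if m]
    kk = [k[i] for i in keep]
    vv = [v[i] for i in keep]
    out = [[0.0] * len(v[0]) for _ in q]
    for i in keep:
        out[i] = attend(q[i], kk, vv, [0.0] * len(kk))
    return out

# Three tokens of dimension 2; the last token is padding (mask False).
q = [[0.1, 0.2], [0.3, -0.1], [0.5, 0.5]]
k = [[0.2, 0.1], [-0.3, 0.4], [0.9, 0.9]]
v = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
mask = [True, True, False]

dense = dense_masked_attention(q, k, v, mask)
varlen = unpad_attention(q, k, v, mask)

for i, (a, b) in enumerate(zip(dense, varlen)):
    print(i, mask[i], a, b)
```

In this toy the two variants agree on the valid tokens and differ only on the padded row, where one returns a weighted average of the valid values and the other returns zeros. Whether the same holds for the real kernels depends on details not shown in the report, so comparing outputs only at positions where attn_mask is true is a reasonable first check.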
-
### System Info
win10
### Who can help?
_No response_
### Information
- [X] The official example scripts
- [ ] My own modified scripts and tasks
### Reprod…
-
I would like to raise a topic and hear your opinions about the status of remote sensing in R. Do you also have the impression that there has been a technological leap recently, but the available teaching ma…
kadyb updated
9 months ago