efficient-transformers Search Results

1000+ results
for efficient-transformers

THUDM/GLM-4 #207

Is there any plan to merge the `modeling_chatglm.py` into th…

### Feature request Hi GLM-4 Team, Thanks for your great work and the powerful GLM-4 models. I'm currently conducting research on efficient long-context LLM inference and am trying to imp…

iofu728 updated 4 months ago
3
huggingface/transformers #30055

No speed-up of model.generate() with StaticCache + torch.com…

### System Info torch==2.2.2 transformers==4.39.3 Platform: RTX 4090 or RTX A6000 rent on vast.ai ### Who can help? @ArthurZucker @gan ### Information - [ ] The official example sc…

learning-chip updated 1 week ago
17
minimaxir/aitextgen #13

Train on large textfile

Hi, I'm trying to train a model from scratch as I want it to generate text in another language (Swedish). My training data is a large collection of novels, about 22 000 that are all in one single .txt…

ZerxXxes updated 2 years ago
11
facebookresearch/vissl #552

Requests experiment configuration for supervised ViT/B16 + …

In the [Model Zoo](https://github.com/facebookresearch/vissl/blob/main/MODEL_ZOO.md), an accuracy of 83.38 is reported. However, the experiment configuration is not shared in the json file. I try to r…

VicaYang updated 2 years ago
2
BelonggAI/C4GTDMP #1

[DMP 2024]: AI-based Indian language corpus translation tool…

### Ticket Contents Belongg is developing BelonggAI, a tool that will help development practitioners, researchers, funders, etc analyze their proposals, program documents, policy documents, etc to …

BelonggAI updated 3 months ago
21
comfyanonymous/ComfyUI #3265

When deploying ComfyUI on a fresh Windows installation using…

I have adopted a fresh installation, encountering the same issue. I've already spent three days trying to resolve it. So far, none of the methods I've tried have worked, and I also feel like the spe…

wibur0620 updated 1 month ago
15
unslothai/unsloth #818

Inference speed is so slow

Hey, I used unsloth for faster finetuning of gemma2 9B, with the default configuration suggested by unsloth. Here’s the public colab for it: https://colab.research.google.com/drive/1vIrqH5…

rumanxyz updated 5 days ago
3
QwenLM/Qwen2.5 #874

When fine-tuning qwen2, flash attn and core attn outputs differ significantly, and for attn_mask=false to…

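During training, I found that enabling different attn implementations has a fairly large effect on the loss. Tracing it down, I found that for the same q, k, v inputs, the outputs of flash attn and core attn differ significantly. Moreover, for tokens whose attn_mask is false, flash attn outputs an all-zero vector while core attn outputs a normal vector. Leaving aside this zero vector, the flash attn and core attn ou…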

seanM29 updated 3 weeks ago
9
THUDM/CogVLM2 #145

cogvlm2-llama3-chinese-chat-19B-int4: error in output when using the CLI demo

### System Info win10 ### Who can help? _No response_ ### Information - [X] The official example scripts - [ ] My own modified scripts ### Reprod…

tectal updated 3 months ago
1
r-spatial/discuss #56

Remote sensing in R

I would like to raise a topic and hear your opinions about the status of remote sensing in R. Do you also have the impression that there has been a technological leap recently, but the available teaching ma…

kadyb updated 9 months ago
12
