-
Let's look at the code directly:
```python
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("all-MiniLM-L6-v2").to('cpu').eval()
model.encode(['test'], convert…
-
Hey everyone! It should finally work! Please update Unsloth via the commands below (if you're on a local machine; on Colab / Kaggle there's no need to update, just refresh the runtime)
>
> ```shell
> pip uninstall unsloth -y
> pip i…
-
Hey!
Congrats on your work, and thanks a lot for sharing it 🤗
When trying to use the sd1.5 and sdxl checkpoints on the hub for inference with `diffusers`, I got the following error when calling `lo…
-
Dear authors,
I encountered a weight-explosion problem while integrating LoRA into torchtitan. I am running with the train_configs/llama3_8b.toml config via run_llama_train.sh on 4 A10 24GB GPUs. PyT…
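For reference, one common cause of exploding weights right after adding LoRA is initializing both low-rank factors randomly; the usual fix is to zero-initialize the B matrix so the adapter is a no-op at step 0. A minimal sketch of that convention (hypothetical names, not torchtitan's actual code):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen nn.Linear with a low-rank LoRA update (sketch only)."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # freeze the pretrained weight
        # A: small random init; B: zeros, so A @ B contributes nothing at step 0
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling

base = nn.Linear(32, 16)
layer = LoRALinear(base)
x = torch.randn(4, 32)
# Because lora_B is zero-initialized, the wrapped layer matches the base exactly
print(torch.allclose(layer(x), base(x)))  # True
```

If your integration initializes both factors with random values, the update `scaling * B @ A` perturbs every wrapped layer from the first step, which can compound across 32 transformer blocks and blow up the loss.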
-
```python
from transformers import DistilBertTokenizer

# Load the DistilBERT tokenizer
tokenizer = DistilBertTokenizer.from_pretrained('monologg/distilkobert')

# Define a function that converts the data into DistilBERT input format
def convert_to_input(df, tokenizer, max_length=400):…
-
I would like to fine-tune CodeLlama-13b in a memory-efficient way.
I was able to do this with CodeLlama-7b, but it fails with 13b.
I can't load the model `unsloth/codellama-13b-bnb-4bit`:
```pyth…
-
### Model description
Hello, and thanks to the community.
I am trying to replace the standard attention in the BERT base model with flash attention. Can anyone please help? I am not able to find any tutorial or any discu…
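If it helps: PyTorch 2.x ships a fused kernel, `torch.nn.functional.scaled_dot_product_attention`, which dispatches to FlashAttention on supported GPUs and computes exactly the softmax attention BERT uses. A minimal self-contained sketch (random tensors with BERT-base-like shapes, which are my assumption):

```python
import math
import torch
import torch.nn.functional as F

torch.manual_seed(0)
# (batch, heads, seq_len, head_dim) - BERT-base has 12 heads of dim 64
q = torch.randn(1, 12, 128, 64)
k = torch.randn(1, 12, 128, 64)
v = torch.randn(1, 12, 128, 64)

# Fused kernel: dispatches to FlashAttention when available on the device
fused = F.scaled_dot_product_attention(q, k, v)

# Reference: the standard softmax attention BERT computes explicitly
scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
ref = scores.softmax(dim=-1) @ v

print(torch.allclose(fused, ref, atol=1e-5))  # True
```

Depending on your `transformers` version, you may also be able to request the fused path without patching anything, e.g. `BertModel.from_pretrained(..., attn_implementation="sdpa")` — check your version's docs before relying on it.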
-
When I run GPT-2 models, everything works fine. But when I run any of the models ridger/MMfreeLM-370M, MMfreeLM-1.3B, or MMfreeLM-2.7B, this error occurs. Why? Can anyone help me?
Error: LLVM ERR…
-
**System information**
- Google Pixel 7 / Android 13 / Google Tensor G2
- TFLite 2.16.1 (stock)
**Standalone code to reproduce the issue**
Model asset: [tflite_66721_sha_clip_gpuv2_segfault.t…
-
Thanks for sharing this excellent work. We want to try the effect of ring attention with PyTorch models. Are there any plans to develop a ring attention implementation in PyTorch?
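Until an official PyTorch port exists, the core idea of ring attention — each rank holds one KV chunk, chunks rotate around the ring while queries stay put, and partial results are merged with an online (log-sum-exp) softmax — can be sketched in plain PyTorch. This is a single-process simulation (the ring is a loop, no `torch.distributed`), and all names are my own:

```python
import math
import torch

def ring_attention_sim(q, k_chunks, v_chunks):
    """Blockwise attention over KV chunks with an online softmax.

    Numerically equivalent to full softmax attention; in real ring
    attention each iteration would be one rotation of KV around the ring.
    """
    scale = 1.0 / math.sqrt(q.size(-1))
    out = torch.zeros_like(q)
    m = torch.full(q.shape[:-1] + (1,), float("-inf"))  # running row max
    l = torch.zeros(q.shape[:-1] + (1,))                # running denominator
    for k, v in zip(k_chunks, v_chunks):  # one step per "ring rotation"
        s = (q @ k.transpose(-2, -1)) * scale
        m_new = torch.maximum(m, s.amax(dim=-1, keepdim=True))
        p = torch.exp(s - m_new)
        correction = torch.exp(m - m_new)  # rescale previous partial results
        out = out * correction + p @ v
        l = l * correction + p.sum(dim=-1, keepdim=True)
        m = m_new
    return out / l

torch.manual_seed(0)
q = torch.randn(2, 16, 64)
k = torch.randn(2, 64, 64)
v = torch.randn(2, 64, 64)

chunked = ring_attention_sim(q, k.chunk(4, dim=1), v.chunk(4, dim=1))
full = torch.softmax((q @ k.transpose(-2, -1)) / math.sqrt(64), dim=-1) @ v
print(torch.allclose(chunked, full, atol=1e-5))  # True
```

The distributed version adds point-to-point KV exchange between neighboring ranks overlapped with the blockwise compute, but the merge math is exactly the loop above.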