pytorch-transformers Search Results

1000+ results for pytorch-transformers

pytorch/pytorch #139424

`torch.compile` error on `scaled_dot_product_attention` in `…

### 🐛 Describe the bug Related: * https://github.com/pytorch/pytorch/issues/124289 * https://github.com/pytorch/pytorch/issues/109607 ```python """Demonstrate torch.compile error on transform…

ringohoffman updated 2 weeks ago
10
huggingface/transformers #34754

warmup LR schedulers start from LR=0

### System Info transformers commit: 52ea4aa589324bae43dfb1b6db70335da7b68654 (main at time of writing) the rest isn't relevant. ### Who can help? trainer: @muellerzr @SunMarc ### Informa…

cfhammill updated 5 days ago
2
huggingface/transformers #34527

[Feature] Will there be any integration of using Flex-attent…

### Feature request Using (https://pytorch.org/blog/flexattention/) Flex-attention (and [Paged attention](https://github.com/pytorch/pytorch/pull/121845/files)) to speedup transformers models and p…

jianan-gu updated 5 days ago
5
linkedin/Liger-Kernel #315

mllama patch modifies nn.LayerNorm globally

### 🐛 Describe the bug Instead of only patching the transformers mllama module (`transformers.models.mllama.modeling_mllama`), `apply_liger_kernel_to_mllama` modifies `torch.nn.LayerNorm` globally. …

tyler-romero updated 1 month ago
3
vllm-project/vllm #10534

[Usage]: Fail to load params.json

### Your current environment ```text $ python collect_env.py Collecting environment information... PyTorch version: 2.5.1 Is debug build: False CUDA used to build PyTorch: 12.1 ROCM used to bu…

dequeueing updated 3 days ago
3
modelscope/ms-swift #2364

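LoRA fine-tuning GPU memory usage **gradually increases** until it **blows up**

## LoRA fine-tuning of qwen2.5-7b gradually runs out of GPU memory - Versions torch 2.4.0 transformers 4.46.1 ms-swift 2.5.1.post1 - Error torch.OutOfMemoryError: CU…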

LixiangHello updated 3 weeks ago
2
kohya-ss/sd-scripts #1720

Enabling dim_from_weights or loraplus_unet_lr_ratio will cau…

Hi, Today, when I was running LoRA training for the `Flux.1` model (sd-scripts on SD3's breach), the "`train_blocks must be single for split mode`" error suddenly occurred. This error had not appea…

avan06 updated 3 days ago
3
neo4j-labs/llm-graph-builder #845

Backend Installation Slow Due to Heavy Dependencies Like PyT…

Hi, On DEV branch, deploying the app locally takes a lot of time, in particular the backend. The Requirement.txt file from the backend contains packages (sentence-transformers, effdet) that have heavy d…

Sinnaeve updated 2 weeks ago
1
aws-neuron/aws-neuron-sdk #1022

vllm offline inference

**server:** inf2.8xlarge **vllm version**: 0.6.3.post2.dev77+g2394962d.neuron215 _Description_ Hello! I am trying to run the code below (the code was taken [here](https://docs.vllm.ai/en/v0.4.1/…

Shnumshnub updated 2 weeks ago
4
meta-llama/llama-models #159

Error no file named pytorch_model.bin, model.safetensors

Hello, I successfully downloaded the model to this directory /root/.llama/checkpoints/Llama3.2-1B-Instruct When I launch the AutoModelForCausalLM.from_pretrained passing the path above I got the f…

morbidod updated 5 days ago
3
