transformer-models Search Results

1000+ results for transformer-models


huggingface/transformers #34689

Transformers 4.46.2 breaks model loading for Llama 3.2 90B V…

### System Info - `transformers` version: 4.46.2 - Platform: Linux-5.10.226-214.880.amzn2.x86_64-x86_64-with-glibc2.26 - Python version: 3.10.14 - Huggingface_hub version: 0.26.2 - Safetensors ve…

iprivit updated 6 days ago
7
huggingface/peft #1962

Deprecation: Transformers will no longer support `past_key_v…

As reported by @ArthurZucker: > Quick question, I am seeing this in peft: https://github.com/huggingface/peft/blob/f2b6d13f1dbc971c7653aa65e82822ea2d84bb38/src/peft/peft_model.py#L1665 where there …

BenjaminBossan updated 1 month ago
15
AkihikoWatanabe/paper_notes #1399

Sohu, etched, 2024.06

https://www.etched.com/announcing-etched

AkihikoWatanabe updated 1 month ago
1
xlang-ai/instructor-embedding #123

prompt parameters cannot be used?

On the website https://www.sbert.net/docs/sentence_transformer/pretrained_models.html, we can see that the function "model.encode" uses the parameter "prompt". However, I didn't see that the "prompt" parameter was me…

Huangouzm updated 3 months ago
1
ROCm/flash-attention #79

[Issue]: is scaled_dot_product_attention part of flash atten…

### Problem Description I get these errors often from [various applications](https://github.com/pytorch/pytorch/issues/134208), this one is from ComfyUI. Is scaled_dot_product_attention part of fl…

unclemusclez updated 2 months ago
21
linkedin/Liger-Kernel #249

ValueError when Loading Qwen2-VL Model with Liger Kernel

### 🐛 Describe the bug I'm encountering a ValueError when trying to load the Qwen2-VL model using the AutoLigerKernelForCausalLM class from the Liger Kernel. The error message indicates an unrecogn…

rahatarinasir updated 2 months ago
1
pytorch/pytorch #90920

Support for Transformer Models on Android with Vulkan Backen…

### 🚀 The feature, motivation and pitch Hello We are currently using a number of different transformer models (plain BERT encoders with attached classification head) on Android. In order to increa…

martin-schilling updated 1 year ago
4
yifanlu0227/HEAL #31

How can I reproduce where2comm in HEAL?

It seems that the code repository's `hypes_yaml` folder does not provide the configuration file for where2comm.

zymard updated 3 weeks ago
1
NVIDIA/Megatron-LM #1151

[BUG] Context parallel gives NCCL error

**Describe the bug** I am using the `train_gpt3_175b_distributed.sh` script to launch training on a single node with 4 A100 80GB GPUs. Training goes well if I use tensor parallel or pipeline parallel,…

YJHMITWEB updated 1 day ago
1
unslothai/unsloth #888

Cache only has 0 layers, attempted to access layer with inde…

When using: **Mistral 7b Text Completion - Raw Text training full example.ipynb** **Last block errors with:** `Exception in thread Thread-17 (generate): Traceback (most recent call last): File…

arturwplantecs updated 3 months ago
2
