-
### System Info
- `transformers` version: 4.38.1
- Platform: Linux-5.15.146.1-microsoft-standard-WSL2-x86_64-with-glibc2.31
- Python version: 3.10.13
- Huggingface_hub version: 0.20.3
- Safeten…
-
Dear all,
Thank you so much for sharing the Llama 3.2 Vision model fine-tuning script so quickly!
I got the following error when running the demo:
```
The model weights are not tied. Please use t…
```
-
I ran the first command provided (as a sanity check of my setup, since I usually see very high output errors for larger models like LLMs) and I get an output validation error.
I've made sur…
-
### Your current environment
PyTorch version: 2.4.1+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 22.04.3 LTS (x86_64)
GCC version: (U…
-
## Bug Description
https://github.com/pytorch/TensorRT/blob/main/examples/dynamo/mutable_torchtrt_module_example.py
I swapped in a Hugging Face Whisper model in place of the diffusion model.
## To Repr…
-
### 🚀 The feature, motivation and pitch
Enable support for Flash Attention Memory Efficient and SDPA kernels for AMD GPUs.
At present, using these produces the warning below with the latest nightlies (torch==…
-
**Environment:**
1. Framework: (TensorFlow, Keras, PyTorch, MXNet) PyTorch
2. Framework version: latest code from huggingface: https://github.com/huggingface/pytorch-transformers
3. Horovo…
-
## Root Cause
When loading a model, the `from_pretrained` method calls `_autoset_attn_implementation` in transformers/modeling_utils.py, which automatically enables sdpa (the efficient Scaled Dot-Product Attention implementation in newer PyTorch versions):
![_autoset_attn_implementation](htt…
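For context, the sdpa path that gets auto-selected ultimately calls `torch.nn.functional.scaled_dot_product_attention`. A small sketch of what that fused op computes, checked against a manual attention reference (tensor shapes are illustrative):

```python
import torch
import torch.nn.functional as F

# Illustrative shapes: (batch, heads, seq_len, head_dim)
q = torch.randn(1, 2, 4, 8)
k = torch.randn(1, 2, 4, 8)
v = torch.randn(1, 2, 4, 8)

# The fused implementation auto-enabled by _autoset_attn_implementation.
fast = F.scaled_dot_product_attention(q, k, v)

# Manual reference: softmax(QK^T / sqrt(d)) V
scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)
manual = scores.softmax(dim=-1) @ v

print(torch.allclose(fast, manual, atol=1e-5))  # True
```

The results match numerically; the fused path differs only in speed and memory use, which is why transformers opts into it automatically when the installed PyTorch supports it.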
-
https://github.com/NVIDIA/apex/blob/a0f5f3ac0f6bf39feee6e60eee66ec873dc299ab/apex/transformer/pipeline_parallel/p2p_communication.py#L271 might be removable after confirming https://github.co…
-
Issue: `generate` is not possible with fp16 (DeepSpeed).
This was introduced when the fp16 feature landed in https://github.com/lucidrains/DALLE-pytorch/pull/157:
```
/pytorch/aten/src/THC/THC…
```