-
When I tried to reproduce the original model results of mistral-7b-v0.2 without `flash-attn`, I got this error:
```
Traceback (most recent call last):
File "/home/yuanye/long_llm/InfLLM/benchmark/pr…
```
-
We want to move from pickled objects saved by `torch` or `torch.jit` to the safetensors format for the weights of `docling-ibm-models`. This has various advantages, such as better security, and also acts …
-
### 🚀 The feature, motivation and pitch
For this code:
```python
import torch
class DummyModule(torch.nn.Module):
    def __init__(self):
        super(DummyModule, self).__init__()
        …
```
-
When I run validate.py, I encounter the following error:
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:05
-
Hi, when I try to run your demo in the PiA part, I get an error at the 'instruction tuning' step:
```
root@0de6f5c3da0f:/workspace/zt/code/Sequence-Scheduling# bash train.sh
[2024-10-02 22:24:40,711] …
```
-
The CUDA and torch versions used in this project look outdated.
Can I use CUDA 12.1 instead, or would that cause bugs?
-
Dear author:
Thanks a lot for your great contribution to multi-task policy learning.
Do you have any suggestions for debugging the following issue?
When I run the following command line, I get a runtime error:
RuntimeError: …
-
Thanks for your awesome work.
While looking through the code of Mamba and Mamba2, I got really confused about the dimension of the parameter dt. I understand that delta is used to discretize A and B in the SSM. …
-
### 🐛 Describe the bug
The following benchmarks are at least 3x, if not 10x, slower on mps than on cpu on a recent MacBook Pro M3:
```python
from torch.utils.benchmark import Timer
import torch
pri…
```
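A reduced, self-contained sketch of such a comparison with `torch.utils.benchmark.Timer` (the matmul workload here is a hypothetical stand-in for the original benchmarks, and mps is skipped when unavailable):

```python
import torch
from torch.utils.benchmark import Timer

def time_matmul(device: str) -> float:
    # Time a 1024x1024 matmul on the given device; median seconds per run
    x = torch.randn(1024, 1024, device=device)
    t = Timer(stmt="x @ x", globals={"x": x})
    return t.blocked_autorange(min_run_time=0.2).median

cpu_s = time_matmul("cpu")
if torch.backends.mps.is_available():
    mps_s = time_matmul("mps")
    print(f"cpu {cpu_s:.6f}s  mps {mps_s:.6f}s  ratio {mps_s / cpu_s:.2f}x")
else:
    print(f"cpu {cpu_s:.6f}s (mps unavailable)")
```

`blocked_autorange` also handles device synchronization, which naive `time.time()` loops on mps get wrong.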
-
When I try to run `patch_model_for_compiled_runtime` with 8bit + aten, the program reports an error. How can I solve this problem?
![image](https://github.com/user-attachments/assets/f0a85477-f36e-4081-b…