-
### Is there an existing issue for this?
- [X] I have searched the existing issues
### Current Behavior
ChatGLMTokenizer(name_or_path='THUDM/chatglm-6b', vocab_size=64794, model_max_length=1000…
-
The dropout voltage of the 78L05 is 1.7 V, so I'm curious whether the circuit will work when the input voltage is also 5 V from USB.
The input is 5 V and the 78L05's output is also 5 V. From my understanding, the LDO won'…
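
The headroom question reduces to simple arithmetic: a linear regulator stays in regulation only while its input is at least the output voltage plus the dropout voltage. Below is a minimal sketch of that check using only the figures quoted in this post (5 V output, 1.7 V dropout). The function names are hypothetical, and the sketch assumes a fixed dropout value, whereas real dropout varies with load current and temperature.

```python
def min_input_voltage(v_out: float, v_dropout: float) -> float:
    """Lowest input voltage at which the regulator can still hold v_out."""
    return v_out + v_dropout


def stays_in_regulation(v_in: float, v_out: float, v_dropout: float) -> bool:
    """True if v_in leaves at least the dropout voltage of headroom."""
    return v_in >= min_input_voltage(v_out, v_dropout)


# Figures quoted in the post: 5 V nominal output, 1.7 V dropout, 5 V USB input.
required = min_input_voltage(5.0, 1.7)
usb_ok = stays_in_regulation(5.0, 5.0, 1.7)

print(f"Minimum input for regulation: {required:.1f} V")
print(f"5 V USB input is sufficient: {usb_ok}")
```

Under these assumptions, a 5 V input falls short of the required headroom, so the output would be expected to sag below 5 V rather than regulate.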
-
I have a CNN model. I used hls4ml, and all files, including the bitfile, were generated successfully. When I run the deployment code on the FPGA (ZCU104), the predicted output is always zero.
**Tot…
-
Select input model, base model (analog madness v7) in this case.
It works at 512x512 on CPU/GPU; larger sizes throw:
INFO UNet2DConditionModel: 64, 8, 768, False, False …
-
It's easy to overfit, so could adding some dropout layers solve this problem?
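
For context on what a dropout layer does, here is a minimal sketch of inverted dropout in plain Python. It is an illustration only, not the code of any particular framework: the `dropout` function, its parameters, and the sample activations are all hypothetical. During training each value is zeroed with probability `p` and the survivors are scaled by `1 / (1 - p)`; at evaluation time the input passes through unchanged. Dropout is one common regularizer, and whether it resolves the overfitting in this model is something only an experiment can show.

```python
import random


def dropout(values, p=0.5, training=True, rng=None):
    """Inverted dropout over a list of floats (illustrative helper)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        # Evaluation mode: identity, no scaling needed with inverted dropout.
        return list(values)
    rng = rng or random.Random()
    scale = 1.0 / (1.0 - p)  # keeps the expected activation unchanged
    # Each element is dropped independently with probability p.
    return [v * scale if rng.random() >= p else 0.0 for v in values]


activations = [0.5, 1.0, 1.5, 2.0]  # made-up sample values
train_out = dropout(activations, p=0.5, rng=random.Random(0))
eval_out = dropout(activations, p=0.5, training=False)

print("training:", train_out)
print("evaluation:", eval_out)
```

In a real model, the framework's built-in dropout layer would normally be used instead, typically placed after dense or convolutional blocks.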
-
### 🚀 The feature, motivation and pitch
Hello!
I am working on applications of information theory to neural networks [(here)](https://openreview.net/forum?id=bQB6qozaBw).
With my research I show tha…
-
When I try to train a StripedHyena model, I keep getting errors because the StripedHyena modules seem to import from Flash Attention in an outdated way.
example:
AttributeError: mod…
-
modeling_qwen2_vl.py", line 350, in forward
attn_output = F.scaled_dot_product_attention(q, k, v, attention_mask, dropout_p=0.0)
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 122…
-
import logging
import os
import json
import torch
from datasets import load_from_disk
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel…
-
Good afternoon everyone,
I trained the TVA GAN model for 200 epochs using the same parameters, with 1072 images for training (trainA -> thermal, trainB -> visual) and 460 images for validation fr…