-
When using nerfstudio==1.1.3 and gsplat==1.0.0, this line:
grads = self.xys.absgrad[0].norm(dim=-1) # type: ignore
raises the following error:
File "/usr/local/lib/python3.8/dist-packages/nerfstudio/scripts/train.py", line 2…
-
### Description
I'm trying to scale up some transformer training (currently at ~400M params), and as such I've been playing around with various ways to save memory and improve performance. On a whi…
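Since the post is cut off, only as a general illustration: two of the common memory levers for transformer training are mixed-precision autocast and activation checkpointing. A minimal self-contained PyTorch sketch (toy layer and CPU bf16 assumed, not the poster's actual setup):
```python
import torch
from torch import nn
from torch.utils.checkpoint import checkpoint

# Toy stand-in layer; the real model would be the ~400M-param transformer.
layer = nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True)
opt = torch.optim.AdamW(layer.parameters(), lr=1e-4)
x = torch.randn(8, 128, 512)

# bf16 autocast shrinks activation memory; checkpoint() drops intermediate
# activations and recomputes them during the backward pass.
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    out = checkpoint(layer, x, use_reentrant=False)
    loss = out.float().pow(2).mean()

loss.backward()
opt.step()
```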
-
Hello, author. I want to add ANDMask to the benchmark, but I ran into a problem when running it on the LSA64 dataset. Could you please check whether the ANDMask code is right, and how to resolve this on LSA64, while the …
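For reference, this is my reading of the AND-mask update rule from the ILC paper ("Learning explanations that are hard to vary"), as a minimal sketch; the function and argument names are hypothetical and this is not the repository's implementation:
```python
import torch

def and_mask(env_grads, tau=1.0):
    """Zero gradient components whose sign disagrees across environments.

    env_grads: list of same-shaped gradient tensors, one per environment.
    tau: agreement threshold in [0, 1]; 1.0 requires unanimous signs.
    """
    grads = torch.stack(env_grads)                   # [n_envs, ...]
    agreement = torch.sign(grads).mean(dim=0).abs()  # 1.0 = full agreement
    mask = (agreement >= tau).to(grads.dtype)
    return mask * grads.mean(dim=0)                  # masked mean gradient
```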
-
I'm using a LLaMA3_1-8B-Instruct model that I downloaded earlier with the transformers library (not downloaded via ModelScope). After running trainer.train():
Hoping for an answer, thank you T T
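In case it helps narrow things down, a minimal sketch (the local path is hypothetical) of pointing transformers directly at an already-downloaded checkpoint instead of fetching it from a hub:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical local directory holding the pre-downloaded checkpoint.
model_path = "/path/to/LLaMA3_1-8B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)
```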
-
I have a strange issue with backward(). I have two generators, gen1 and gen2, and I calculate the loss in three ways: loss_1, loss_2, and loss_3.
All computations for gen1 are OK.
Part 1.
let out = gen1.forward(inp…
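Since the snippet is truncated, here is only a general, self-contained sketch (all names hypothetical) of the usual ways to backpropagate several losses that share one graph: either sum them and call backward() once, or retain the graph on every call except the last:
```python
import torch
from torch import nn

# Hypothetical stand-ins: loss_2 and loss_3 reuse gen1's graph through out1.
gen1, gen2 = nn.Linear(8, 8), nn.Linear(8, 8)
inp, target = torch.randn(4, 8), torch.randn(4, 8)
mse = nn.MSELoss()

out1 = gen1(inp)
out2 = gen2(out1)

loss_1 = mse(out1, target)
loss_2 = mse(out2, target)
loss_3 = mse(out1 + out2, target)

(loss_1 + loss_2 + loss_3).backward()  # one pass; the graph is freed once

# Alternatively, keep separate calls but retain the shared graph:
# loss_1.backward(retain_graph=True)
# loss_2.backward(retain_graph=True)
# loss_3.backward()
```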
-
We should find an elegant way to baseline after combining grads in TFR.
Otherwise the ERD appears positive on the grads...
-
Hi, I encountered this error while training llama-2-7B using the script. Any ideas on how to fix it?
Traceback (most recent call last):
  File "qst.py", line 942, in <module>
    train()
  File "qst.py", line …
-
Hi! Congrats on this wonderful work. After reading your paper, I'm really curious about one technique that you use.
In the paper, you said:
> To minimize the interference with the original model…
-
When I load optimizer.pt, it reports that the keys are different:
KeyError: 'base_model.model.model.layers.0.self_attn.q_proj.lora_A.default.weight'
The entries in the optimizer.pt state are keyed 0~255.
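For what it's worth, PyTorch optimizer state dicts key their 'state' entries by integer parameter index (the order in which the parameters were handed to the optimizer), not by parameter name, which would explain the 0~255 keys. A self-contained toy sketch of recovering a name-to-index map (the real names would be the LoRA parameters):
```python
import torch
from torch import nn

# Toy stand-in model; the real checkpoint would come from LoRA training.
model = nn.Linear(4, 4)
opt = torch.optim.AdamW(model.parameters())
model(torch.randn(2, 4)).sum().backward()
opt.step()

state = opt.state_dict()["state"]
print(list(state.keys()))  # [0, 1] -- integer indices, not parameter names

# Recover name -> index from the parameter order the optimizer saw,
# assuming the optimizer was built over model.parameters() in this order.
name_to_idx = {name: i for i, (name, _) in enumerate(model.named_parameters())}
print(name_to_idx)  # {'weight': 0, 'bias': 1}
```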