-
When using nerfstudio==1.1.3 and gsplat==1.0.0, this line:
grads = self.xys.absgrad[0].norm(dim=-1) # type: ignore
raises the following error:
File "/usr/local/lib/python3.8/dist-packages/nerfstudio/scripts/train.py", line 2…
-
### Description
I'm trying to scale up some transformer training (currently at ~400M params), and as such I've been playing around with various ways to save memory and improve performance. On a whi…
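Since the post is cut off, only as a general illustration: two of the common memory levers for transformer training are mixed-precision autocast and activation checkpointing. A minimal self-contained PyTorch sketch (toy layer and CPU bf16 assumed, not the poster's actual setup):
```python
import torch
from torch import nn
from torch.utils.checkpoint import checkpoint

# Toy stand-in layer; the real model would be the ~400M-param transformer.
layer = nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True)
opt = torch.optim.AdamW(layer.parameters(), lr=1e-4)
x = torch.randn(8, 128, 512)

# bf16 autocast shrinks activation memory; checkpoint() drops intermediate
# activations and recomputes them during the backward pass.
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    out = checkpoint(layer, x, use_reentrant=False)
    loss = out.float().pow(2).mean()

loss.backward()
opt.step()
```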
-
Hello, author. I want to add ANDMask to the benchmark, but I ran into a problem when running it on the LSA64 dataset. Could you please check whether the ANDMask code is right, and how to resolve this on LSA64, while the …
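For reference, this is my reading of the AND-mask update rule from the ILC paper ("Learning explanations that are hard to vary"), as a minimal sketch; the function and argument names are hypothetical and this is not the repository's implementation:
```python
import torch

def and_mask(env_grads, tau=1.0):
    """Zero gradient components whose sign disagrees across environments.

    env_grads: list of same-shaped gradient tensors, one per environment.
    tau: agreement threshold in [0, 1]; 1.0 requires unanimous signs.
    """
    grads = torch.stack(env_grads)                   # [n_envs, ...]
    agreement = torch.sign(grads).mean(dim=0).abs()  # 1.0 = full agreement
    mask = (agreement >= tau).to(grads.dtype)
    return mask * grads.mean(dim=0)                  # masked mean gradient
```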
-
I'm using a LLaMA3_1-8B-Instruct model that I downloaded earlier with the transformers library (not downloaded via ModelScope). After running trainer.train():
Hoping for an answer, thank you T T
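In case it helps narrow things down, a minimal sketch (the local path is hypothetical) of pointing transformers directly at an already-downloaded checkpoint instead of fetching it from a hub:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical local directory holding the pre-downloaded checkpoint.
model_path = "/path/to/LLaMA3_1-8B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)
```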
-
I have a strange issue with backward(). I have two generators, gen1 and gen2, and I calculate the loss in three ways: loss_1, loss_2, and loss_3.
All computations for gen1 are OK.
Part 1.
let out = gen1.forward(inp…
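Since the snippet is truncated, here is only a general, self-contained sketch (all names hypothetical) of the usual ways to backpropagate several losses that share one graph: either sum them and call backward() once, or retain the graph on every call except the last:
```python
import torch
from torch import nn

# Hypothetical stand-ins: loss_2 and loss_3 reuse gen1's graph through out1.
gen1, gen2 = nn.Linear(8, 8), nn.Linear(8, 8)
inp, target = torch.randn(4, 8), torch.randn(4, 8)
mse = nn.MSELoss()

out1 = gen1(inp)
out2 = gen2(out1)

loss_1 = mse(out1, target)
loss_2 = mse(out2, target)
loss_3 = mse(out1 + out2, target)

(loss_1 + loss_2 + loss_3).backward()  # one pass; the graph is freed once

# Alternatively, keep separate calls but retain the shared graph:
# loss_1.backward(retain_graph=True)
# loss_2.backward(retain_graph=True)
# loss_3.backward()
```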
-
We should find an elegant way to baseline after combining grads in TFR.
Otherwise the ERD appears positive on the grads...
-
Hi, I encountered this error while training llama-2-7B using the script. Any ideas on how to fix it?
Traceback (most recent call last):
  File "qst.py", line 942, in <module>
    train()
  File "qst.py", line …
-
Hi! Congrats on this wonderful work. After reading your paper, I'm really curious about one technique that you use.
In the paper, you said:
> To minimize the interference with the original model…
-
When I load optimizer.pt, it reports that the keys are different:
KeyError: 'base_model.model.model.layers.0.self_attn.q_proj.lora_A.default.weight'
The entries in the optimizer.pt state are keyed 0~255.
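For what it's worth, PyTorch optimizer state dicts key their 'state' entries by integer parameter index (the order in which the parameters were handed to the optimizer), not by parameter name, which would explain the 0~255 keys. A self-contained toy sketch of recovering a name-to-index map (the real names would be the LoRA parameters):
```python
import torch
from torch import nn

# Toy stand-in model; the real checkpoint would come from LoRA training.
model = nn.Linear(4, 4)
opt = torch.optim.AdamW(model.parameters())
model(torch.randn(2, 4)).sum().backward()
opt.step()

state = opt.state_dict()["state"]
print(list(state.keys()))  # [0, 1] -- integer indices, not parameter names

# Recover name -> index from the parameter order the optimizer saw,
# assuming the optimizer was built over model.parameters() in this order.
name_to_idx = {name: i for i, (name, _) in enumerate(model.named_parameters())}
print(name_to_idx)  # {'weight': 0, 'bias': 1}
```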