-
I want to use BackPACK for computing per-sample gradients and was trying to understand the challenges of using a custom model that uses PyTorch nn layers. For example, something like this architecture…
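As background for what "per-sample gradients" means here, a minimal dependency-free sketch (illustrative only, not BackPACK's API): for a 1-D linear model with squared loss, each sample's loss gradient has a closed form, and the batch gradient of the mean loss is just the mean of those per-sample gradients. BackPACK's `BatchGrad` extension collects the analogous per-sample quantity for each supported nn layer's parameters.

```python
# Per-sample gradients for y_hat = w * x with squared loss
# L_i = (w * x_i - y_i) ** 2, so dL_i/dw = 2 * (w * x_i - y_i) * x_i.
# (Sketch only; BackPACK computes this kind of quantity per parameter.)

def per_sample_grads(w, xs, ys):
    return [2.0 * (w * x - y) * x for x, y in zip(xs, ys)]

xs = [1.0, 2.0, 3.0]
ys = [2.0, 3.9, 6.1]
w = 1.5

grads = per_sample_grads(w, xs, ys)
batch_grad = sum(grads) / len(grads)  # gradient of the mean loss

print(grads)       # ≈ [-1.0, -3.6, -9.6]
print(batch_grad)  # ≈ -4.733
```

The point of a library like BackPACK is to get `grads` for every parameter of a network in one backward pass, instead of looping over samples.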
-
Traceback (most recent call last):
File "my_track.py", line 661, in
main(name=name, launcher=args.launcher, use_wandb=args.wandb, **config)
File "my_track.py", line 519, in main
refer…
-
### Describe the bug
This time I set the number of steps to 2 to make sure it correctly saves the model after an hour of training, but it does not.
### Reproduction
Run `accelerate config`
```
comp…
```
kopyl updated 2 weeks ago
-
Hello! Following your code, I applied the Attentioner Manager written for GPT-2 to Llama and obtained saliency scores; each layer's score has shape [1, 1, seq_len, seq_len], and some of the concrete values are as follows:
I would like to know what the saliency score of each layer specifically means?
My code is as follows:
```
class LlamaAttentionManager(AttentionerManagerBase):
…
```
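For context on the question above: in the saliency formulation these attention managers are usually based on, the per-layer score is (by assumption here, not confirmed by the truncated code) the elementwise product of the attention matrix with its loss gradient, |A ⊙ ∂L/∂A|, so entry (i, j) measures how much attention from token i to token j influences the loss. A toy sketch with illustrative names:

```python
# Minimal sketch (assumption: saliency = |A * dL/dA| elementwise, per layer).
# Names and values here are illustrative, not taken from the issue's code.

def saliency(attn, attn_grad):
    """Elementwise |A * dL/dA| for one layer.

    attn, attn_grad: nested lists of shape [seq_len][seq_len].
    Entry (i, j) scores how much the attention of token i to token j
    matters for the loss.
    """
    return [
        [abs(a * g) for a, g in zip(row_a, row_g)]
        for row_a, row_g in zip(attn, attn_grad)
    ]

# Toy 2x2 example: row-stochastic attention, arbitrary gradients.
A = [[0.9, 0.1],
     [0.4, 0.6]]
G = [[0.5, -2.0],
     [1.0, 0.0]]
S = saliency(A, G)
print(S)  # [[0.45, 0.2], [0.4, 0.0]]
```

In practice A and ∂L/∂A come from `attn_weights` saved during the forward pass and their `.grad` after backward; the extra leading [1, 1, …] dims are just batch and (summed) head axes.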
-
Hi,
I am following the article at https://learn.arm.com/learning-paths/servers-and-cloud-computing/pytorch-llama/pytorch-llama/
but at step
```
python torchchat.py export llama3.1 --output-dso-p…
```
-
Hey, thanks for the great work. I could be wrong, but I feel like there is a disconnect between what is mentioned in the Based paper and what is used in the Figure 2 config for MQAR eval. In the paper…
-
I have three discrepancies between what is described in the paper and what I see in the code/blog posts.
1. The recent publishing of MMD had this figure
![image](https://github.com/facebookresearc…
-
# Prerequisites
Please answer the following questions for yourself before submitting an issue.
- [x] I am running the latest code. Development is very rapid so there are no tagged versions as…
-
```sh
(ldm) hugo@DESKTOP:/mnt/d/stable-diffusion$ python scripts/txt2img.py --prompt "a photograph of an astronaut riding a horse" --plms --ckpt models/ldm/text2img256/model.ckpt
Global seed set to …
```
-
Dear team,
Thank you so much for releasing the model. I am trying to integrate the Flux model for a use case that requires the unet and image_encoder. I find in the FluxPipeline there exi…