-
I want to use BackPACK for computing per-sample gradients and was trying to understand the challenges of using a custom model that uses PyTorch nn layers. For example, something like this architecture…
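As background for what "per-sample gradients" means here, a minimal dependency-free sketch (illustrative only, not BackPACK's API): for a 1-D linear model with squared loss, each sample's loss gradient has a closed form, and the batch gradient of the mean loss is just the mean of those per-sample gradients. BackPACK's `BatchGrad` extension collects the analogous per-sample quantity for each supported nn layer's parameters.

```python
# Per-sample gradients for y_hat = w * x with squared loss
# L_i = (w * x_i - y_i) ** 2, so dL_i/dw = 2 * (w * x_i - y_i) * x_i.
# (Sketch only; BackPACK computes this kind of quantity per parameter.)

def per_sample_grads(w, xs, ys):
    return [2.0 * (w * x - y) * x for x, y in zip(xs, ys)]

xs = [1.0, 2.0, 3.0]
ys = [2.0, 3.9, 6.1]
w = 1.5

grads = per_sample_grads(w, xs, ys)
batch_grad = sum(grads) / len(grads)  # gradient of the mean loss

print(grads)       # ≈ [-1.0, -3.6, -9.6]
print(batch_grad)  # ≈ -4.733
```

The point of a library like BackPACK is to get `grads` for every parameter of a network in one backward pass, instead of looping over samples.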
-
Traceback (most recent call last):
File "my_track.py", line 661, in
main(name=name, launcher=args.launcher, use_wandb=args.wandb, **config)
File "my_track.py", line 519, in main
refer…
-
### Describe the bug
This time I set the number of steps to 2 to make sure it correctly saves the model after an hour of training, but it does not.
### Reproduction
Run `accelerate config`
```
comp…
```
kopyl updated 2 weeks ago
-
Hello! Following your code, I applied the Attentioner Manager written for GPT-2 to Llama and obtained saliency scores; each layer's score has shape [1, 1, seq_len, seq_len], and some of the concrete values are as follows:
I would like to know what the saliency score of each layer specifically means?
My code is as follows:
```
class LlamaAttentionManager(AttentionerManagerBase):
…
```
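For context on the question above: in the saliency formulation these attention managers are usually based on, the per-layer score is (by assumption here, not confirmed by the truncated code) the elementwise product of the attention matrix with its loss gradient, |A ⊙ ∂L/∂A|, so entry (i, j) measures how much attention from token i to token j influences the loss. A toy sketch with illustrative names:

```python
# Minimal sketch (assumption: saliency = |A * dL/dA| elementwise, per layer).
# Names and values here are illustrative, not taken from the issue's code.

def saliency(attn, attn_grad):
    """Elementwise |A * dL/dA| for one layer.

    attn, attn_grad: nested lists of shape [seq_len][seq_len].
    Entry (i, j) scores how much the attention of token i to token j
    matters for the loss.
    """
    return [
        [abs(a * g) for a, g in zip(row_a, row_g)]
        for row_a, row_g in zip(attn, attn_grad)
    ]

# Toy 2x2 example: row-stochastic attention, arbitrary gradients.
A = [[0.9, 0.1],
     [0.4, 0.6]]
G = [[0.5, -2.0],
     [1.0, 0.0]]
S = saliency(A, G)
print(S)  # [[0.45, 0.2], [0.4, 0.0]]
```

In practice A and ∂L/∂A come from `attn_weights` saved during the forward pass and their `.grad` after backward; the extra leading [1, 1, …] dims are just batch and (summed) head axes.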
-
Hi,
I am following the article at https://learn.arm.com/learning-paths/servers-and-cloud-computing/pytorch-llama/pytorch-llama/
but at step
```
python torchchat.py export llama3.1 --output-dso-p…
```
-
Hey, thanks for the great work. I could be wrong, but I feel like there is a disconnect between what is mentioned in the Based paper and what is used in the Figure 2 config for MQAR eval. In the paper…
-
I have three discrepancies between what is described in the paper and what I see in the code/blog posts.
1. The recent publishing of MMD had this figure
![image](https://github.com/facebookresearc…
-
# Prerequisites
Please answer the following questions for yourself before submitting an issue.
- [x] I am running the latest code. Development is very rapid so there are no tagged versions as…
-
```sh
(ldm) hugo@DESKTOP:/mnt/d/stable-diffusion$ python scripts/txt2img.py --prompt "a photograph of an astronaut riding a horse" --plms --ckpt models/ldm/text2img256/model.ckpt
Global seed set to …
```
-
Dear team,
Thank you so much for releasing the model. I am trying to integrate the Flux model for a use case that requires the unet and image_encoder. I find in the FluxPipeline there exi…