-
Greetings Wenjie,
I was very much impressed by your work "SAITS". I am trying to create an attention-based model of my own as part of my Bachelor's project, and I have a few questions to ask:
I wa…
-
### System Info
- `transformers` version: 4.41.2
- Platform: Linux-6.5.0-27-generic-x86_64-with-glibc2.35
- Python version: 3.10.12
- Huggingface_hub version: 0.23.4
- Safetensors version: 0.4.…
-
Why does this error occur?
How do I solve this?
$ python zero_shot.py
/home/cr/miniconda3/envs/backdoor_Medclip/lib/python3.8/site-packages/torchvision/models/_utils.py:208: UserWarning: The par…
-
### System Info
OS version: WSL 2, Ubuntu 22.04
Model: llama3-8B-Instruct
Hardware: no GPU
There is no GPU, but I installed the nvcc library in WSL using this command: `sudo apt install nvidia…
-
In the paper, the ablation study on the attention emb and gen variants is interesting.
Are these all different models, each using a different attention mechanism?
Can I select causal attention for both cases when using G…
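For context on the question above, causal attention is just ordinary scaled dot-product attention with future positions masked out before the softmax, so the same mechanism can in principle be plugged into either attention block. A minimal NumPy sketch (illustrative only, not code from the paper):

```python
import numpy as np

def causal_attention(q, k, v):
    """Single-head causal attention over a (seq_len, dim) sequence."""
    t = q.shape[0]
    scores = q @ k.T / np.sqrt(q.shape[-1])
    mask = np.triu(np.ones((t, t), dtype=bool), k=1)  # True above the diagonal
    scores[mask] = -np.inf                            # hide future tokens
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
q = k = v = rng.normal(size=(4, 8))
out = causal_attention(q, k, v)
print(out.shape)  # (4, 8); row 0 attends only to itself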
-
Following the PagedAttention [paper](https://arxiv.org/pdf/2309.06180), add CUDA kernels for the Llama model. CUDA kernels for the Llama architecture have been widely implemented in the open source c…
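The core idea of PagedAttention is that the KV cache is stored in fixed-size physical blocks, and a per-sequence block table maps logical token positions to scattered blocks, which the kernel gathers before the attention dot products. A NumPy sketch of that indexing scheme (names, block size, and shapes are illustrative assumptions, not vLLM's API):

```python
import numpy as np

BLOCK = 4      # tokens per physical KV block (assumed)
HEAD_DIM = 8   # head dimension (assumed)

def gather_kv(kv_pool, block_table, seq_len):
    """Reassemble one sequence's K (or V) rows from the paged pool."""
    rows = []
    for pos in range(seq_len):
        blk = block_table[pos // BLOCK]   # physical block index
        off = pos % BLOCK                 # offset inside the block
        rows.append(kv_pool[blk, off])
    return np.stack(rows)

def paged_attention(q, k_pool, v_pool, block_table, seq_len):
    k = gather_kv(k_pool, block_table, seq_len)
    v = gather_kv(v_pool, block_table, seq_len)
    scores = (k @ q) / np.sqrt(HEAD_DIM)  # one query against the paged cache
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ v

rng = np.random.default_rng(0)
pool_k = rng.normal(size=(16, BLOCK, HEAD_DIM))  # 16 physical blocks
pool_v = rng.normal(size=(16, BLOCK, HEAD_DIM))
table = [7, 2, 11]                               # non-contiguous blocks
q = rng.normal(size=HEAD_DIM)
out = paged_attention(q, pool_k, pool_v, table, seq_len=10)
print(out.shape)  # (8,)
```

A real CUDA kernel fuses the gather with the dot products instead of materializing the contiguous K/V, which is where the memory savings come from.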
-
### System Info
- `transformers` version: 4.41.1
- Platform: Linux-5.15.0-1055-aws-x86_64-with-glibc2.35
- Python version: 3.10.14
- Huggingface_hub version: 0.23.0
- Safetensors version: 0.4.3
…
-
Hi, I want to use examples/pytorch/language-modeling/run_clm.py to train my model, but I find that the only way to use flash_attention is to modify the code in run_clm.py like:
```python
…
-
On Ubuntu, I tried to torch.save (PyTorch 1.1.0) a model using Linear Attention (fast-transformers 0.4.0) and got the following serialization error:
`PicklingError: Can't pickle : attribute lookup on fast_transformers.feature_maps…
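This kind of PicklingError usually means the model holds an attribute created by a factory or lambda (feature maps are often built this way), and locally defined functions can't be pickled because they aren't importable by name. A standalone sketch of the failure mode and the usual workaround of serializing only plain state (analogous to `torch.save(model.state_dict())`), with illustrative names:

```python
import pickle

def make_feature_map(dim):
    # Factory returning a locally defined function: roughly the pattern
    # behind the error, since inner functions can't be pickled by name.
    def feature_map(x):
        return x * dim
    return feature_map

class Model:
    def __init__(self):
        self.feature_map = make_feature_map(4)  # closure attribute
        self.weights = [1.0, 2.0]               # plain data pickles fine

model = Model()
try:
    pickle.dumps(model)  # fails: the inner function isn't importable
    failed = False
except (pickle.PicklingError, AttributeError) as exc:
    failed = True
    print(type(exc).__name__)

# Workaround: serialize only plain data (like a state_dict) and rebuild
# the function-valued attributes after loading.
restored = pickle.loads(pickle.dumps({"weights": model.weights}))
print(failed, restored["weights"])
```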
-
Let's look at the code directly:
```python
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("all-MiniLM-L6-v2").to('cpu').eval()
model.encode(['test'], convert…