linear-attention-model Search Results

1000+ results
for linear-attention-model

ggerganov/llama.cpp #9664

Bug: Termux adreno 618 vulkan support

### What happened? u0_a227@localhost ~> ./llama.cpp/build/bin/llama-cli -m llama.cpp/models/Qwen2.5-0.5B-Instruct-Q4_K_M.gguf -p "You are a helpful assistant" -cnv -ngl 99 -t 8 -b 64 -tb 8 --ctx-size…

akac97 updated 1 week ago
1
exo-explore/exo #300

Will exo support public internet access and service provider…

It's not an issue, it's something from my brainstorming. I am a Chinese student pursuing a PhD in Korea. About 6 months ago, I had a similar idea just like this repo, trying to make something that …

dengbuqi updated 3 days ago
2
chflame163/ComfyUI_CatVTON_Wrapper #23

[Feature Request] Manually choosing the inpainting unet

Hi! At first, this is an awesome custom node!! Love it! Is there a way or could you implement an option to manually choose a different inpainting unet file? If i just replace the diffusion_pytorch…

MoonMoon82 updated 4 days ago
4
hiyouga/LLaMA-Factory #5425

310P fine-tuning error: RuntimeError: call aclnnCast failed, detail:EZ9999…

### Reminder - [X] I have read the README and searched the existing issues. ### System Info - `llamafactory` version: 0.8.4.dev0 - Platform: Linux-5.4.0-26-generic-aarch64-with-glibc2.31 - Python…

Tao-begd updated 3 weeks ago
2
NVlabs/RADIO #81

Mapping from spatial features to summary feature

Hello, I was wondering if there is a way to map the spatial features (or a crop of them) to the summary feature? I am seeing that the released 2.5 models use CLS tokens for the summary as opposed to …

OasisArtisan updated 1 month ago
3
aqlaboratory/openfold #259

ModuleNotFoundError: No module named 'attn_core_inplace_cuda…

No module named `attn_core_inplace_cuda` was found during inference on A100. Preinstalled CUDA 11.6 outside of conda. ``` Traceback (most recent call last): File "run_pretrained_openfold.py", …

SimonKitSangChu updated 4 months ago
5
pytorch/pytorch #120189

Making Mamba first-class citizen in PyTorch

### 🚀 The feature, motivation and pitch [Mamba](https://arxiv.org/pdf/2312.00752.pdf) is a new SSM (State Space Model) which is developed to address Transformers’ computational inefficiency on long…

yanboliang updated 2 months ago
4
Dao-AILab/flash-attention #475

comparing HF vs FA2 llama2 models

hi, i'm looking over the optimizations in the trainer here, and trying to port them to the `transformers.trainer.Trainer` for use with llama2 i put together this simple script to view the differenc…

tmm1 updated 2 months ago
26
vdogmcgee/SimCSE-Chinese-Pytorch #12

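Results of continued training

Hello, regarding the earlier problem of not being able to get labels at prediction time: besides taking a threshold, my approach was to first train supervised SimCSE on SNLI to obtain a checkpoint, whose Spearman coefficient is about the same as the result in your table. I then added an MLP layer on top of SimCSE and fine-tuned it on SNLI, in a form similar to the following: ``` class SimCSE_with_mlp(nn.Module): def __init__(self, SimCSE_mod…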

prettyprettyboy updated 12 months ago
2
microsoft/MMdnn #877

problem with converting custom Pytorch model to TensorFlow

Platform (like ubuntu 16.04/win10): Ubuntu 16.04.5 LTS Python version: Python 3.7 Source framework with version (like Tensorflow 1.4.1 with GPU): PyTorch 1.6.0 Destination framework with vers…

nestyme updated 4 years ago
1
