-
# Prerequisites
Please answer the following questions for yourself before submitting an issue.
- [X] I am running the latest code. Development is very rapid so there are no tagged versions as of…
-
### Describe the issue
I am facing errors with DataParallel.
-
# Description
Current challenges in using Neural Operators include irregular meshes, multiple inputs, multiple inputs on different meshes, and multi-scale problems. [1] The Attention mechanism is promi…
-
I tried to add a TelechatForSequenceClassification class to modeling_telechat in the TeleChat (Xingchen) open-source repository, modeled on the Qwen implementation and on TeleChat's own code respectively. The two attempts fail in different ways: one cannot load the model, and the other trains but the loss does not decrease. We would like the AI company's help in figuring out how to support the AutoModelForSequenceClassification task.
class TelechatForSequenceClassif…
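A common pitfall when porting a Qwen-style sequence-classification head is the pooling step: the classification logits are read from the hidden state of the last non-padding token, which is located via the pad token id. Below is a minimal standalone sketch of that index computation in plain Python; the names and the pad id are hypothetical and not taken from the Telechat code:

```python
# Sketch of the "last non-pad token" pooling used by Qwen-style
# *ForSequenceClassification heads. Standalone illustration only;
# the real head would apply a linear score layer at these indices.

PAD_TOKEN_ID = 0  # assumption for this sketch; Telechat's real pad id may differ

def last_non_pad_index(input_ids, pad_token_id=PAD_TOKEN_ID):
    """Return, per sequence, the index of the last non-pad token."""
    indices = []
    for seq in input_ids:
        idx = 0
        for i, tok in enumerate(seq):
            if tok != pad_token_id:
                idx = i  # keeps advancing until the last real token
        indices.append(idx)
    return indices

# Right-padded batch: logits should be read at positions 2 and 1.
batch = [[5, 7, 9, 0, 0],
         [3, 4, 0, 0, 0]]
print(last_non_pad_index(batch))  # → [2, 1]
```

If the pooling instead reads the literal last position of a right-padded batch (a pad token), the classifier only ever sees padding embeddings, which is one plausible way to get a training loss that never decreases; it may be worth checking this step in both ports.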
tcoln updated 2 weeks ago
-
I tried to fine-tune [TinyLlama](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0) with this crate. After training, the saved safetensors file contains only two tensors:
```
lora_llama.b0
lora_…
```
-
### Checklist
- [X] The issue has not been resolved by following the [troubleshooting guide](https://github.com/lllyasviel/Fooocus/blob/main/troubleshoot.md)
- [ ] The issue exists on a clean install…
-
### Model description
BASED is an attention model that combines sliding-window attention with global linear attention to capture dependencies similar to those of transformers at subquadratic cost.
It …
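The sliding-window component of such a model can be illustrated by the attention mask it induces: each query position attends only to itself and the previous `w - 1` positions, forming a causal band. A small standalone sketch of that mask, not taken from the BASED code:

```python
# Causal sliding-window attention mask: position i may attend to
# positions j with 0 <= i - j < window. Standalone illustration only;
# not the actual BASED implementation.

def sliding_window_mask(seq_len, window):
    """1 where attention is allowed, 0 where it is masked out."""
    return [[1 if 0 <= i - j < window else 0
             for j in range(seq_len)]
            for i in range(seq_len)]

for row in sliding_window_mask(5, 2):
    print(row)
# each row i allows j in {i-1, i}, e.g. row 3 → [0, 0, 1, 1, 0]
```

The global linear-attention half, roughly speaking, replaces softmax attention with a kernel feature map so that the context can be summarized in a fixed-size state, which is what keeps the overall cost subquadratic; the window handles precise local dependencies the linear part smooths over.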
-
Dear authors,
I encountered weight-explosion problems while integrating LoRA into torchtitan. I am running the train_configs/llama3_8b.toml config with run_llama_train.sh on 4 A10 24GB GPUs. PyT…
-
Hello, thank you for the amazing work. Is it possible to use QLoRA to fine-tune the 4-bit quantized models?
-
Hello,
congratulations on your awesome work!
When I try to use my own TIFF files, I get a black image as the result.
Am I missing something?
Thank you for your effort, and I look forwar…