linear-attention-model Search Results

1000+ results for linear-attention-model

NVIDIA/Megatron-LM #991

[BUG] clip key mismatch

**Describe the bug** I tried to use the LLaVA example and ran into a key mismatch error. I am on the latest commit of the main branch (094d66b). [rank0]: RuntimeError: Error(s) in loading state_dict for LLaVAMode…

KookHoiKim updated 1 month ago
1
NVIDIA/Megatron-LM #1134

[BUG] 'NoneType' object has no attribute 'shape' error raise…

Hi, it seems that the same code was working fine with the Megatron-LM that I git-cloned in April. With the latest Megatron-LM, I get the following error from the pretrain_gpt.py code. …

hwang2006 updated 1 week ago
8
amd/RyzenAI-SW #125

run_awq.py using qwen1.5-7b-chat when quantize error

python run_awq.py --model_name Qwen/Qwen1.5-7B-Chat --task quantize Namespace(model_name='Qwen/Qwen1.5-7B-Chat', target='aie', profile_layer=False, task='quantize', precision='w4abf16', flash_attenti…

Wikeolf updated 14 hours ago
3
NVIDIA/NeMo #10386

megatron.core.dist_checkpointing.core.CheckpointingException…

**Describe the bug** I got MLPerf LLama2 LoRA working with 24.04-py3 pytorch image with the [same modifications](https://github.com/mlcommons/training_results_v4.0/blob/main/NVIDIA/benchmarks/llama2_…

OrenLeung updated 2 days ago
1
NVIDIA/Megatron-LM #937

[BUG]Get an AtrributeError when trying to finetune llama3-8B…

**Describe the bug** I tried to finetune the `llama3-8B` model on multiple nodes but got an AttributeError after the mcore-format checkpoint finished loading and dataset building started. The error is below: …

nakroy updated 1 week ago
5
infocusp/varformers #1

Suggestion to Implement Additional Efficient Transformer Var…

# Description: Hello! I appreciate the excellent work on benchmarking Performer and Longformer against the base Transformer. I’d like to propose the implementation of additional efficient Transformer…

rajveer43 updated 3 weeks ago
1
idiap/fast-transformers #114

Can't officially save Linear Attention model

Tried (ubuntu) to torch.save (1.1.0) model using Linear Attention (0.4.0) and got the following serialization error: `PicklingError: Can't pickle : attribute lookup on fast_transformers.feature_maps…

maulberto3 updated 3 months ago
2
EricLBuehler/candle-lora #21

Bert model doesn't seem to instantiate with lora weights

I tried to instantiate a bert model with the following code: ```rust use candle_core::DType; use candle_lora::LoraConfig; use candle_lora_transformers::bert::{BertModel, Config}; use candle_nn::{…

jcrist1 updated 1 month ago
2
t46/fukuro-researcher #22

Fix the issue where labels cannot be encoded properly

```python def generate_tokenize_dataset_func(dataset_sample): prompt = f""" You are a helpful assistant. The dataset is huggingface datasets.Dataset. The first element of the…

t46 updated 1 week ago
1
huggingface/peft #2054

Problem with model.merge_and_unload - the saved model is al…

### System Info Ubuntu 22.04 all latest versions ### Who can help? @BenjaminBossan @sayakpaul ### Information - [ ] The official example scripts - [x] My own modified scripts ### Ta…

Oxi84 updated 2 days ago
4
