-
**Describe the bug**
I tried to run the LLaVA example and hit a key mismatch error. I am on the latest commit of the main branch (094d66b).
[rank0]: RuntimeError: Error(s) in loading state_dict for LLaVAMode…
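
For anyone debugging a similar mismatch, one way to narrow it down is to diff the parameter names the model expects against those stored in the checkpoint. The sketch below uses plain lists of names and a hypothetical helper, `diff_state_dict_keys`; the key names are made up for illustration and are not taken from the LLaVA example.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, sorted for stable output.

    missing:    keys the model expects but the checkpoint lacks
    unexpected: keys the checkpoint has but the model does not expect
    """
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


# Hypothetical key names, for illustration only.
model_keys = ["encoder.weight", "encoder.bias", "projection.weight"]
checkpoint_keys = ["encoder.weight", "encoder.bias", "proj.weight"]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print("missing:", missing)
print("unexpected:", unexpected)
```

With real PyTorch objects, the first argument would be `model.state_dict().keys()`; how to reach the keys inside the checkpoint depends on how it was saved.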
-
Hi, it seems that the same code **works fine** with the Megatron-LM that I cloned in April. With the latest Megatron-LM, the pretrain_gpt.py code raises the following error. …
-
python run_awq.py --model_name Qwen/Qwen1.5-7B-Chat --task quantize
Namespace(model_name='Qwen/Qwen1.5-7B-Chat', target='aie', profile_layer=False, task='quantize', precision='w4abf16', flash_attenti…
-
**Describe the bug**
I got MLPerf Llama 2 LoRA working with the 24.04-py3 PyTorch image with the [same modifications](https://github.com/mlcommons/training_results_v4.0/blob/main/NVIDIA/benchmarks/llama2_…
-
**Describe the bug**
I tried to fine-tune the `llama3-8B` model on multiple nodes, but I get an AttributeError after the mcore-format checkpoint finishes loading and dataset building starts. The error is below:
…
-
# Description:
Hello! I appreciate the excellent work on benchmarking Performer and Longformer against the base Transformer. I’d like to propose the implementation of additional efficient Transformer…
-
On Ubuntu, I tried to `torch.save` (1.1.0) a model using Linear Attention (0.4.0) and got the following serialization error:
`PicklingError: Can't pickle : attribute lookup on fast_transformers.feature_maps…
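
Errors of this shape typically come from Python's `pickle`, which `torch.save` uses by default, failing to look up an object by its qualified name. That happens with lambdas and with functions defined inside other functions. The sketch below reproduces that kind of failure with the standard library only; `make_feature_map` and `can_pickle` are hypothetical stand-ins, not the fast_transformers code.

```python
import pickle


def make_feature_map(scale):
    # Hypothetical factory: returns a function defined in a local scope.
    def feature_map(x):
        return scale * x
    return feature_map


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


local_fn = make_feature_map(2.0)
print(can_pickle(local_fn))        # function defined inside another function
print(can_pickle({"scale": 2.0}))  # plain data
```

A common workaround is to save `model.state_dict()` instead of the whole model, since it holds tensors rather than the module objects. Whether that applies here depends on what the truncated traceback points to.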
-
I tried to instantiate a BERT model with the following code:
```rust
use candle_core::DType;
use candle_lora::LoraConfig;
use candle_lora_transformers::bert::{BertModel, Config};
use candle_nn::{…
-
```python
def generate_tokenize_dataset_func(dataset_sample):
    prompt = f"""
You are a helpful assistant.
The dataset is huggingface datasets.Dataset.
The first element of the…
-
### System Info
Ubuntu 22.04 all latest versions
### Who can help?
@BenjaminBossan @sayakpaul
### Information
- [ ] The official example scripts
- [x] My own modified scripts
### Ta…