-
Is there any plan to restructure the code so it can be used uniformly with Llama2 or with API models (gpt-3.5-turbo, gpt-4), so that this PDF-to-text tool can run on any hardware?
https://github.com/Dicklesworthstone/llama2_ai…
-
llama.cpp: loading model from models\llama-2-7b-chat.ggmlv3.q8_0.bin
error loading model: unknown (magic, version) combination: 67676a74, 00000003; is this really a GGML file?
llama_init_from_file…
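The magic `67676a74` in the error above is the `ggjt` container (the versioned GGML format that predates GGUF), so the failure means this llama.cpp build does not recognize a `ggjt` version-3 file. As a minimal sketch, the header can be inspected directly; the `identify_model_file` helper below is hypothetical, not part of llama.cpp, and assumes the magic is read as a little-endian uint32 the way llama.cpp does:

```python
import struct

# Known llama.cpp model-file magics (hypothetical lookup table for illustration).
MAGICS = {
    0x67676D6C: "ggml (unversioned legacy format)",
    0x67676A74: "ggjt (versioned GGML, pre-GGUF)",
    0x46554747: "gguf (current llama.cpp format)",
}

def identify_model_file(path):
    """Read the first 4 bytes as a little-endian uint32 and look up the format."""
    with open(path, "rb") as f:
        (magic,) = struct.unpack("<I", f.read(4))
    return MAGICS.get(magic, f"unknown magic 0x{magic:08x}")
```

Running this on `llama-2-7b-chat.ggmlv3.q8_0.bin` should report the `ggjt` format, confirming the file itself is well-formed and the mismatch is between the file's format version and what the installed llama.cpp build supports.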
-
Feature request
Nous Research and EleutherAI have released YaRN, which comes in two versions with context sizes of 64k and 128k. The model uses RoFormer-style (rotary) embeddings, distinguishin…
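For context, the RoFormer-style (rotary) embeddings mentioned above rotate each pair of channels by an angle that grows with position, so attention scores depend only on relative position. Below is a minimal NumPy sketch of the split-half variant used by GPT-NeoX/Llama-style models; YaRN itself additionally rescales the per-frequency angles to extend the context window, which is not shown here:

```python
import numpy as np

def rope(x, base=10000.0):
    """Apply rotary position embeddings to x of shape (seq_len, dim).

    Channel i in the first half is paired with channel i in the second half,
    and each pair is rotated by position * base**(-i / (dim/2)).
    """
    seq_len, dim = x.shape
    half = dim // 2
    inv_freq = base ** (-np.arange(half) / half)          # (half,)
    angles = np.outer(np.arange(seq_len), inv_freq)       # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # 2-D rotation applied to each (x1_i, x2_i) pair.
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)
```

The rotation preserves vector norms, and the dot product between a rotated query at position m and a rotated key at position n depends only on m − n, which is the property context-extension methods like YaRN manipulate.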
-
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:42
-
When will Megatron-DeepSpeed support Llama3/Llama3.1 pretraining?
-
Hello. Can you please tell me which evolutionary search hyperparameters (population_size, mutation_numbers, crossover_size, etc.) you used for the 8x context-length extension of Mistral v0.1 or LLaM…
-
## Environment
- RTX8000 GPU
- TensorRT-LLM v0.9.0
## Model
- LLaVA v1.5 7B (LLaMA2 7B)
- fp16 and int8/int4 weight quantization
- batch size = 16
## Script
- official `examples/multimodal/run.…
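Since the setup above mentions int8/int4 weight quantization, here is a minimal NumPy illustration of symmetric per-channel int8 weight quantization. This shows the basic idea only; it is not TensorRT-LLM's implementation, and the function names are ours:

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-output-channel int8 quantization of a weight matrix.

    Each row is scaled so its largest magnitude maps to 127, rounded to
    int8, and the per-row scale is kept for dequantization.
    """
    scale = np.abs(w).max(axis=1, keepdims=True) / 127.0
    scale = np.where(scale == 0, 1.0, scale)          # avoid divide-by-zero on all-zero rows
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float weight matrix from int8 values and scales."""
    return q.astype(np.float32) * scale
```

The reconstruction error per element is bounded by half a quantization step (scale / 2), which is why per-channel scaling typically costs little accuracy at int8 while halving weight memory versus fp16.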
-
Hello! Did anyone hit the following bug when using ZeRO stage 3 (zero_stage3) for Llama2?
```
step3_rlhf_finetuning/rlhf_engine.py:61 in __init__ │
│ …
```
-
# 1. Ollama
## 1. Use the Ollama CLI:
```
ollama serve
ollama run llama2:7b   # one model per run; also e.g. llama2:13b, llama2:70b, llama3, llama3:70b, mistral, dolphin-phi, phi, neural-chat, codellama
ollama list
ollama show
…
-
## 🐛 Bug
## To Reproduce
Steps to reproduce the behavior:
I followed [this Captum tutorial](https://captum.ai/tutorials/Llama2_LLM_Attribution).
My code is here; the only difference is that I changed the model_…