-
In 05_dataloader.ipynb in the examples folder (https://github.com/NVIDIA/GenerativeAIExamples/blob/v0.4.0/notebooks/05_dataloader.ipynb),
is there a way to use the Llama2 7b model instead of the default Llama2 13b…
-
## Description
It seems that `ttnn.reshape` behaves inconsistently when performed on a single device vs. multi-device. Specifically, the inconsistency can be seen when doing a `[W, X, Y*Z]` -> `[W, X, Y,…
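Based on the portion quoted above, the expected semantics are a pure regrouping of the last dimension, so the values should match element-for-element regardless of how many devices the tensor is spread over. A minimal torch sketch of that reference behaviour (the sizes are illustrative, not the failing case):

```python
# Reference semantics for a [W, X, Y*Z] -> [W, X, Y, Z] reshape, sketched with torch.
# Only the last dimension is regrouped; no data reordering should happen, so a
# single-device and a multi-device ttnn.reshape should both match this.
import torch

W, X, Y, Z = 1, 32, 8, 128  # illustrative sizes
a = torch.arange(W * X * Y * Z, dtype=torch.float32).reshape(W, X, Y * Z)
b = a.reshape(W, X, Y, Z)

# Round-trips without moving any element.
assert torch.equal(b.reshape(W, X, Y * Z), a)
```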
-
Sysinfo
```
python=3.9
```
Import error when using `scripts/convert_hf_to_megatron.sh`:
```
export MEGATRON=/app/Megatron-LM
export CHATLEARN=/app/ChatLearn
cd ${CHATLEARN}/examples/megatron/…
```
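If the failing import is the `megatron` package itself, a quick check like the one below can confirm whether Megatron-LM is actually reachable from Python. This is only a sketch; the package name and the fix of prepending `$MEGATRON` to the path are assumptions, not documented behaviour of the script:

```python
# Check whether Megatron-LM is importable, and try prepending $MEGATRON to sys.path
# (assumption: the ImportError is for the `megatron` package, which Megatron-LM
# expects to be made visible via PYTHONPATH rather than pip-installed).
import os
import sys

megatron_root = os.environ.get("MEGATRON", "/app/Megatron-LM")
if megatron_root not in sys.path:
    sys.path.insert(0, megatron_root)

try:
    import megatron  # noqa: F401
    print("megatron is importable from", megatron_root)
except ImportError as exc:
    print("import still fails:", exc)
```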
-
Hi, thanks for your awesome work. I noticed some problems with GSM8K performance when reproducing the results:
First, I checked that PISSA and LORAGA both use the MetaMathQA dataset and evaluate on GSM8K.
In PISSA …
-
Problem: a new user who doesn't have enough memory for llama2 may get mysterious crashes without error messages.
Specifically, my entire system froze and I had to REISUB when I ran each of the foll…
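One way such a crash could become a clear error instead is a pre-flight memory check before the model is loaded. A minimal sketch, where psutil and the 7 GiB figure are illustrative assumptions rather than anything the project currently does:

```python
# Refuse to load the model when free RAM is clearly insufficient, instead of letting
# the OS grind to a halt under memory pressure (threshold is a rough llama2-7b-class
# figure; adjust per model and quantization).
import sys

import psutil

REQUIRED_GIB = 7.0
available_gib = psutil.virtual_memory().available / (1024 ** 3)

if available_gib < REQUIRED_GIB:
    sys.exit(
        f"Not enough free memory: {available_gib:.1f} GiB available, "
        f"~{REQUIRED_GIB:.0f} GiB needed. Close other programs or pick a smaller/quantized model."
    )
```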
-
Hello, I'm opening this issue because I'm still having problems reproducing the llama 2-7b results (both without pruning and with wanda). Here are my intermediate and final perplexity results wi…
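For context on how these numbers are usually produced, perplexity for llama 2-7b is commonly measured with a fixed-window sweep over wikitext-2. A sketch of that standard procedure (the model id, dataset, and 2048-token window are assumptions and may not match the issue's exact setup):

```python
# Standard fixed-window perplexity on wikitext-2 with Hugging Face transformers.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"  # assumed; substitute the pruned checkpoint as needed
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")
model.eval()

text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
input_ids = tokenizer(text, return_tensors="pt").input_ids

window = 2048  # context length used per forward pass
nlls = []
for start in range(0, input_ids.shape[1] - window, window):
    chunk = input_ids[:, start:start + window].to(model.device)
    with torch.no_grad():
        nlls.append(model(chunk, labels=chunk).loss)  # mean NLL over the chunk

print("perplexity:", torch.exp(torch.stack(nlls).mean()).item())
```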
-
When I run the command:
```
! python3 -m olive.workflows.run --config config_gpu.json
```
I get the following error:
```
/home/z004x2xz/WorkAssignedByMatt/Olive/venvOlive3.11/lib/python3.11/site-pac…
```
-
During the computation of cos/sin in [llama_rope#L119](https://github.com/tenstorrent/tt-metal/blob/skhorasgani/vllm_llama32_mm/models/demos/t3000/llama2_70b/tt/llama_rope.py#L119), when batch size is…
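For reference, the cos/sin tables in question are the standard rotary-embedding tables; a plain torch sketch of that computation, including the per-user gather where a batch-size-dependent path can diverge (head_dim, theta, and the sizes below are illustrative):

```python
# Rotary embedding cos/sin tables, plus the per-user gather used in decode mode.
import torch

head_dim, theta, max_seq_len, batch = 128, 10000.0, 4096, 32

inv_freq = 1.0 / (theta ** (torch.arange(0, head_dim, 2).float() / head_dim))  # [head_dim/2]
positions = torch.arange(max_seq_len, dtype=torch.float32)                     # [max_seq_len]
angles = torch.outer(positions, inv_freq)                                      # [max_seq_len, head_dim/2]
cos, sin = torch.cos(angles), torch.sin(angles)

# In decode mode each user sits at its own position, so the tables are gathered
# per user; the result depends on the batch size.
current_pos = torch.randint(0, max_seq_len, (batch,))
cos_b, sin_b = cos[current_pos], sin[current_pos]                              # [batch, head_dim/2]
```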
-
### System Info
Hi,
I'm having trouble reproducing the numbers NVIDIA claims in the table here: https://nvidia.github.io/TensorRT-LLM/performance/perf-overview.html#throughput-measurements
System Im…
-
Llama.dll, which can be downloaded from the llama.cpp repo, is mostly suitable for programming languages that can work with concepts that are rather difficult for a novice coder, such as pointers, structure…
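To illustrate the hurdle, even a minimal Python binding to llama.dll needs ctypes plumbing before any real call can be made. A sketch, where the exported function name and its zero-argument signature are assumptions about a recent llama.cpp build rather than a documented interface:

```python
# Load llama.dll via ctypes and initialise the backend (names/signatures assumed).
import ctypes

lib = ctypes.CDLL("./llama.dll")       # path to the DLL built from llama.cpp

lib.llama_backend_init.restype = None  # assumed: void llama_backend_init(void)
lib.llama_backend_init()

# Everything beyond this point (loading a model, tokenizing, sampling) requires
# declaring C structs and pointer-returning functions by hand, which is exactly
# what makes the raw DLL hard for users of higher-level languages.
```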