-
Is this a known issue?
This is the output:
```shell
Microsoft Windows 10 Pro 10.0.19045 with NaNGB and AMD Ryzen 9 5950X 16-Core Processor with 32 cores
GPU Info:
NVIDIA NVIDIA GeForce RTX 30…
```
-
Llama 2 is available on Replicate; you can even [fine-tune your own version](https://replicate.com/blog/fine-tune-llama-2) there…
Streaming support is the killer feature that makes LLMs come alive in …
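The point about streaming can be sketched without any particular provider: instead of returning one finished completion, the backend yields chunks as they are generated and the UI flushes each one immediately. A minimal generator-based sketch (the model here is a canned stand-in, not Replicate's or any real API):

```python
from typing import Iterator

def stream_tokens(prompt: str) -> Iterator[str]:
    """Yield the reply piece by piece instead of waiting for the
    whole completion (a canned stand-in for a real model call)."""
    canned_reply = "Llama 2 says hello".split()
    for word in canned_reply:
        yield word + " "

def consume(prompt: str) -> str:
    # A real chat UI would flush each chunk to the client as it
    # arrives; here we just collect them to show the pattern.
    chunks = []
    for chunk in stream_tokens(prompt):
        chunks.append(chunk)
    return "".join(chunks).strip()
```

The interface is the important part: anything that exposes an iterator of chunks can be wired into server-sent events or a WebSocket with no change to the generation loop.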
-
Ref https://huggingface.co/papers/2310.11453
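The linked paper is BitNet (1-bit Transformers). As a rough illustration of its core idea, here is a sketch of binarizing a weight matrix to {-1, +1} with an absolute-mean scale; this is a simplification for intuition, not the paper's exact training recipe:

```python
def binarize(weights: list[list[float]]) -> tuple[list[list[int]], float]:
    """Binarize weights around their mean to +/-1 and return an
    absolute-mean scale, so W is approximated by scale * sign(W - mean).
    (Simplified sketch of the 1-bit weight idea, not the full method.)"""
    flat = [w for row in weights for w in row]
    mean = sum(flat) / len(flat)
    # Scale chosen as the mean absolute deviation of the weights.
    scale = sum(abs(w - mean) for w in flat) / len(flat)
    signs = [[1 if w - mean >= 0 else -1 for w in row] for row in weights]
    return signs, scale
```

Each weight then costs one bit plus a shared per-matrix scale, which is where the memory and bandwidth savings come from.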
-
This is an extension of the main Turbine refactoring work: https://github.com/nod-ai/SHARK/issues/1931. To enable future performance-related work, we should recreate the 1.0 benchmarking mode from `vicun…
kuhar updated 9 months ago
-
Is there any reason why we have an [accuracy upper limit for LLAMA2 Tokens per sample](https://github.com/mlcommons/inference/blob/master/tools/submission/submission_checker.py#L109) but not for GPT-J…
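A checker of that kind typically accepts a run only if its mean generated tokens per sample falls inside a band around a reference value. A hedged sketch of such a bounds check; the function name, the reference argument, and the ±10% band here are illustrative, not the actual `submission_checker.py` values:

```python
def tokens_within_limits(mean_tokens: float, reference: float,
                         lower_frac: float = 0.9,
                         upper_frac: float = 1.1) -> bool:
    """Accept a run only if its mean tokens per sample lies within
    [lower_frac, upper_frac] of the reference (illustrative bounds)."""
    return reference * lower_frac <= mean_tokens <= reference * upper_frac
```

An upper bound matters for generative benchmarks because emitting extra tokens inflates measured tokens/second; without it, longer-than-reference outputs could game the throughput metric.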
-
Hi @HuskyInSalt, I found CRAG very interesting and would like to introduce it to my lab.
However, I have some questions about the experiment.
1. I can see a difference between Table 1 and Table 2, even …
-
### Your current environment
```text
PyTorch version: 2.2.1+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 22.04.4 LTS (x86_64)
GCC version: (U…
```
-
```shell
$ python3 -m convert_llama_ckpt --base-model-path /llama2-7b-hf/ --pax-model-path pax_7B/ --model-size 7b
Loading the base model from /llama2-7b-hf/
Traceback (most recent call last):
  File "/opt…
```
-
I installed tensorrtllm_backend in the following way:
1. `docker pull nvcr.io/nvidia/tritonserver:23.12-trtllm-python-py3`
2. `docker run -v /data2/share/:/data/ -v /mnt/sdb/benchmark/xiangrui:/root…
xxyux updated 1 month ago