-
the script I use is https://github.com/intel-analytics/BigDL/blob/main/python/llm/example/GPU/HF-Transformers-AutoModels/Model/llama2/generate.py
with the model Llama-2-70b-hf, the output sometimes is e…
-
The system prompt in the [llama2 blog post](https://huggingface.co/blog/llama2) contains an extra space and newline compared to the [original](https://github.com/facebookresearch/llama/blob/6c7f…
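Such whitespace discrepancies are easy to miss by eye. A minimal sketch (using placeholder strings, not the actual Llama 2 prompts) of how `repr()` makes a trailing space and newline visible:

```python
# Illustrative placeholders only -- NOT the real Llama 2 system prompts.
blog_version = "You are a helpful assistant. \n"   # extra trailing space + newline
original_version = "You are a helpful assistant."

# The strings compare unequal even though they look identical when printed.
print(blog_version == original_version)  # False

# repr() exposes the hidden whitespace at the end of the first string.
print(repr(blog_version))
print(repr(original_version))
```

Since the tokenizer sees the raw string, even an invisible difference like this can change the tokenization of the prompt.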
-
Hi,
I'm trying to use your wonderful framework for inference only. However, I'm not familiar with the serving-related settings in your code. How can I remove them, or change the code a bit?
By the way, …
-
Llama2-Chinese-13b-chat online demo: the answers have all switched to English
![image](https://github.com/FlagAlpha/Llama2-Chinese/assets/15713149/682a7e05-6f02-4af2-961f-ced734a402f7)
-
### Prerequisites
- [X] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- [X] I reviewed the [Discussions](https://git…
-
I heard that CUDA is not actually needed when CRAG is running. Is that so?
> there is no NVIDIA CUDA on the Mac Apple Silicon series computers.
errors:
```
Preparing metadat…
```
-
## setting
server command: mlc_llm serve mlc-llama2-7b-q4 --overrides "tensor_parallel_shards=2" --mode server
request: the request rate is 20 requests/s
gpu: a40
## ❓ General Questions
10 request/…
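For context, a rough back-of-the-envelope sketch of the aggregate decode throughput such a request rate implies (the average output length here is an assumed number, not taken from the issue):

```python
# Back-of-the-envelope estimate of sustained decode throughput.
request_rate = 20          # requests per second (from the issue)
avg_output_tokens = 256    # ASSUMED average completion length, not measured

# Tokens per second the server must generate just to keep up.
required_tokens_per_s = request_rate * avg_output_tokens
print(required_tokens_per_s)  # 5120
```

If the two-way tensor-parallel A40 setup decodes fewer tokens per second than this, requests will queue and latency will grow without bound.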
-
When I run the Python script, the first agent works fine, but when it's time for the next agent to do its task I get this error:
python3 main.py …
-
1. offline serving
![image](https://github.com/vllm-project/vllm/assets/43260218/87e216b5-9064-4c2a-a021-cac08e22795d)
2. online serving(fastapi)
![image](https://github.com/vllm-project/vllm/ass…
-
Following https://soulteary.com/2023/07/23/build-llama2-chinese-large-model-that-can-run-on-cpu.html
On an Apple M2, using the final docker image `soulteary/llama2:runtime` to run `Chinese-Llama-2-7b-ggml-q4.bin`
```bash
main:…
```