-
Hi, I would like to evaluate the following capabilities of BigDL LLM in offline PySpark CPU jobs:
- Generating embeddings for queries and documents.
- Generating text using prompts and/or chai…
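For jobs like these, the usual shape is to load the model once per partition and stream records through it with `mapPartitions`. Below is a minimal sketch of that pattern; the stub embedder and all names are illustrative assumptions standing in for the actual BigDL LLM model, not BigDL API:

```python
from typing import Iterator, List

def load_embedder():
    # Stand-in for loading a BigDL LLM embedding model on the executor
    # (hypothetical stub; a real job would load the model here, once per partition).
    return lambda text: [float(len(text))]  # dummy one-dimensional "embedding"

def embed_partition(rows: Iterator[str]) -> Iterator[List[float]]:
    # Load the model once, then stream every row in the partition through it.
    embed = load_embedder()
    for text in rows:
        yield embed(text)

# With PySpark this function would be passed to rdd.mapPartitions(embed_partition);
# locally it can be exercised on any iterator of strings:
vectors = list(embed_partition(iter(["query one", "a longer document"])))
```

Keeping model construction inside the partition function matters on CPU clusters: it avoids serializing the model with the closure and amortizes load time over all rows in the partition.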
-
We have seen a significant performance drop with the env created from the latest repo for vLLM serving of the neural-chat model, compared to the old env built from the old repo. With …
-
Hi, we have tried to run the speculative inference process on OPT-13B and Llama2-70B-chat, but ran into some issues. Specifically, for Llama2-70B-chat we obtained performance worse than vLLM, which seem…
-
The current version requires an Internet connection to download the models the first time it is used after deployment.
Will a future version add a way to deploy without Internet access? This would make LibrePhoto…
-
Use a pre-trained summarization model to create summaries "offline" on your computer, eliminating API costs. However, researching, implementing, and getting this to work could be challenging. There are vari…
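As a rough, dependency-free illustration of the "offline, no API cost" idea, the sketch below scores sentences by word frequency and keeps the top ones. This is a toy extractive baseline, not the pre-trained model approach itself; a real setup would swap in a summarization model in its place:

```python
import re
from collections import Counter

def extractive_summary(text: str, n_sentences: int = 1) -> str:
    # Split into sentences at terminal punctuation.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Score each sentence by the corpus-wide frequency of its words.
    freq = Counter(re.findall(r"\w+", text.lower()))
    scored = sorted(
        sentences,
        key=lambda s: -sum(freq[w] for w in re.findall(r"\w+", s.lower())),
    )
    # Keep the top-scoring sentences, preserving their original order.
    keep = set(scored[:n_sentences])
    return " ".join(s for s in sentences if s in keep)
```

Example: `extractive_summary("Cats purr. Cats sleep a lot. Dogs bark.", 1)` keeps the sentence whose words are most frequent overall.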
-
I have been running the scripts from [https://docs.vllm.ai/en/latest/models/spec_decode.html](https://docs.vllm.ai/en/latest/models/spec_decode.html ) on how to do speculative decoding with vLLM.
H…
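For context, greedy speculative decoding boils down to: the draft model proposes k tokens, the target model verifies them, and the longest matching prefix plus one corrected (or bonus) target token is accepted. A toy sketch of that accept loop over token ids, with both "models" as stub callables rather than vLLM APIs:

```python
from typing import Callable, List

def speculative_step(prefix: List[int],
                     draft: Callable[[List[int]], int],
                     target: Callable[[List[int]], int],
                     k: int = 4) -> List[int]:
    # Draft model proposes k tokens autoregressively.
    proposed, ctx = [], list(prefix)
    for _ in range(k):
        t = draft(ctx)
        proposed.append(t)
        ctx.append(t)
    # Target model verifies position by position: accept while they agree,
    # emit the target's own token at the first mismatch, else one bonus token.
    accepted, ctx = [], list(prefix)
    for t in proposed:
        expect = target(ctx)
        if expect == t:
            accepted.append(t)
            ctx.append(t)
        else:
            accepted.append(expect)
            return accepted
    accepted.append(target(ctx))
    return accepted
```

The point of the trick is that all k verifications can run in one batched target-model forward pass, so each step emits between 1 and k+1 tokens for roughly the cost of a single target step.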
-
I use multi-LoRA for offline inference:

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

sql_lora_path = "/home/zyn/models/slot_lora_gd"
llm = LLM(model="/ho…
```
-
When JabRef with the AI PR is run for the first time, `djl` downloads files in the background in order to work with embedding models (I guess the PyTorch backend and the embedding model).
JabRef already has some issues…
-
First of all, amazing project!
We've started experimenting with the project in an on-premise offline environment, and so far it works great!
We need our extensions to send metrics and events to a centra…
-
### Your current environment
```text
PyTorch version: 2.3.1+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Alibaba Group Enterprise Linux Serv…