-
```
Hi everyone,
I'm IFFI, trying to implement the AS mode of Presence, i.e. running it behind Asterisk.
I have successfully configured and installed Presence, but I don't understand how
to configure this one with…
```
-
I have tested the inference speed of the quantized and unquantized versions of a model that was first fine-tuned on my own dataset. I used **AutoAWQForCausalLM.from_quantized(quant_path, fuse_layers=True, max_seq…
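One way to make such a speed comparison concrete is to time generation and report tokens per second for each variant. A minimal sketch, assuming a `generate_fn` callable that stands in for the tokenizer-plus-`model.generate` call (the helper and the dummy generator below are illustrative, not from the original post):

```python
import time

def measure_throughput(generate_fn, prompt, n_runs=3):
    """Time a text-generation callable and report tokens per second.

    generate_fn(prompt) must return the number of new tokens produced;
    in practice it would wrap the tokenizer and model.generate.
    """
    total_tokens = 0
    start = time.perf_counter()
    for _ in range(n_runs):
        total_tokens += generate_fn(prompt)
    elapsed = time.perf_counter() - start
    return total_tokens / elapsed

# Dummy generator standing in for a real model call.
def fake_generate(prompt):
    return 128  # pretend 128 new tokens were produced

tps = measure_throughput(fake_generate, "Hello", n_runs=2)
print(f"{tps:.1f} tokens/s")
```

Running the same harness once against the AWQ-quantized checkpoint and once against the fp16 one gives directly comparable numbers.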
-
Here is the method I am referring to in my code:
```
def generate_story(scenario):
    tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
    model = AutoModelForCausalLM.fr…
```
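Since the checkpoint above is a Llama-2 *chat* model, the prompt usually needs to be wrapped in its `[INST]`/`<<SYS>>` format before generation. A small sketch of building such a prompt from the scenario string (the helper name and system text are illustrative, not part of the original code):

```python
def build_llama2_chat_prompt(scenario,
                             system="You are a storyteller."):
    """Wrap a user scenario in the Llama-2-chat [INST] prompt format."""
    return (
        f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n"
        f"Write a short story about: {scenario} [/INST]"
    )

prompt = build_llama2_chat_prompt("a lighthouse keeper")
print(prompt)
```

The string returned here would be what `generate_story` passes to the tokenizer; newer Transformers versions can also produce it via the tokenizer's `apply_chat_template`.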
-
Installing the git HEAD setup with Helm using `-f chatqna/gaudi-values.yaml`, and then querying ChatQnA:
```
curl http://${host_ip}:8888/v1/chatqna \
-H "Content-Type: application/json" \
-d '{
…
```
-
It seems that it's not possible to run models on multiple GPUs, e.g. by passing `device_map="auto"` to pipelines.
Is there any way to work around this limitation?
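With stock 🤗 Transformers (independently of whichever backend this question targets), the usual workaround is to shard the model across devices at load time with Accelerate and hand the already-placed model object to the pipeline. A sketch under that assumption; the model id, GPU count, and memory caps are illustrative, and the imports are deferred so the helper can be defined without the libraries installed:

```python
def load_sharded_pipeline(model_id="meta-llama/Llama-2-7b-chat-hf",
                          max_gib_per_gpu=20, n_gpus=2):
    """Load a model sharded across GPUs, then wrap it in a pipeline."""
    from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

    # Capping per-device memory makes Accelerate spread layers
    # over all visible GPUs instead of filling the first one.
    max_memory = {i: f"{max_gib_per_gpu}GiB" for i in range(n_gpus)}
    model = AutoModelForCausalLM.from_pretrained(
        model_id, device_map="auto", max_memory=max_memory)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # Passing a pre-placed model object skips the pipeline's own
    # single-device placement logic.
    return pipeline("text-generation", model=model, tokenizer=tokenizer)
```

Whether this helps depends on whether the backend in question accepts a pre-placed model; if it rejects sharded models outright, the limitation stands.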
-
Hello, we are looking for the best way to deploy TGI on Xeons.
I understand that container images tagged with `x.y.z-intel` are the XPU builds, while `Dockerfile_intel` defines both XPU and CP…