-
```
Hi everyone,
I'm IFFI, trying to implement the AS mode of Presence, i.e. running it behind Asterisk.
I have successfully configured and installed Presence, but I don't understand how
to configure this one with…
```
-
I have tested the inference speed of the quantized and unquantized versions of a model that was first fine-tuned on my own dataset. I used **AutoAWQForCausalLM.from_quantized(quant_path, fuse_layers=True, max_seq…
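One way to make such a speed comparison concrete is to time generation and report tokens per second for each variant. A minimal sketch, assuming a `generate_fn` callable that stands in for the tokenizer-plus-`model.generate` call (the helper and the dummy generator below are illustrative, not from the original post):

```python
import time

def measure_throughput(generate_fn, prompt, n_runs=3):
    """Time a text-generation callable and report tokens per second.

    generate_fn(prompt) must return the number of new tokens produced;
    in practice it would wrap the tokenizer and model.generate.
    """
    total_tokens = 0
    start = time.perf_counter()
    for _ in range(n_runs):
        total_tokens += generate_fn(prompt)
    elapsed = time.perf_counter() - start
    return total_tokens / elapsed

# Dummy generator standing in for a real model call.
def fake_generate(prompt):
    return 128  # pretend 128 new tokens were produced

tps = measure_throughput(fake_generate, "Hello", n_runs=2)
print(f"{tps:.1f} tokens/s")
```

Running the same harness once against the AWQ-quantized checkpoint and once against the fp16 one gives directly comparable numbers.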
-
Here is the method I am referring to in my code:
```
def generate_story(scenario):
    tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
    model = AutoModelForCausalLM.fr…
```
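Since the checkpoint above is a Llama-2 *chat* model, the prompt usually needs to be wrapped in its `[INST]`/`<<SYS>>` format before generation. A small sketch of building such a prompt from the scenario string (the helper name and system text are illustrative, not part of the original code):

```python
def build_llama2_chat_prompt(scenario,
                             system="You are a storyteller."):
    """Wrap a user scenario in the Llama-2-chat [INST] prompt format."""
    return (
        f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n"
        f"Write a short story about: {scenario} [/INST]"
    )

prompt = build_llama2_chat_prompt("a lighthouse keeper")
print(prompt)
```

The string returned here would be what `generate_story` passes to the tokenizer; newer Transformers versions can also produce it via the tokenizer's `apply_chat_template`.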
-
Installing the git HEAD setup with Helm using `-f chatqna/gaudi-values.yaml`, and then querying ChatQnA:
```
curl http://${host_ip}:8888/v1/chatqna \
-H "Content-Type: application/json" \
-d '{
…
```
-
It seems that it's not possible to run models on multiple GPUs, e.g. by passing `device_map="auto"` to pipelines.
Is there any way to work around this limitation?
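With stock 🤗 Transformers (independently of whichever backend this question targets), the usual workaround is to shard the model across devices at load time with Accelerate and hand the already-placed model object to the pipeline. A sketch under that assumption; the model id, GPU count, and memory caps are illustrative, and the imports are deferred so the helper can be defined without the libraries installed:

```python
def load_sharded_pipeline(model_id="meta-llama/Llama-2-7b-chat-hf",
                          max_gib_per_gpu=20, n_gpus=2):
    """Load a model sharded across GPUs, then wrap it in a pipeline."""
    from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

    # Capping per-device memory makes Accelerate spread layers
    # over all visible GPUs instead of filling the first one.
    max_memory = {i: f"{max_gib_per_gpu}GiB" for i in range(n_gpus)}
    model = AutoModelForCausalLM.from_pretrained(
        model_id, device_map="auto", max_memory=max_memory)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # Passing a pre-placed model object skips the pipeline's own
    # single-device placement logic.
    return pipeline("text-generation", model=model, tokenizer=tokenizer)
```

Whether this helps depends on whether the backend in question accepts a pre-placed model; if it rejects sharded models outright, the limitation stands.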
-
Hello, we are looking for the best way to deploy TGI on Xeons.
I understand that container images tagged with `x.y.z-intel` are the XPU builds, while `Dockerfile_intel` defines both XPU and CP…