-
### What is the issue?
My setup is 4x A100 80GB, 2TB RAM, and dual Intel CPUs, running Ubuntu Server 22.04.
On a previous version of ollama, the model llama3.1:405b was loaded in a reasonable amount of second…
-
I converted the models to `float32` using this script: https://gist.github.com/pcuenca/23cd08443460bc90854e2a6f0f575084, but found precision problems when targeting `float16`. It'd be interesting to s…
-
Hi there,
I have some doubts about the process behind the `select` method.
1. Is there any detailed explanation about what happens under the hood while using `select` and `gen`?
I mean, can `s…
-
- [ ] [I finally got perfect labels (classification task) via prompting : r/LocalLLaMA](https://www.reddit.com/r/LocalLLaMA/comments/1amvfua/i_finally_got_perfect_labels_classification_task/)
# TIT…
-
Post questions here for this week's orienting readings: Veitch, Victor, Dhanya Sridhar & David M. Blei. 2020. “Adapting Text Embeddings for Causal Inference.” Proceedings of the 36th Conference on Un…
-
## Describe the question
In the diarization task, I train on the AMI train-dev set and the ICSI corpus, and I test on the AMI test set. Both datasets contain recordings of 3-5 speakers lasting 50-70 minutes. My d embedding tra…
-
@ggerganov do you have any interest in producing more models in GGML format?
I'm now convinced your zero-dependency, no-memory-allocation, CPU-first ideology will make it accessible to…
-
SINGA has multiple example models at http://singa.apache.org/docs/examples/
Some are implemented from scratch and some are converted from ONNX, which has a bigger model zoo https://github.com/onnx/m…
-
(next 2 comments are for max-autotune, warm start run)
AMP RUN
~~~
+------------------------+------------+-------------+-------------+
| Compiler | torchbench | huggingface | tim…
-
If I understand correctly, the idea is that the model generates belief states, dbsearch results, action, and response conditioned on some dialog context. Then shouldn't we mask the context in between …