-
Hi, I was wondering if this method can be used for trimming large vocabularies in LLMs. Can the vocab trimmer be extended to LLMs?
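Not an answer from the maintainers, but trimming usually comes down to selecting surviving rows of the input embedding and the LM head. A minimal sketch with Hugging Face transformers, where the model name and the `keep_ids` set are placeholder assumptions:
```
import torch
from transformers import AutoModelForCausalLM

# Hypothetical setup: `keep_ids` would normally be the set of token ids
# observed when tokenizing the target corpus. Values here are placeholders.
model = AutoModelForCausalLM.from_pretrained("gpt2")
keep_ids = torch.tensor([0, 1, 2, 50256])

emb = model.get_input_embeddings().weight.data            # shape (V, d)
new_emb = torch.nn.Embedding(len(keep_ids), emb.shape[1])
new_emb.weight.data.copy_(emb[keep_ids])                  # keep surviving rows
model.set_input_embeddings(new_emb)

# Trim the LM head with the same row selection so logits stay aligned
# with the new, smaller vocabulary. Tied-embedding models need extra care.
head = model.get_output_embeddings()
new_head = torch.nn.Linear(emb.shape[1], len(keep_ids), bias=False)
new_head.weight.data.copy_(head.weight.data[keep_ids])
model.set_output_embeddings(new_head)

model.config.vocab_size = len(keep_ids)
```
The tokenizer would also have to be remapped to the surviving ids, which tends to be the harder part in practice.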
-
(Issue found in the EarthPortal, using code from this repository up to October 2023)
When loading a SKOS vocabulary that has a large number of topConcepts (the vocabulary is just a huge list of terms…
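For context, a small rdflib sketch (file name hypothetical) showing the shape of the problem being reported — a flat vocabulary where nearly every concept is declared a top concept:
```
from rdflib import Graph
from rdflib.namespace import RDF, SKOS

g = Graph()
g.parse("vocabulary.ttl")  # hypothetical file

# In a flat, list-like vocabulary this count approaches the concept count.
top_concepts = set(g.objects(predicate=SKOS.hasTopConcept))
concepts = set(g.subjects(RDF.type, SKOS.Concept))
print(f"{len(top_concepts)} top concepts out of {len(concepts)} concepts")
```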
-
Hi. I'm using Top2Vec for a project and this is how I have configured the model:
```
model = Top2Vec(documents=texts_unified,
                min_count=10,
                topic_merge_delta…
```
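The snippet above is cut off; a complete configuration along those lines might look as follows — everything except min_count=10 is an assumed value, not the poster's actual settings:
```
from top2vec import Top2Vec

# Assumed values for all parameters except min_count, shown above.
model = Top2Vec(documents=texts_unified,
                min_count=10,            # drop words in fewer than 10 documents
                topic_merge_delta=0.1,   # threshold for merging duplicate topics
                speed="learn",           # doc2vec speed/quality trade-off
                workers=8)               # parallel training threads
```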
-
### Metadata
- Authors: Wenlin Chen, David Grangier and Michael Auli
- Organization: Washington University in St. Louis and Facebook AI Research
- Conference: ACL 2016
- Paper: https://arxiv.org/pdf/1512.04906…
-
Description
We need to create a "Core Board" for the cboard-ai-engine based on Core Vocabulary, using a specific prompt. The Core Board should focus on high-frequency words that provide broad utility…
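Purely as an illustration (not the "specific prompt" referenced above), the request could be assembled along these lines, with a placeholder word list:
```
# Hypothetical sketch: build a prompt asking an LLM for a core board as JSON.
# The word list and prompt wording are illustrative only.
CORE_WORDS = ["I", "you", "want", "go", "more", "stop", "help", "like", "not", "what"]

prompt = (
    "Create a 'Core Board' for an AAC user. Use only high-frequency core "
    f"vocabulary such as: {', '.join(CORE_WORDS)}. "
    "Return JSON with fields: name, and tiles (each tile has an id and a label)."
)
```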
-
Using TabbyAPI/exllamav2 with Llama3.1-8B
Threadripper Pro/A6000 GPU
Inference at ~70t/s unconstrained, single request. ~35t/s with lm-format-enforcer (JSON schema)
Running 30 simultaneous reques…
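For anyone trying to reproduce the slowdown outside that stack, a rough harness using lm-format-enforcer's transformers integration (not the poster's TabbyAPI/exllamav2 setup; the model name and schema are assumptions):
```
from transformers import AutoModelForCausalLM, AutoTokenizer
from lmformatenforcer import JsonSchemaParser
from lmformatenforcer.integrations.transformers import (
    build_transformers_prefix_allowed_tokens_fn,
)

# Assumed model and schema for illustration.
name = "meta-llama/Llama-3.1-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

schema = {"type": "object", "properties": {"name": {"type": "string"}}}
prefix_fn = build_transformers_prefix_allowed_tokens_fn(tokenizer, JsonSchemaParser(schema))

inputs = tokenizer("Return a JSON object with a name field:", return_tensors="pt")
# The per-step token filtering here is where constrained decoding pays its
# overhead, consistent with the roughly 2x throughput drop reported above.
out = model.generate(**inputs, max_new_tokens=64, prefix_allowed_tokens_fn=prefix_fn)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```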
-
### Have you searched existing issues? 🔎
- [X] I have searched and found no existing issues
### Describe the bug
I have built a model with partial_fit (using the code found in the documentation). …
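For reference, the documentation pattern presumably being followed (assuming this refers to BERTopic's online topic modeling example) trains incrementally roughly like this, with `doc_chunks` as a hypothetical iterable of document batches:
```
from sklearn.cluster import MiniBatchKMeans
from sklearn.decomposition import IncrementalPCA
from bertopic import BERTopic
from bertopic.vectorizers import OnlineCountVectorizer

# Online-learning components: each supports incremental (partial) fitting.
topic_model = BERTopic(umap_model=IncrementalPCA(n_components=5),
                       hdbscan_model=MiniBatchKMeans(n_clusters=50),
                       vectorizer_model=OnlineCountVectorizer(stop_words="english"))

for chunk in doc_chunks:  # hypothetical iterable of document batches
    topic_model.partial_fit(chunk)
```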
-
Hello, for testing purposes we wanted to see whether approximate vocabulary is faster than vocabulary when there are many features (we have 36 features to analyze). In the past we hit the graph too large e…
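A sketch of the comparison being described, assuming TensorFlow Transform's tft.vocabulary versus tft.experimental.approximate_vocabulary (API names per recent TFT releases; feature names and top_k are placeholders):
```
import tensorflow_transform as tft

def preprocessing_fn(inputs):
    outputs = {}
    for name, x in inputs.items():
        # Exact vocabulary: one analyzer per feature. With many features the
        # resulting TensorFlow graph can exceed size limits ("graph too large").
        tft.vocabulary(x, vocab_filename=f"{name}_vocab")
        # Approximate alternative, intended to be cheaper at scale.
        tft.experimental.approximate_vocabulary(
            x, top_k=100_000, vocab_filename=f"{name}_vocab_approx")
        outputs[name] = x
    return outputs
```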
-
Look again at why exactly large pages are taking so long to render. One example had the /ns/creator main page taking 24 seconds.
Local profiling could help pinpoint problems. Check out https://gith…
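Assuming a Python renderer, a generic starting point for that local profiling, where `render_page` is a hypothetical stand-in for whatever builds the /ns/creator page:
```
import cProfile
import pstats

cProfile.run("render_page('/ns/creator')", "render.prof")  # hypothetical entry point

stats = pstats.Stats("render.prof")
stats.sort_stats("cumulative").print_stats(20)  # show the top 20 offenders
```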
-
Since the latest models, such as Llama 3 and Gemma, adopt extremely large vocabularies (128-256K), the logits tensor can consume a substantial share of VRAM. For example, the foll…
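The scale is easy to check with back-of-the-envelope arithmetic; the batch size and sequence length below are illustrative:
```
# Memory for a full logits tensor: batch * seq_len * vocab_size * bytes_per_element
batch, seq_len = 8, 4096
vocab_size = 128_256           # Llama 3 vocabulary
bytes_fp32 = 4

logits_bytes = batch * seq_len * vocab_size * bytes_fp32
print(f"{logits_bytes / 2**30:.1f} GiB")  # ~15.7 GiB just for fp32 logits
```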