-
**Is your feature request related to a problem? Please describe.**
The vocabularies generated in this repository need to be generated from and/or validated against external resources. Some of these r…
-
Thank you for this interesting study on vocabulary scaling laws.
I'm curious if you ran any experiments comparing the performance of Llama 2 models with larger vocabularies as predicted by your ap…
-
This new method saves a lot of memory; can you port it to unsloth?
[Cut Your Losses in Large-Vocabulary Language Models](https://arxiv.org/abs/2411.09009)
[https://github.com/apple/ml-cross-ent…
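For context, the memory saving in that paper comes from never materializing the full vocabulary-sized logit row: the gold-token logit and the log-sum-exp denominator are computed on the fly. Below is a pure-Python sketch of that idea for a single token; the function name and the chunked loop are illustrative only (the paper fuses this into a GPU kernel rather than chunking in Python):

```python
import math

def chunked_cross_entropy(hidden, embedding, target, chunk=4):
    """Cross-entropy loss for one token without materializing all |V|
    logits at once: stream the logits over vocabulary chunks, keeping a
    running logsumexp. Hypothetical helper sketching the idea behind
    Cut Cross-Entropy, not the paper's fused kernel."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    target_logit = dot(hidden, embedding[target])
    running = float("-inf")  # running logsumexp over chunks seen so far
    for start in range(0, len(embedding), chunk):
        logits = [dot(hidden, row) for row in embedding[start:start + chunk]]
        m = max(running, max(logits))  # shift for numerical stability
        running = m + math.log(
            math.exp(running - m) + sum(math.exp(z - m) for z in logits)
        )
    return running - target_logit  # = -log softmax(target)
```

The result matches the usual full-softmax cross-entropy, but peak memory per token is O(chunk) logits instead of O(|V|).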
-
@bartvm, @rizar,
I have a few questions about Blocks in order to be able to implement the large-vocabulary models.
How can I access and modify the parameters of the training algorithms (e.g. the runni…
-
I recently ran into a problem where the training algorithm becomes much slower when the vocabulary size gets extremely large. There is a warning from tensorflow saying that "Converting sparse IndexedSlices to…
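That warning means TensorFlow fell back from an `IndexedSlices` gradient (which stores only the embedding rows actually touched in the batch) to a dense gradient the size of the whole matrix, which is exactly what makes steps slow at extreme vocabulary sizes. A back-of-envelope sketch of the difference (which op in the graph triggers the densification varies; gradient clipping by global norm is one common culprit, but that is an assumption about this particular setup):

```python
def dense_grad_bytes(vocab, dim, dtype_bytes=4):
    """Size of one densified gradient for a vocab x dim embedding matrix,
    i.e. what the IndexedSlices-to-dense conversion allocates per step."""
    return vocab * dim * dtype_bytes

def sparse_grad_bytes(rows_touched, dim, dtype_bytes=4):
    """An IndexedSlices gradient stores only the rows used in the batch,
    plus one int64 index per row."""
    return rows_touched * (dim * dtype_bytes + 8)
```

For a 1M-word, 512-dim float32 embedding, the dense gradient is ~2 GB per step, versus ~8 MB sparse for a batch touching 4096 rows, so keeping the gradient path `IndexedSlices`-friendly (or switching to a sampled-softmax-style loss) is usually the fix.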
-
This is perhaps a sub-issue of #12 vocabularies. Uniquely identifying a course in a way that is human-friendly and that end users might be able to understand is a challenge. If we have a large, and perpetuall…
-
First, thanks so much for posting your code!
When counter-fitting GloVe vectors with a much larger vocab (~2 million words), the number of required dot products for computing VSP pairs obviously explodes…
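One standard way to tame that blow-up, if approximate candidate pairs are acceptable, is to compare only words that collide under random-hyperplane LSH instead of scoring all ~2×10¹² pairs. A sketch of that workaround (the function name and parameters are hypothetical, not part of the counter-fitting code):

```python
import random

def lsh_candidate_pairs(vectors, n_planes=8, seed=0):
    """Hash each vector by the sign pattern of a few random projections
    (random-hyperplane LSH) and emit only within-bucket index pairs, so
    dot products are needed only for likely-similar word pairs."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_planes)]
    buckets = {}
    for i, v in enumerate(vectors):
        key = tuple(sum(p * x for p, x in zip(plane, v)) >= 0
                    for plane in planes)
        buckets.setdefault(key, []).append(i)
    pairs = set()
    for members in buckets.values():
        for a in range(len(members)):
            for b in range(a + 1, len(members)):
                pairs.add((members[a], members[b]))
    return pairs
```

With enough planes each bucket holds a tiny fraction of the 2M words, so the work drops from ~V²/2 dot products to the sum of squared bucket sizes; recall can be improved by unioning the pairs from several hash tables with different seeds.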
-
I'm organizing a relay podfic of a large svsss fic, but it has some fic-specific words that are either general vocabulary (that could likely be general) or OCs sort of drawn or adapted from other nove…
-
I've tried to compute word embeddings with a vocabulary size of 6105270 with a dimensionality of 300, resulting in a `NegativeArraySizeException` in `WordEmbeddings.java:100`:
```
weights = new doubl…
```
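Without seeing line 100 of `WordEmbeddings.java`, a classic cause of `NegativeArraySizeException` on an array allocation is the length expression overflowing Java's 32-bit `int` (the exact size expression the library uses is an assumption here). The arithmetic can be checked quickly; note that 6105270 × 300 itself still fits in an `int`, so any wrap must come from an extra factor in the size computation, and even without overflow that many doubles is ~14.6 GB, beyond any default heap:

```python
INT_MAX = 2**31 - 1  # Java Integer.MAX_VALUE

def as_java_int(n):
    """Wrap a Python integer the way Java's 32-bit int arithmetic would."""
    n &= 0xFFFFFFFF
    return n - 2**32 if n > INT_MAX else n

vocab, dim = 6105270, 300
flat = as_java_int(vocab * dim)          # 1_831_581_000: just under INT_MAX
doubled = as_java_int(vocab * dim * 2)   # any factor of 2 wraps negative
```

So a `long`-based size check before allocating (or sharding the matrix) is the usual remedy in Java code hitting this.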
-
Is it possible to add support for xlm-roberta? It's the same architecture as roberta, except for a larger vocabulary since it is multi-lingual.