-
Hello, thank you for your hard work.
Does this project also have a zero-shot function for a reference voice?
Thank you.
-
Currently, we only compile for base x86. It would be good to also have instructions from newer ISA extensions like AVX-512 in the corpus. This will require some changes in the data collection methodolo…
-
Hi,
I've always been used to the old `.fit` behaviour, where I could pass in the right DataLoader and implement the Dataset myself according to my needs.
With the new trainer interface, how am I su…
-
Hi
I have a few doubts related to the code:
**Doubt 1**
I downloaded the SST dataset and put it into the directory ```./data/dataset/sst/SST-2/```. This ```SST-2``` folder contains ```dev.tsv```,…
-
- [x] Acquire Source
- [x] Supporting Assets
- [x] Extract Images
- [x] Sort Images
- [x] Create Image Tokens
- [ ] Create Dynamic Tokens
- [x] Create Text Tokens
- [x] Prepare Map Assets
- [ ]…
-
> how often the searched token was found and perhaps what percentage of the total corpus this represents
This may be something that NoSkE already reports.
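
For reference, the two numbers asked for in the quote (hit count and share of the total corpus) can be computed as in the minimal sketch below. It assumes the corpus is already available as a flat list of tokens; the function name `token_share` and the sample sentence are hypothetical and are not part of NoSkE, which may report these figures differently.

```python
from collections import Counter


def token_share(tokens, target):
    """Return (hit count, percentage of all tokens) for `target`."""
    counts = Counter(tokens)
    total = len(tokens)
    hits = counts[target]
    # Guard against an empty corpus to avoid division by zero.
    percent = 100.0 * hits / total if total else 0.0
    return hits, percent


# Hypothetical toy corpus, whitespace-tokenized for illustration only.
corpus = "der Hund und die Katze und der Vogel".split()
hits, percent = token_share(corpus, "und")
print(f"{hits} hits, {percent:.2f}% of corpus")
```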
-
Hello Victor.
First, I would like to thank you for your contribution.
I am trying to retrain your model, but aggregate_paraphrase_corpus_0 is missing.
Could you share the files with me or maybe expla…
-
Since most development that happens using VZCode is tailored towards JavaScript and data visualization, it would be amazing to use RAG (Retrieval-Augmented Generation) based on a pre-defined corpus of…
-
I am following the [getting started](https://epfllm.github.io/Megatron-LLM/guide/getting_started.html) guide with the Mistral-7B model.
- I am able to (1) convert `mistralai/Mistral-7B-v0.1` and (2) …
-
Retraining from a checkpoint works perfectly with on-the-fly tokenization, but breaks when using nanoset: training restarts with a different lr, which is not the same as the one in lr_schedule.pt
We also have…