-
If there are Korean characters in the path of the tokenizer model when it's loaded like this:
`sentencepiece::SentencePieceProcessor tokenProcessor;`
`tokenProcessor.load(pathtomodel);`
Any help…
-
Hello everyone! I found that Llama models like `beomi/llama-2-ko-7b` give junk output like `\n[/INST]\n\n[/INST]...`. I tried multiple Korean Llama 2 models and got similar junk results.…
-
Hi,
I tried pretraining v5 on this data
(https://huggingface.co/datasets/eaglewatch/Korean_Wikipedia_Dataset_for_GPT2_August_2022)
And I am using this script.
```
python train.py --data_file /wor…
-
I'm trying to calculate the BLEU score for a low-resource language, so I'm using a tokenizer that I trained myself. Is there a way to pass the tokenizer as a parameter?
For now, when I pass the …
-
### Context and Issue
I'm attempting to quantize the model [alpindale/goliath-120b](https://huggingface.co/alpindale/goliath-120b) using [royallab/PIPPA-cleaned](https://huggingface.co/datasets/royal…
-
Hey guys,
I ran into an issue with OpenSSL. I've already installed 2.7.2 successfully, but no luck with 2.6.3. `libssl-dev` is already installed.
```
snake@mothership:~$ uname -a
Linux mothershi…
-
I am trying to train the conformer-ctc model with the Ksponspeech dataset, which is a Korean speech dataset.
Ksponspeech - 1000hours / 123GB / 630000 pcm audio files ( fs=16000 / sample_width = 2…
-
### System Info
```shell
- `transformers` version: 4.17.0.dev0
- Platform: Linux-4.15.0-176-generic-x86_64-with-glibc2.17
- Python version: 3.8.13
- PyTorch version (GPU?): 1.8.2 (True)
- Tens…
-
Hi. I'm trying to convert the 'kfkas/Llama-2-ko-7b-Chat' model, which I downloaded from Hugging Face, into a gguf file on Windows 11.
I tried to convert it with the command below.
C:\AI\llama.cpp>python con…
-
### Describe the bug
There is a tutorial for a Korean version of coqui_tts (not written by coqui-ai).
Tutorial link: https://colab.research.google.com/drive/1hv37sT7Pq-qKZe9Ihbbp5XZ-A9tsURli?usp=sharin…