-
Hello. I want to use the method on a Chinese dataset, but the pre-trained model is not doing well. It prefers to generate "Fake" if the input is Chinese and "True" if the input is English.
-
Hello,
I would like to request the addition of the MAP-Neo model to your repository. MAP-Neo is the first high-performance, fully open-source bilingual (Chinese and English) LLM. This model include…
-
Hello! I want to fine-tune your model distiluse-base-multilingual-cased on a Chinese corpus like LCQMC.
So, do Chinese sentences need word segmentation?
-
Hi, thanks for the great work. I have a question about the `acos` task.
I've downloaded the dataset from [here](https://github.com/yangheng95/ABSADatasets) and there's a task called `acos` which I am in…
-
Hi, I was trying to fine-tune the model on an unknown language. The language I was working with was [Pashto](https://en.wikipedia.org/wiki/Pashto). Now I was aware that GOT-OCR was trained for this lan…
-
I replicated the results of VITS and Matcha-TTS on a single-speaker Chinese dataset and found that the timbre similarity of Matcha-TTS is lower than that of VITS, especially in the high-frequency deta…
-
Among these open datasets, I cannot find one that has img -> markdown text information.
And where does the Chinese OCR ability come from? The whole dataset has no Chinese,
-
### Feature request
The main feature request involves a New Trainer Subclass, similar to Seq2SeqTrainer, but suitable for Decoder-Only LM.
### Motivation
`Seq2SeqTrainer` provides a great abstracti…
skpig updated
2 months ago
-
"In practice, to save GPU memory, we do not load all Encoders directly onto the GPU but instead load the extracted features"
Does it mean we don't need a modality encoder, we already have the llama inp…
-
Can dataformer add a feature
to download open-source data from a webpage,
or to synthesize a dataset by itself,
or both?