-
I wonder what the vocab size of spiece.model is (it seems to be 32k)? I am trying to improve this part; could anyone share the vocab size of spiece.model?
Besides, what is the size of the data that was used to train spi…
-
I trained on a Huxiu news corpus (about 110 MB) using the GTX 1070 (8 GB VRAM) in my laptop, and it runs fine; the model is saved at each step, with an interval of about 5 minutes between saves. I'd like to see everyone's hardware and training speed, so feel free to leave a comment below to discuss. My parameter settings are as follows:
{
"architectures": [
"GPT2LMHeadModel"
],
"attn_pdrop": 0.1,
"bos_token_id": 0,
"do…
-
### Description
I was unable to use my own dataset. I was trying to use a dataset containing English-Hindi translation data. I updated translation_ende.py, text_problems.py, and generate_utils.py. I trie…
-
Hello, I ran your code and it reported the error shown in the screenshot above. I modified the export path, but it still seems the pretrained model was not loaded. Do you know the reason?
-
And can this model be helpful on a Chinese dataset?
-
Thank you for your code! When I reproduce the stage-1 training, I find that the ITM loss does not converge. Is this normal, or is there any trick? (Note: I replaced BERT with an XLM-R model.)
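For reference, a typical ITM objective is just binary cross-entropy over matched versus mismatched image-text pairs, computed on the fused representation. The sketch below is my own illustration of that usual shape, not this repository's actual code; the class name, hidden size, and batch are hypothetical:

```python
import torch
import torch.nn as nn

# Illustrative ITM (image-text matching) head: a 2-way classifier
# on the fused [CLS] feature. All names/shapes are assumptions.
class ITMHead(nn.Module):
    def __init__(self, hidden_size: int = 768):
        super().__init__()
        self.fc = nn.Linear(hidden_size, 2)  # class 1 = matched pair, 0 = mismatched

    def forward(self, fused_cls: torch.Tensor) -> torch.Tensor:
        return self.fc(fused_cls)  # logits over {mismatch, match}

head = ITMHead()
fused = torch.randn(4, 768)          # fused features for 4 image-text pairs
labels = torch.tensor([1, 0, 1, 0])  # half matched, half negatives
itm_loss = nn.functional.cross_entropy(head(fused), labels)
```

One sanity check: with uninformative features the loss should hover around ln 2 ≈ 0.69, so a loss stuck near that value usually means the head is getting no usable signal from the fused features rather than that the loss itself is miscomputed.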
-
-
I downloaded the [Linly-Chinese-LLaMA-7b-hf] model files and ran merge.sh, which produced the error below. Is there something wrong with the model files, or is it something else? Thanks!
bash scripts/merge.sh
===================================BUG REPORT===================================
Welcome to bitsa…
-
MRBS version 1.7.1: the calendar shows garbled characters. PHP and MySQL have both been set to UTF-8, but the problem persists.
Reported by: *anonymous
Original Ticket: [mrbs/support-requests/1759](https://sourceforge.n…
-
1 models--meta-llama--Llama-2-13b-chat-hf/snapshots/0ba94ac9b9e1d5a0037780667e8b219adde1908c/config.json
Download link: https://huggingface.co/meta-llama/Llama-2-13b-chat-hf
```json
{
"_name_or_path": nu…