Tongjilibo / bert4torch

An elegant PyTorch implementation of transformers
https://bert4torch.readthedocs.io/
MIT License

bert4torch


Documentation | Torch4keras | Examples | build_MiniLLM_from_scratch | bert4vector

Contents

1. Installation

Install the stable release

pip install bert4torch

Install the latest development version

pip install git+https://github.com/Tongjilibo/bert4torch
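After running either install command, it can be useful to confirm which version actually ended up in the environment. The following is a minimal sketch that uses only the Python standard library; the helper name `installed_version` is hypothetical and not part of bert4torch.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # bert4torch and torch4keras are the two packages named in this README.
    for name in ("bert4torch", "torch4keras"):
        print(name, installed_version(name) or "not installed")
```

The reported versions can then be compared against the version history table in section 4.1.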

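2. Features

Feature | bert4torch | transformers | Notes
Training progress bar |  |  | The progress bar prints the loss and any user-defined metrics
Distributed training (DP/DDP) |  |  | Uses PyTorch's built-in DP/DDP
Callbacks |  |  | Logging, TensorBoard, early stopping, wandb, etc.
LLM inference with stream/batch output |  |  | Shared across all models, so no per-model scripts need to be maintained
LLM fine-tuning |  |  | LoRA relies on the peft library; P-Tuning v2 (pv2) is built in
Rich set of tricks |  |  | Adversarial training and other tricks are plug-and-play
Concise, readable code with ample room for customization |  |  | High code reuse; Keras-style training code
Repository maintenance capacity / influence / usage / compatibility |  |  | The repository is currently maintained by one individual
One-command LLM deployment |  |  |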

3. Quick start

3.1 Tutorials

3.2 Deploying an LLM service from the command line

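4. Versions and changelog

4.1 Version history

Date | bert4torch | torch4keras | Release notes
20240814 | 0.5.3 | 0.2.6 | [New features] Added llama3.1 and Yi1.5; downloads automatically choose hfmirror; added support for the command-line entry bert4torch-llm-server
20240801 | 0.5.2 | 0.2.5 | [New features] The chatglm/qwen series support function calling; added the internlm2 series. [Minor improvements] Simplified calling the chat demo in pipeline; elements of generate's end-token setting may be lists; unified the rope_scaling parameter name; added RoPE-derived classes. [Bug fixes] Fixed the flash_attn2 inference bug; fixed the tie_word_embedding bug in bart
20240619 | 0.5.1 | 0.2.4 | Added Qwen1.5, Qwen2 and glm4; added SWA/convert_lm_logits_dtype; adjusted the trainers (mainly DPOTrainer); segment_ids in generation; repetition_penalty must include the query; type-conversion bug in RMSNorm

More versions

4.2 Changelog

More history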

5. Pretrained weights

Category | Model name | Weight source | Weight link / checkpoint_path | config_path
bert | bert-base-chinese | google-bert | bert-base-chinese | bert-base-chinese
bert | chinese_L-12_H-768_A-12 | Google | TF weights, Tongjilibo/bert-chinese_L-12_H-768_A-12 |
bert | chinese-bert-wwm-ext | HFL | hfl/chinese-bert-wwm-ext | chinese-bert-wwm-ext
bert | bert-base-multilingual-cased | google-bert | bert-base-multilingual-cased | bert-base-multilingual-cased
bert | MacBERT | HFL | hfl/chinese-macbert-base, hfl/chinese-macbert-large | chinese-macbert-base, chinese-macbert-large
bert | WoBERT | Zhuiyi Technology | junnyu/wobert_chinese_base, junnyu/wobert_chinese_plus_base | wobert_chinese_base, wobert_chinese_plus_base
roberta | chinese-roberta-wwm-ext | HFL | hfl/chinese-roberta-wwm-ext, hfl/chinese-roberta-wwm-ext-large (the MLM weights of the large model are randomly initialized) | chinese-roberta-wwm-ext, chinese-roberta-wwm-ext-large
roberta | roberta-small/tiny | Zhuiyi Technology | Tongjilibo/chinese_roberta_L-4_H-312_A-12, Tongjilibo/chinese_roberta_L-6_H-384_A-12 |
roberta | roberta-base | FacebookAI | roberta-base | roberta-base
roberta | guwenbert | ethanyt | ethanyt/guwenbert-base | guwenbert-base
albert | albert_zh, albert_pytorch | brightmart | voidful/albert_chinese_tiny, voidful/albert_chinese_small, voidful/albert_chinese_base, voidful/albert_chinese_large, voidful/albert_chinese_xlarge, voidful/albert_chinese_xxlarge | albert_chinese_tiny, albert_chinese_small, albert_chinese_base, albert_chinese_large, albert_chinese_xlarge, albert_chinese_xxlarge
nezha | NEZHA, NeZha_Chinese_PyTorch | huawei_noah | sijunhe/nezha-cn-base, sijunhe/nezha-cn-large, sijunhe/nezha-base-wwm, sijunhe/nezha-large-wwm | nezha-cn-base, nezha-cn-large, nezha-base-wwm, nezha-large-wwm
nezha | nezha_gpt_dialog | bojone | Tongjilibo/nezha_gpt_dialog |
xlnet | Chinese-XLNet | HFL | hfl/chinese-xlnet-base | chinese-xlnet-base
xlnet | transformer_xl | huggingface | transfo-xl/transfo-xl-wt103 | transfo-xl-wt103
deberta | Erlangshen-DeBERTa-v2 | IDEA | IDEA-CCNL/Erlangshen-DeBERTa-v2-97M-Chinese, IDEA-CCNL/Erlangshen-DeBERTa-v2-320M-Chinese, IDEA-CCNL/Erlangshen-DeBERTa-v2-710M-Chinese | Erlangshen-DeBERTa-v2-97M-Chinese, Erlangshen-DeBERTa-v2-320M-Chinese, Erlangshen-DeBERTa-v2-710M-Chinese
electra | Chinese-ELECTRA | HFL | hfl/chinese-electra-base-discriminator | chinese-electra-base-discriminator
ernie | ernie | Baidu Wenxin | nghuyong/ernie-1.0-base-zh, nghuyong/ernie-3.0-base-zh | ernie-1.0-base-zh, ernie-3.0-base-zh
roformer | roformer | Zhuiyi Technology | junnyu/roformer_chinese_base | roformer_chinese_base
roformer | roformer_v2 | Zhuiyi Technology | junnyu/roformer_v2_chinese_char_base | roformer_v2_chinese_char_base
simbert | simbert | Zhuiyi Technology | Tongjilibo/simbert-chinese-base, Tongjilibo/simbert-chinese-small, Tongjilibo/simbert-chinese-tiny |
simbert | simbert_v2/roformer-sim | Zhuiyi Technology | junnyu/roformer_chinese_sim_char_base, junnyu/roformer_chinese_sim_char_ft_base, junnyu/roformer_chinese_sim_char_small, junnyu/roformer_chinese_sim_char_ft_small | roformer_chinese_sim_char_base, roformer_chinese_sim_char_ft_base, roformer_chinese_sim_char_small, roformer_chinese_sim_char_ft_small
gau | GAU-alpha | Zhuiyi Technology | Tongjilibo/chinese_GAU-alpha-char_L-24_H-768 |
uie | uie, uie_pytorch | Baidu | Tongjilibo/uie-base |
gpt | CDial-GPT | thu-coai | thu-coai/CDial-GPT_LCCC-base, thu-coai/CDial-GPT_LCCC-large | CDial-GPT_LCCC-base, CDial-GPT_LCCC-large
gpt | cpm_lm (2.6 billion parameters) | Tsinghua | TsinghuaAI/CPM-Generate | CPM-Generate
gpt | nezha_gen | huawei_noah | Tongjilibo/chinese_nezha_gpt_L-12_H-768_A-12 |
gpt | gpt2-chinese-cluecorpussmall | UER | uer/gpt2-chinese-cluecorpussmall | gpt2-chinese-cluecorpussmall
gpt | gpt2-ml | imcaspar | torch, BaiduYun (84dh) | gpt2-ml_15g_corpus, gpt2-ml_30g_corpus
bart | bart_base_chinese | Fudan fnlp | fnlp/bart-base-chinese, v1.0 | bart-base-chinese, bart-base-chinese-v1.0
t5 | t5 | UER | uer/t5-small-chinese-cluecorpussmall, uer/t5-base-chinese-cluecorpussmall | t5-base-chinese-cluecorpussmall, t5-small-chinese-cluecorpussmall
t5 | mt5 | Google | google/mt5-base | mt5-base
t5 | t5_pegasus | Zhuiyi Technology | Tongjilibo/chinese_t5_pegasus_small, Tongjilibo/chinese_t5_pegasus_base |
t5 | chatyuan | clue-ai | ClueAI/ChatYuan-large-v1, ClueAI/ChatYuan-large-v2 | ChatYuan-large-v1, ChatYuan-large-v2
t5 | PromptCLUE | clue-ai | ClueAI/PromptCLUE-base | PromptCLUE-base
chatglm | chatglm-6b | THUDM | THUDM/chatglm-6b, THUDM/chatglm-6b-int8, THUDM/chatglm-6b-int4, v0.1.0 | chatglm-6b, chatglm-6b-int8, chatglm-6b-int4, chatglm-6b-v0.1.0
chatglm | chatglm2-6b | THUDM | THUDM/chatglm2-6b, THUDM/chatglm2-6b-int4, THUDM/chatglm2-6b-32k | chatglm2-6b, chatglm2-6b-int4, chatglm2-6b-32k
chatglm | chatglm3-6b | THUDM | THUDM/chatglm3-6b, THUDM/chatglm3-6b-32k | chatglm3-6b, chatglm3-6b-32k
chatglm | glm4-9b | THUDM | THUDM/glm-4-9b, THUDM/glm-4-9b-chat, THUDM/glm-4-9b-chat-1m | glm-4-9b, glm-4-9b-chat, glm-4-9b-chat-1m
llama | llama | meta |  | llama-7b, llama-13b
llama | llama-2 | meta | meta-llama/Llama-2-7b-hf, meta-llama/Llama-2-7b-chat-hf, meta-llama/Llama-2-13b-hf, meta-llama/Llama-2-13b-chat-hf | Llama-2-7b-hf, Llama-2-7b-chat-hf, Llama-2-13b-hf, Llama-2-13b-chat-hf
llama | llama-3 | meta | meta-llama/Meta-Llama-3-8B, meta-llama/Meta-Llama-3-8B-Instruct | Meta-Llama-3-8B, Meta-Llama-3-8B-Instruct
llama | llama-3.1 | meta | meta-llama/Meta-Llama-3.1-8B, meta-llama/Meta-Llama-3.1-8B-Instruct | Meta-Llama-3.1-8B, Meta-Llama-3.1-8B-Instruct
llama | Chinese-LLaMA-Alpaca | HFL |  | chinese_alpaca_plus_7b, chinese_llama_plus_7b
llama | Chinese-LLaMA-Alpaca-2 | HFL | To be added |
llama | Chinese-LLaMA-Alpaca-3 | HFL | To be added |
llama | Belle_llama | LianjiaTech | BelleGroup/BELLE-LLaMA-7B-2M-enc | Merge instructions, BELLE-LLaMA-7B-2M-enc
llama | Ziya | IDEA-CCNL | IDEA-CCNL/Ziya-LLaMA-13B-v1, IDEA-CCNL/Ziya-LLaMA-13B-v1.1, IDEA-CCNL/Ziya-LLaMA-13B-Pretrain-v1 | Ziya-LLaMA-13B-v1, Ziya-LLaMA-13B-v1.1
llama | Baichuan | baichuan-inc | baichuan-inc/Baichuan-7B, baichuan-inc/Baichuan-13B-Base, baichuan-inc/Baichuan-13B-Chat | Baichuan-7B, Baichuan-13B-Base, Baichuan-13B-Chat
llama | Baichuan2 | baichuan-inc | baichuan-inc/Baichuan2-7B-Base, baichuan-inc/Baichuan2-7B-Chat, baichuan-inc/Baichuan2-13B-Base, baichuan-inc/Baichuan2-13B-Chat | Baichuan2-7B-Base, Baichuan2-7B-Chat, Baichuan2-13B-Base, Baichuan2-13B-Chat
llama | vicuna | lmsys | lmsys/vicuna-7b-v1.5 | vicuna-7b-v1.5
llama | Yi | 01-ai | 01-ai/Yi-6B, 01-ai/Yi-6B-200K, 01-ai/Yi-9B, 01-ai/Yi-9B-200K | Yi-6B, Yi-6B-200K, Yi-9B, Yi-9B-200K
llama | Yi-1.5 | 01-ai | 01-ai/Yi-1.5-6B, 01-ai/Yi-1.5-6B-Chat, 01-ai/Yi-1.5-9B, 01-ai/Yi-1.5-9B-32K, 01-ai/Yi-1.5-9B-Chat, 01-ai/Yi-1.5-9B-Chat-16K | Yi-1.5-6B, Yi-1.5-6B-Chat, Yi-1.5-9B, Yi-1.5-9B-32K, Yi-1.5-9B-Chat, Yi-1.5-9B-Chat-16K
bloom | bloom | bigscience | bigscience/bloom-560m, bigscience/bloomz-560m | bloom-560m, bloomz-560m
Qwen | Qwen | Alibaba Cloud | Qwen/Qwen-1_8B, Qwen/Qwen-1_8B-Chat, Qwen/Qwen-7B, Qwen/Qwen-7B-Chat, Qwen/Qwen-14B, Qwen/Qwen-14B-Chat | Qwen-1_8B, Qwen-1_8B-Chat, Qwen-7B, Qwen-7B-Chat, Qwen-14B, Qwen-14B-Chat
Qwen | Qwen1.5 | Alibaba Cloud | Qwen/Qwen1.5-0.5B, Qwen/Qwen1.5-0.5B-Chat, Qwen/Qwen1.5-1.8B, Qwen/Qwen1.5-1.8B-Chat, Qwen/Qwen1.5-7B, Qwen/Qwen1.5-7B-Chat, Qwen/Qwen1.5-14B, Qwen/Qwen1.5-14B-Chat | Qwen1.5-0.5B, Qwen1.5-0.5B-Chat, Qwen1.5-1.8B, Qwen1.5-1.8B-Chat, Qwen1.5-7B, Qwen1.5-7B-Chat, Qwen1.5-14B, Qwen1.5-14B-Chat
Qwen | Qwen2 | Alibaba Cloud | Qwen/Qwen2-0.5B, Qwen/Qwen2-0.5B-Instruct, Qwen/Qwen2-1.5B, Qwen/Qwen2-1.5B-Instruct, Qwen/Qwen2-7B, Qwen/Qwen2-7B-Instruct | Qwen2-0.5B, Qwen2-0.5B-Instruct, Qwen2-1.5B, Qwen2-1.5B-Instruct, Qwen2-7B, Qwen2-7B-Instruct
InternLM | InternLM | Shanghai AI Laboratory | internlm/internlm-chat-7b, internlm/internlm-7b | internlm-7b, internlm-chat-7b
InternLM | InternLM2 | Shanghai AI Laboratory | internlm/internlm2-1_8b, internlm/internlm2-chat-1_8b, internlm/internlm2-7b, internlm/internlm2-chat-7b, internlm/internlm2-20b, internlm/internlm2-chat-20b | internlm2-1_8b, internlm2-chat-1_8b, internlm2-7b, internlm2-chat-7b
InternLM | InternLM2.5 | Shanghai AI Laboratory | internlm/internlm2_5-7b, internlm/internlm2_5-7b-chat, internlm/internlm2_5-7b-chat-1m | internlm2_5-7b, internlm2_5-7b-chat, internlm2_5-7b-chat-1m
Falcon | Falcon | tiiuae | tiiuae/falcon-rw-1b, tiiuae/falcon-7b, tiiuae/falcon-7b-instruct | falcon-rw-1b, falcon-7b, falcon-7b-instruct
DeepSeek | DeepSeek-MoE | High-Flyer Quant | deepseek-ai/deepseek-moe-16b-base, deepseek-ai/deepseek-moe-16b-chat | deepseek-moe-16b-base, deepseek-moe-16b-chat
DeepSeek | DeepSeek-LLM | High-Flyer Quant | deepseek-ai/deepseek-llm-7b-base, deepseek-ai/deepseek-llm-7b-chat | deepseek-llm-7b-base, deepseek-llm-7b-chat
DeepSeek | DeepSeek-V2 | High-Flyer Quant | deepseek-ai/DeepSeek-V2-Lite, deepseek-ai/DeepSeek-V2-Lite-Chat |
DeepSeek | DeepSeek-Coder | High-Flyer Quant | To be added |
DeepSeek | DeepSeek-Coder-V2 | High-Flyer Quant | To be added |
MiniCPM | MiniCPM | OpenBMB | openbmb/MiniCPM-2B-sft-bf16, openbmb/MiniCPM-2B-dpo-bf16, openbmb/MiniCPM-2B-128k, openbmb/MiniCPM-1B-sft-bf16 | MiniCPM-2B-sft-bf16, MiniCPM-2B-dpo-bf16, MiniCPM-2B-128k, MiniCPM-1B-sft-bf16
MiniCPM | MiniCPM-V | OpenBMB | To be added |
embedding | text2vec-base-chinese | shibing624 | shibing624/text2vec-base-chinese | text2vec-base-chinese
embedding | m3e | moka-ai | moka-ai/m3e-base | m3e-base
embedding | bge | BAAI | BAAI/bge-large-en-v1.5, BAAI/bge-large-zh-v1.5, BAAI/bge-base-en-v1.5, BAAI/bge-base-zh-v1.5, BAAI/bge-small-en-v1.5, BAAI/bge-small-zh-v1.5 | bge-large-en-v1.5, bge-large-zh-v1.5, bge-base-en-v1.5, bge-base-zh-v1.5, bge-small-en-v1.5, bge-small-zh-v1.5
embedding | gte | thenlper | thenlper/gte-large-zh, thenlper/gte-base-zh | gte-base-zh, gte-large-zh

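*Notes:

  1. Names shown in highlighted format (such as bert-base-chinese) can be passed directly to build_transformer_model(), which downloads them over the network
  2. To speed up downloads, use a mirror site in mainland China
    • HF_ENDPOINT=https://hf-mirror.com python your_script.py
    • Run export HF_ENDPOINT=https://hf-mirror.com and then execute the Python code
    • Set it at the top of the Python code as follows
      import os
      os.environ['HF_ENDPOINT'] = "https://hf-mirror.com"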
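The two notes can be combined into a small script. This is a sketch under stated assumptions rather than the project's documented usage: the helper names `use_hf_mirror` and `load_pretrained` are hypothetical, and passing the table's checkpoint_path and config_path values as the keyword arguments of the same names to build_transformer_model() is inferred from the table headers, so check the documentation for the exact call. The environment variable is set before bert4torch is imported because download libraries typically read it at import time.

```python
import os

HF_MIRROR = "https://hf-mirror.com"  # mirror URL taken from the notes above


def use_hf_mirror(endpoint: str = HF_MIRROR) -> str:
    """Set HF_ENDPOINT to the mirror unless the user has already set it."""
    os.environ.setdefault("HF_ENDPOINT", endpoint)
    return os.environ["HF_ENDPOINT"]


def load_pretrained(checkpoint_path: str = "bert-base-chinese",
                    config_path: str = "bert-base-chinese"):
    """Hypothetical helper: download and build a model listed in the table.

    Requires bert4torch and network access. The keyword names mirror the
    table columns; treating table entries as valid values is an assumption.
    """
    use_hf_mirror()  # configure the endpoint before importing bert4torch
    from bert4torch.models import build_transformer_model
    return build_transformer_model(config_path=config_path,
                                   checkpoint_path=checkpoint_path)


if __name__ == "__main__":
    print("Downloading via", use_hf_mirror())
```

Using `setdefault` keeps any HF_ENDPOINT value that was already exported in the shell, so the script and the command-line variants in the list do not override each other.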

6. Acknowledgements

7. Citation

@misc{bert4torch,
  title={bert4torch},
  author={Bo Li},
  year={2022},
  howpublished={\url{https://github.com/Tongjilibo/bert4torch}},
}

8. Miscellaneous

[Images: WeChat ID, WeChat group, Star History Chart]