Question about generation with the CPM2 model

jiayuchennlp commented 3 years ago

Is there an example of using CPM-2 for generation, similar to CPM-1-Generate (https://github.com/TsinghuaAI/CPM-1-Generate)?

t1101675 commented 3 years ago

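Using the CPM-2 model directly for generation does not give good results. You can try the recently released generation-enhanced model, CPM2.1, and we recommend using it together with the efficient inference framework https://github.com/OpenBMB/BMInf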

chenjunqiang commented 3 years ago

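> Using the CPM-2 model directly for generation does not give good results. You can try the recently released generation-enhanced model, CPM2.1, and we recommend using it together with the efficient inference framework https://github.com/OpenBMB/BMInf

Has it been released yet? Where is the link?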

jiayuchennlp commented 3 years ago

Hello, the efficient inference framework currently supports CPM2.1, CPM1, and EVA. Will it later support large models such as CPM2-MOE? @t1101675

t1101675 commented 2 years ago

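> Using the CPM-2 model directly for generation does not give good results. You can try the recently released generation-enhanced model, CPM2.1, and we recommend using it together with the efficient inference framework https://github.com/OpenBMB/BMInf
>
> Has it been released yet? Where is the link?

It can be downloaded here: https://wudaoai.cn/model/detail/CPM%E7%B3%BB%E5%88%97#download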

t1101675 commented 2 years ago

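> Hello, the efficient inference framework currently supports CPM2.1, CPM1, and EVA. Will it later support large models such as CPM2-MOE? @t1101675

This is still being implemented. However, InfMoE, which we released earlier, can already run MoE models on a single GPU. If you need this, you can give it a try: https://github.com/TsinghuaAI/InfMoE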

XiaoqingNLP commented 2 years ago

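> It can be downloaded here: https://wudaoai.cn/model/detail/CPM%E7%B3%BB%E5%88%97#download

I downloaded the model from there, and it fails to load in BMInf, whereas the model from the link provided by BMInf loads without problems. What is the reason for this? @t1101675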

zzy14 commented 2 years ago

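@Qnlp Hi, BMInf does not support loading torch checkpoints; you need to use the models provided by BMInf. If what you downloaded from wudao is the int8 model, it should in principle work. What error is being reported?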
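One quick way to guess whether a downloaded file is a torch checkpoint is to check whether it is a zip archive, since `torch.save` has written zip-based files by default since PyTorch 1.6. The sketch below is only a heuristic under that assumption: the helper name `looks_like_torch_zip_checkpoint` is hypothetical, it is not part of BMInf or PyTorch, and older pickle-only torch checkpoints would not be detected by it.

```python
import zipfile


def looks_like_torch_zip_checkpoint(path):
    """Heuristic: True if `path` is a zip archive.

    Assumption: the checkpoint was written by torch.save in its default
    zip-based format (PyTorch >= 1.6). A True result does not prove the
    file is a torch checkpoint, and a False result does not prove the
    file is a valid BMInf model.
    """
    return zipfile.is_zipfile(path)
```

If this returns `True` for the downloaded file, that would be consistent with the file being a torch checkpoint rather than a model in the format BMInf expects.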

XiaoqingNLP commented 2 years ago

@zzy14

The error is as follows:

Traceback (most recent call last):
  File "/path/to/pretrain/BMInf/examples/generate_cpm2.py", line 53, in <module>
    main()
  File "/path/to/pretrain/BMInf/examples/generate_cpm2.py", line 45, in main
    cpm2_1 = bminf.models.CPM2()
  File "/path/to/pretrain/BMInf/bminf/models/cpm2.py", line 60, in __init__
    super().__init__(config)
  File "/path/to/pretrain/BMInf/bminf/arch/t5/model.py", line 82, in __init__
    self.load( open(model_path, "rb") )
  File "/path/to/pretrain/BMInf/bminf/layers/base.py", line 81, in load
    self._sub_layers[name].load(fp)
  File "/path/to/pretrain/BMInf/bminf/layers/base.py", line 81, in load
    self._sub_layers[name].load(fp)
  File "/path/to/pretrain/BMInf/bminf/layers/base.py", line 81, in load
    self._sub_layers[name].load(fp)
  [Previous line repeated 1 more time]
  File "/path/to/pretrain/BMInf/bminf/layers/base.py", line 75, in load
    name = load_string(fp)
  File "/path/to/pretrain/BMInf/bminf/layers/base.py", line 20, in load_string
    size = struct.unpack("I", fp.read(4))[0]
struct.error: unpack requires a buffer of 4 bytes

zzy14 commented 2 years ago

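This model may be a torch model 🤣. I asked a BMInf developer, whose suggestion is to use BMInf directly; the toolkit will download the model automatically.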
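For context on the `struct.error` in the traceback above: the failing line reads a 4-byte length with `struct.unpack("I", fp.read(4))`, which raises when fewer than 4 bytes are left in the file. The sketch below is not the BMInf source. Only the `struct.unpack` line is taken from the traceback; the UTF-8 payload and the `dump_string` helper are assumptions added for illustration.

```python
import io
import struct


def load_string(fp):
    # Same first step as the line shown in the traceback:
    # read a 4-byte unsigned integer that gives the string length.
    size = struct.unpack("I", fp.read(4))[0]
    # Assumption: the string follows as `size` bytes of UTF-8.
    return fp.read(size).decode("utf-8")


def dump_string(fp, text):
    # Hypothetical writer matching the reader above.
    data = text.encode("utf-8")
    fp.write(struct.pack("I", len(data)))
    fp.write(data)


if __name__ == "__main__":
    buf = io.BytesIO()
    dump_string(buf, "layer_name")
    buf.seek(0)
    print(load_string(buf))  # the well-formed record reads back
    try:
        load_string(buf)     # nothing left: fp.read(4) returns b""
    except struct.error as exc:
        print("struct.error:", exc)
```

Under these assumptions, the error means the loader ran out of bytes while it still expected another length-prefixed name, which fits a file that is truncated or not laid out in the format the loader expects.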

TsinghuaAI / CPM-2-Pretrain

Question about generation with the CPM2 model #16