PaddlePaddle / Paddle

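PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (core framework of PaddlePaddle (飞桨): high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)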
http://www.paddlepaddle.org/
Apache License 2.0

Loading a pretrained model with fluid and printing the parameter weights #45165

Closed yysirs closed 1 year ago

yysirs commented 2 years ago

Please ask your question

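I would like to inspect the weight parameters of the zh_dureader_ce pretrained model in RocketQA. Using get_program_parameter alone only gives me the parameter names, and I could not find an API for reading the weight values in the documentation.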

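import os

from paddle import fluid
from rocketqa.model.ernie import ErnieModel, ErnieConfig

# Note: input_dir and init_pretraining_params are used below but are not
# defined in the posted snippet.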
max_seq_len = 384
num_labels = 2
task_name = '_dureader_cross_encoder'
pyreader_name = 'test_reader'
is_noise = False
ernie_config = ErnieConfig(os.path.join(input_dir, 'zh_config.json'))
place = fluid.CPUPlace()
# dev_count = int(os.environ.get('CPU_NUM', multiprocessing.cpu_count()))
exe = fluid.Executor(place)  # choose whether to run on CPU or GPU
startup_prog = fluid.Program()
test_prog = fluid.Program()
with fluid.program_guard(test_prog, startup_prog):
    with fluid.unique_name.guard():
        pyreader = fluid.layers.py_reader(
            capacity=50,
            shapes=[[-1, max_seq_len, 1], [-1, max_seq_len, 1],
                    [-1, max_seq_len, 1], [-1, max_seq_len, 1],
                    [-1, max_seq_len, 1], [-1, 1], [-1, 1]],
            dtypes=[
                'int64', 'int64', 'int64', 'int64', 'float32', 'int64', 'int64'
            ],
            lod_levels=[0, 0, 0, 0, 0, 0, 0],
            name=task_name + "_" + pyreader_name,
            use_double_buffer=True)

        (src_ids, sent_ids, pos_ids, task_ids, input_mask, labels,
         qids) = fluid.layers.read_file(pyreader)

        ernie = ErnieModel(
            src_ids=src_ids,
            position_ids=pos_ids,
            sentence_ids=sent_ids,
            task_ids=task_ids,
            input_mask=input_mask,
            config=ernie_config,
            is_noise=is_noise)
        cls_feats = ernie.get_pooled_output()

        if not is_noise:
            cls_feats = fluid.layers.dropout(
                x=cls_feats,
                dropout_prob=0.1,
                dropout_implementation="upscale_in_train")

        logits = fluid.layers.fc(
            input=cls_feats,
            size=num_labels,
            param_attr=fluid.ParamAttr(
                name=task_name + "_cls_out_w",
                initializer=fluid.initializer.TruncatedNormal(scale=0.02)),
            bias_attr=fluid.ParamAttr(
                name=task_name + "_cls_out_b",
                initializer=fluid.initializer.Constant(0.)))
        probs = fluid.layers.softmax(logits)
        graph_vars = {
            "probs": probs,
        }

exe.run(startup_prog)  # initialize parameters
init_pretraining_params(exe, os.path.join(input_dir, 'dureader_cross_encoder'), startup_prog)
list_para = fluid.io.get_program_parameter(startup_prog)

paddle-bot[bot] commented 2 years ago

Hi! We have received your issue; please be patient while waiting for a response. We will arrange for technicians to answer your question as soon as possible. Please make sure you have provided a clear problem description, reproduction code, environment and version information, and any error messages. You may also check the API documentation, the FAQ, past GitHub issues, and the AI community for an answer. Have a nice day!

yysirs commented 2 years ago

for each_block in startup_prog.blocks:
    for each_var in list(each_block.vars.values()):
        print(each_var)

Output:
persist var _dureader_cross_encoder_test_reader_reader : READER)
persist trainable param word_embedding : LOD_TENSOR.shape(18000, 768).dtype(float32).stop_gradient(False)
persist trainable param pos_embedding : LOD_TENSOR.shape(513, 768).dtype(float32).stop_gradient(False)
persist trainable param sent_embedding : LOD_TENSOR.shape(2, 768).dtype(float32).stop_gradient(False)
persist trainable param pre_encoder_layer_norm_scale : LOD_TENSOR.shape(768,).dtype(float32).stop_gradient(False)
persist trainable param pre_encoder_layer_norm_bias : LOD_TENSOR.shape(768,).dtype(float32).stop_gradient(False)
persist trainable param encoder_layer_0_multi_head_att_query_fc.w_0 : LOD_TENSOR.shape(768, 768).dtype(float32).stop_gradient(False)
persist trainable param encoder_layer_0_multi_head_att_query_fc.b_0 : LOD_TENSOR.shape(768,).dtype(float32).stop_gradient(False)

Is the output above the weights? It looks different from what I usually see. How can I convert the weights to np.array?

tink2123 commented 2 years ago

You can refer to https://paddlepaddle.org.cn/documentation/docs/zh/faq/train_cn.html#q-numpyfcw to print the tensor value behind a given layer's weight variable.

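Specifically: print(numpy.array(fluid.global_scope().find_var("weight_name").get_tensor())), where "weight_name" is replaced with a parameter name from the output above, such as word_embedding.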
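
For context, the lines printed earlier in the thread are variable descriptions (name, shape, dtype), not the values themselves; the values live in the scope after the startup program and the pretrained parameters have been loaded. A minimal sketch of applying the find_var / get_tensor call to a whole list of parameter names follows. The helper name dump_parameters is hypothetical (it is not a Paddle or RocketQA API), and it assumes only that the scope object offers find_var(name), returning None for unknown names, and that the variable offers get_tensor(), as in the one-liner above.

```python
import numpy as np


def dump_parameters(scope, param_names):
    """Collect parameter values from a scope as {name: np.ndarray}.

    Hypothetical helper, not part of Paddle. `scope` is assumed to behave
    like fluid.global_scope(): find_var(name) returns a variable exposing
    get_tensor(), or None when the name is not present in the scope.
    """
    weights = {}
    for name in param_names:
        var = scope.find_var(name)
        if var is None:
            # The variable was never created or initialized in this scope.
            continue
        # np.array(...) copies the tensor contents into a NumPy array.
        weights[name] = np.array(var.get_tensor())
    return weights


# Possible usage with the script from the first post (requires Paddle's
# fluid API and the loaded model, so it is left commented out here):
#
# names = [p.name for p in fluid.io.get_program_parameter(startup_prog)]
# weights = dump_parameters(fluid.global_scope(), names)
# print(weights["word_embedding"].shape)
```

If this works as intended on the poster's setup, the shape of each array should match the shape in the corresponding printed description, for example (18000, 768) for word_embedding.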

paddle-bot[bot] commented 1 year ago

Since you haven't replied for more than a year, we have closed this issue/PR. If the problem is not solved or there is a follow-up one, please reopen it at any time and we will continue to follow up.