PaddlePaddle / Paddle

PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the "飞桨" core framework: high-performance single-machine and distributed training for deep learning & machine learning, with cross-platform deployment)
http://www.paddlepaddle.org/
Apache License 2.0

Error running an ERNIE MRC model prediction task on Ubuntu 18.04 after upgrading from Paddle 1.8 + CUDA 10.1 to 2.0 + CUDA 11.0 #29743

Closed 1973sir closed 3 years ago

1973sir commented 3 years ago

This ran fine under Paddle 1.8 + CUDA 10.1. After upgrading to 2.0rc + CUDA 11.0 (to support an RTX 3080 GPU), running it fails with:

```
Traceback (most recent call last):
  File "knowledge_core/test.py", line 26, in <module>
    mrc.main(args)
  File "/home/lenny/share/tnhb/run_mrc.py", line 73, in main
    is_training=False)
  File "/home/lenny/share/tnhb/finetune/mrc.py", line 49, in create_model
    use_double_buffer=True)
  File "/home/lenny/anaconda3/envs/t18p20/lib/python3.7/site-packages/paddle/fluid/layers/io.py", line 724, in py_reader
    use_double_buffer=use_double_buffer)
  File "/home/lenny/anaconda3/envs/t18p20/lib/python3.7/site-packages/paddle/fluid/layers/io.py", line 452, in _py_reader
    'ranks': ranks
  File "/home/lenny/anaconda3/envs/t18p20/lib/python3.7/site-packages/paddle/fluid/framework.py", line 2837, in append_op
    kwargs.get("stop_gradient", False))
  File "/home/lenny/anaconda3/envs/t18p20/lib/python3.7/site-packages/paddle/fluid/dygraph/tracer.py", line 45, in trace_op
    not stop_gradient)
ValueError: (InvalidArgument) Python object is not type of St10shared_ptrIN6paddle10imperative7VarBaseEE (at /paddle/paddle/fluid/pybind/imperative.cc:220)
  [Hint: If you need C++ stacktraces for debugging, please set FLAGS_call_stack_level=2.]
```

What could be causing this? Thanks!

paddle-bot-old[bot] commented 3 years ago

Hi! We've received your issue; please be patient while we arrange for a technician to answer it as soon as possible. Please double-check that you have provided a clear problem description, reproduction code, environment & version details, and the error message. You may also look for an answer in the official API docs, the FAQ, historical GitHub issues, or the AI community. Have a nice day!

hutuxian commented 3 years ago

Could you paste the code around the failing lines?

1973sir commented 3 years ago

> Could you paste the code around the failing lines?

test.py:

```python
import sys
sys.path.append('/home/lenny/share/tnhb')
import run_mrc as mrc
import knowledge_core.info_driver as driver
import paddle.fluid as fluid
from finetune_args import parser
import os


def p_probability(mrc_result):
    if not len(mrc_result):
        return
    for _ in range(min(len(mrc_result[1]), 5)):
        print(_)
        # print(mrc_result[1][_])
        print('reply{:}:{},probability:{:.3f}'.format(
            _, mrc_result[1][_]['text'], mrc_result[1][_]['probability']))
    return


if __name__ == '__main__':
    os.chdir(os.path.abspath(os.path.dirname(__file__)).split('knowledge_core')[0])
    args = parser.parse_args()
    p = driver.BfOrGf('/home/lenny/share/tnhb/knowledge_core/background/gf.json')
    scope = fluid.core.Scope()
    back_info = p.back_info + '回答是(没有答案)。观点选项(是的),(不是),(不一定)。'

    back_info = '今天我去图书馆借书来看。 位于馆内角落的书架是小说类别的,我对小说很感兴趣因此便去那选书。 当我出抽第一本书时,书页却零落的掉了下来。 此时我与书架对面的人在我抽出书本时眼神交会了一会。 我继续在书架前花了约10钟左右选出3本看起来挺有趣的书,拿到柜台请管理员办理借书手续。在回家的路上我心理想着应该也把第一本书借回来的,但是书变成那样也没办法了..... 今天就借这些书回家吧回答是(没有答案)。观点选项(是的),(不是),(不一定)。'

    print(back_info)
    with fluid.scope_guard(scope):
        mrc.main(args)
        while True:
            question = input('question:')

            mrc_result = mrc.abc(context=back_info, question=question)
            # print(mrc_result[0]['text'])

            p_probability(mrc_result)
```
hutuxian commented 3 years ago

Hmm, the error is actually raised at line 49 of mrc.py. By the way, are you using the static graph or the dynamic graph mode?

1973sir commented 3 years ago

> Hmm, the error is actually raised at line 49 of mrc.py. By the way, are you using the static graph or the dynamic graph mode?

I believe it is the static graph. Line 49, where the error is raised, is `use_double_buffer=True)`. The full function is:

```python
def create_model(args, pyreader_name, ernie_config, is_training):
    pyreader = fluid.layers.py_reader(
        capacity=50,
        shapes=[[-1, args.max_seq_len, 1], [-1, args.max_seq_len, 1],
                [-1, args.max_seq_len, 1], [-1, args.max_seq_len, 1],
                [-1, args.max_seq_len, 1], [-1, 1], [-1, 1], [-1, 1]],
        dtypes=['int64', 'int64', 'int64', 'int64', 'float32',
                'int64', 'int64', 'int64'],
        lod_levels=[0, 0, 0, 0, 0, 0, 0, 0],
        name=pyreader_name,
        use_double_buffer=True)  # line 49, where the error is raised
    (src_ids, sent_ids, pos_ids, task_ids, input_mask, start_positions,
     end_positions, unique_id) = fluid.layers.read_file(pyreader)

    ernie = ErnieModel(
        src_ids=src_ids,
        position_ids=pos_ids,
        sentence_ids=sent_ids,
        task_ids=task_ids,
        input_mask=input_mask,
        config=ernie_config,
        use_fp16=args.use_fp16)

    enc_out = ernie.get_sequence_output()
    enc_out = fluid.layers.dropout(
        x=enc_out, dropout_prob=0.1, dropout_implementation="upscale_in_train")

    logits = fluid.layers.fc(
        input=enc_out,
        size=2,
        num_flatten_dims=2,
        param_attr=fluid.ParamAttr(
            name="cls_mrc_out_w",
            initializer=fluid.initializer.TruncatedNormal(scale=0.02)),
        bias_attr=fluid.ParamAttr(
            name="cls_mrc_out_b", initializer=fluid.initializer.Constant(0.)))

    logits = fluid.layers.transpose(x=logits, perm=[2, 0, 1])
    start_logits, end_logits = fluid.layers.unstack(x=logits, axis=0)

    batch_ones = fluid.layers.fill_constant_batch_size_like(
        input=start_logits, dtype='int64', shape=[1], value=1)
    num_seqs = fluid.layers.reduce_sum(input=batch_ones)

    def compute_loss(logits, positions):
        loss = fluid.layers.softmax_with_cross_entropy(
            logits=logits, label=positions)
        loss = fluid.layers.mean(x=loss)
        return loss

    start_loss = compute_loss(start_logits, start_positions)
    end_loss = compute_loss(end_logits, end_positions)
    loss = (start_loss + end_loss) / 2.0
    if args.use_fp16 and args.loss_scaling > 1.0:
        loss *= args.loss_scaling

    graph_vars = {
        "loss": loss,
        "num_seqs": num_seqs,
        "unique_id": unique_id,
        "start_logits": start_logits,
        "end_logits": end_logits
    }

    for k, v in graph_vars.items():
        v.persistable = True

    return pyreader, graph_vars
```
1973sir commented 3 years ago

> Hmm, the error is actually raised at line 49 of mrc.py. By the way, are you using the static graph or the dynamic graph mode?

Hi, I've posted the requested information above; could you please take a look soon? We are blocked on this and cannot continue development. Thanks!

hutuxian commented 3 years ago

The new Paddle version defaults to dynamic graph mode, so for static-graph code you need to add `paddle.enable_static()` at the very top of your script. Please give it a try.

1973sir commented 3 years ago

> The new Paddle version defaults to dynamic graph mode, so for static-graph code you need to add `paddle.enable_static()` at the very top of your script. Please give it a try.

That fixed it! Many thanks from a newbie!

paddle-bot-old[bot] commented 3 years ago

Are you satisfied with the resolution of your issue?
