PaddlePaddle / PaddleNLP

👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search, ❓ Question Answering, ℹ️ Information Extraction, 📄 Document Intelligence, 💌 Sentiment Analysis etc.
https://paddlenlp.readthedocs.io
Apache License 2.0
12.17k stars, 2.95k forks

[Question]: UIE model occasionally raises AttributeError: 'paddle.fluid.core_avx.Tensor' object has no attribute 'numpy' #3802

Closed menghonghan closed 1 year ago

menghonghan commented 2 years ago

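Please ask your question

image paddle==2.3.2 paddlenlp==2.4.1 python==3.7

image

This problem occurs intermittently. When a run succeeds, input_id is a paddle.Tensor; when the error is raised, its type is paddle.fluid.core_avx.Tensor. How can I fix this, and what causes it? Thanks.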

wawltor commented 2 years ago

Is Taskflow being used together with other programs? Or could you share your code?

jerrylsu commented 2 years ago

UIE inference is not thread-safe. Check whether you are running inference from multiple threads.

linjieccc commented 2 years ago

Try adding paddle.disable_static() when you call Taskflow.

menghonghan commented 2 years ago

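How I call it: ie = Taskflow('information_extraction', schema=schema, task_path='/UIE',user_dict=path,device_id=-1). Calling it directly works fine; the error occurs when it is called from Flask with multiple threads. Is there any way to fix this? Thanks.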

menghonghan commented 2 years ago

paddle.disable_static()

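Hi, my application needs to be efficient, and switching to dynamic graph mode makes it much slower. Is there a solution that works in static mode? Thanks.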

linjieccc commented 2 years ago

Hi @menghonghan, Taskflow is not thread-safe. When calling it from Flask, you can add a lock as described in #3760.

menghonghan commented 2 years ago

OK, thanks. After adding the lock this way, is the paddle.disable_static() step still needed?

menghonghan commented 1 year ago

Hi, after following #3760 I still get this error from time to time. My code is as follows:

import paddle
paddle.disable_static()
import paddle.fluid as fluid
paddle.fluid.enable_dygraph()

import traceback
import threading

lock = threading.Lock()

@app.route('/api', methods=['GET', 'POST'])
def plat_api():
    global ocr
    out = None
    if request.method == 'POST':
        try:
            json_data = request.get_data()
            lock.acquire()

            input0 = a_function_contains_UIE(ie)
            lock.release()

            # further processing

        except:
            lock.release()
            return 'Extraction failed'

menghonghan commented 1 year ago

Full code:

import paddle
paddle.disable_static()
import paddle.fluid as fluid
paddle.fluid.enable_dygraph()

import traceback
import threading

ocr = hub.Module(name="chinese_ocr_db_crnn_server", enable_mkldnn=True)

def plat_alonglist_in_api():
    global ocr
    out = None
    if request.method == 'POST':
        try:
            json_data = request.get_data()
            lock.acquire()
            myjson = json.loads(json_data.decode("utf-8"))
            image_id = myjson.get("content")
        except Exception as e:
            print(str(e))
            return "Please provide valid input"

        pic = image_id.split("/")[-1]
        ……
        im = cv2.imdecode(np.array(bytearray(response.read()), dtype=np.uint8), cv2.IMREAD_COLOR)

        # Call OCR
        results = ocr.recognize_text(images=np_images,
                                     use_gpu=False,
                                     #  output_dir="/data/yuyuechun/DeepLearningSlideCaptcha/flask/app/static/ocr_result",
                                     #  visualization=True,
                                     box_thresh=0.3, text_thresh=0.5)

        try:
            input0 = extract_shtxd(results)  # this contains the function that calls UIE
            lock.release()
        except:
            lock.release()
            return 'Extraction failed'

The part that calls UIE:

WeCom screenshot_16697805974928

Thank you very much!

linjieccc commented 1 year ago

Hi @menghonghan,

Try moving lock.acquire() outside the try block. Also, try extracting from text alone, without OCR, to see whether the problem still occurs.
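
One reason for moving `lock.acquire()` out of the `try` can be sketched with the standard library alone. If an exception is raised before the lock is acquired, or after it has already been released, an `except` branch that calls `lock.release()` raises `RuntimeError` because the lock is not held. In the sketch below, `release_unheld`, `run_guarded`, and `work` are hypothetical names, and `work` stands in for the function that calls UIE.

```python
import threading

lock = threading.Lock()


def release_unheld():
    """Return True if releasing a lock that is not held raises RuntimeError."""
    try:
        lock.release()
    except RuntimeError:
        return True
    return False


def run_guarded(work, *args):
    """Acquire before the try block so that the finally clause always
    releases a lock this call actually holds."""
    lock.acquire()
    try:
        return work(*args)
    finally:
        lock.release()
```

With this shape the lock is released exactly once per call, whether `work` returns normally or raises, and the caller still sees the original exception.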

menghonghan commented 1 year ago

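WeCom screenshot_16704964281643

Hi, I ran it directly on my local machine, without deploying it through Flask, and I still get this error: AttributeError: 'paddle.fluid.core_avx.Tensor' object has no attribute 'numpy'. What exactly is the cause? The environment and the installed packages are the GPU builds, but the problem occurs whether the inference machine uses CPU or GPU. Thanks, please reply as soon as possible.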

github-actions[bot] commented 1 year ago

This issue is stale because it has been open for 60 days with no activity.

github-actions[bot] commented 1 year ago

This issue was closed because it has been inactive for 14 days since being marked as stale.

art3miz18 commented 3 months ago

Is there any resolution for this? I am trying to host the pre-trained model and use Taskflow for IE. Inference works the first time, while the model is being loaded into GPU memory.

The subsequent request throws this error:

  File "/usr/local/lib/python3.10/site-packages/paddlenlp/taskflow/taskflow.py", line 822, in __call__
    results = self.task_instance(inputs, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/paddlenlp/taskflow/task.py", line 527, in __call__
    outputs = self._run_model(inputs, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/paddlenlp/taskflow/information_extraction.py", line 1068, in _run_model
    results = self._multi_stage_predict(_inputs)
  File "/usr/local/lib/python3.10/site-packages/paddlenlp/taskflow/information_extraction.py", line 1166, in _multi_stage_predict
    result_list = self._single_stage_predict(examples)
  File "/usr/local/lib/python3.10/site-packages/paddlenlp/taskflow/information_extraction.py", line 975, in _single_stage_predict
    self.input_handles[0].copy_from_cpu(input_ids.numpy())
AttributeError: 'paddle.base.libpaddle.Tensor' object has no attribute 'numpy'

I also tried a global threading.Lock and ran inference after acquiring it, but the result is still the same. @linjieccc