PaddlePaddle / Paddle

PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the PaddlePaddle core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)
http://www.paddlepaddle.org/
Apache License 2.0

PreconditionNotMetError: Tensor holds no memory. Call Tensor::mutable_data firstly. #56193

Open WenWuZhiDao opened 1 year ago

WenWuZhiDao commented 1 year ago

Describe the Bug

File "/data/mlops/Open-Assistant/inference/server/oasst_inference_server/plugins/vectors_db/loaders/data_loader.py", line 383, in path_to_doc1 res = file_to_doc(file, base_path=None, verbose=verbose, fail_any_exception=fail_any_exception, File "/data/mlops/Open-Assistant/inference/server/oasst_inference_server/plugins/vectors_db/loaders/data_loader.py", line 305, in file_to_doc doc1 = UnstructuredPaddlePDFLoader(file_path=file).load() File "/data/miniconda/envs/oassistant/lib/python3.10/site-packages/langchain/document_loaders/unstructured.py", line 71, in load elements = self._get_elements() File "/data/mlops/Open-Assistant/inference/server/oasst_inference_server/plugins/vectors_db/loaders/pdf_loader.py", line 277, in _get_elements txt_file_path = pdf_ocr_txt(self.file_path) File "/data/mlops/Open-Assistant/inference/server/oasst_inference_server/plugins/vectors_db/loaders/pdf_loader.py", line 257, in pdf_ocr_txt result = self.ocr.ocr(img_name) File "/data/miniconda/envs/oassistant/lib/python3.10/site-packages/paddleocr/paddleocr.py", line 645, in ocr dt_boxes, recres, = self.call(img, cls) File "/data/miniconda/envs/oassistant/lib/python3.10/site-packages/paddleocr/tools/infer/predict_system.py", line 89, in call img_crop_list, angle_list, elapse = self.text_classifier( File "/data/miniconda/envs/oassistant/lib/python3.10/site-packages/paddleocr/tools/infer/predict_cls.py", line 112, in call prob_out = self.output_tensors[0].copy_to_cpu() RuntimeError:


C++ Traceback (most recent call last):

0   void paddle_infer::Tensor::CopyToCpuImpl(float*, void*, void (*)(void*), void*) const
1   float* phi::DenseTensor::data<float>()
2   phi::DenseTensor::data()
3   phi::DenseTensor::check_memory_size() const
4   phi::enforce::EnforceNotMet::EnforceNotMet(phi::ErrorSummary const&, char const*, int)
5   phi::enforce::GetCurrentTraceBackString[abi:cxx11]


Error Message Summary:

PreconditionNotMetError: Tensor holds no memory. Call Tensor::mutable_data firstly. [Hint: holder should not be null.] (at ../paddle/phi/core/dense_tensor_impl.cc:44)

Additional Supplementary Information

Paddle version: python3 -m pip install paddlepaddle-gpu==2.5.1.post116 -f https://www.paddlepaddle.org.cn/whl/linux/cudnnin/stable.html

NVIDIA-SMI 510.60.02 Driver Version: 510.60.02 CUDA Version: 11.6
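
For context, the failure surfaces inside PaddleOCR's angle classifier when the output tensor is copied back to the host. A minimal sketch of the call path that triggers it (the image filename is a placeholder, not taken from the traceback):

from paddleocr import PaddleOCR

# the classification pass (cls=True) is what reaches predict_cls.py,
# where copy_to_cpu() raises the PreconditionNotMetError shown above
ocr = PaddleOCR(use_angle_cls=True, use_gpu=True)
result = ocr.ocr("page_0.png", cls=True)  # placeholder image rendered from the PDF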

w5688414 commented 1 year ago

Do you have any code? Please share the code.

WenWuZhiDao commented 1 year ago

Do you have any code? Please share the code.

elif pdf_class_name == 'UnstructuredPaddlePDFLoader':
    doc1 = UnstructuredPaddlePDFLoader(file_path=file).load()
    handled |= len(doc1) > 0

    # remove empty documents
    doc1 = [x for x in doc1 if x.page_content]
zjmwqx commented 1 year ago

Same problem here: when calling ocr_model_sys.ocr, some images trigger this error. [screenshot attached]

codenoid commented 1 year ago

Has this been solved?

sicklife commented 11 months ago

+1

yangyuke001 commented 11 months ago

+1

div2410 commented 10 months ago

+1

wkk689wkk689 commented 10 months ago

+1

ShubhamZoop commented 8 months ago

https://github.com/PaddlePaddle/Paddle/issues/22614

This issue was opened back in 2020 and still exists. Any solution?

technoxiehu commented 8 months ago

I also hit this bug very frequently when running large batches of PaddleOCR predictions. [screenshot attached]

(PreconditionNotMet) Tensor holds no memory. Call Tensor::mutable_data firstly.
[Hint: holder should not be null.] (at ..\paddle\phi\core\dense_tensor_impl.cc:44)
"suc": false

QuStarry commented 5 months ago

+1

MingsYang commented 5 months ago

+1+1+1+1+1

zheng0116 commented 3 months ago

I have the same problem, but I encountered it when using multiple threads. I think it was caused by resource leaks, so I used a lock (lock = threading.Lock()) to solve it. You can wrap the model call in a "with lock:" block; if you are using async code, try asyncio.Lock() instead.
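
A minimal sketch of the async variant mentioned above (the ocr object and image argument are placeholders, not from this thread): serialize access with asyncio.Lock and hand the blocking ocr() call to a worker thread.

import asyncio

from paddleocr import PaddleOCR

ocr = PaddleOCR(lang='en', use_gpu=True)  # shared predictor, not safe for concurrent use
ocr_lock = asyncio.Lock()                 # lets only one coroutine run inference at a time

async def run_ocr(image):
    async with ocr_lock:
        loop = asyncio.get_running_loop()
        # ocr.ocr() blocks, so run it in the default thread pool executor
        return await loop.run_in_executor(None, ocr.ocr, image)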

sanket-valani commented 1 month ago

I have the same problem, but I encountered it when using multiple threads. I think it was caused by resource leaks, so I used a lock (lock = threading.Lock()) to solve it. You can wrap the model call in a "with lock:" block; if you are using async code, try asyncio.Lock() instead.

import threading

from paddleocr import PaddleOCR

# one shared predictor, guarded by a lock so only one thread runs inference at a time
ocr = PaddleOCR(lang='en', use_gpu=True)
ocr_lock = threading.Lock()

ocr_lock.acquire()
try:
    # npArray is the image (numpy array) to be OCR'd
    ocr_dump = ocr.ocr(npArray)
finally:
    ocr_lock.release()

If the use case is to handle multiple OCR calls simultaneously, create multiple OCR objects and allow only a fixed number of threads to run OCR at the same time.


import threading

from paddleocr import PaddleOCR

# a small pool of independent predictors
ocr1 = PaddleOCR(lang='en', use_gpu=True)
ocr2 = PaddleOCR(lang='en', use_gpu=True)
ocr3 = PaddleOCR(lang='en', use_gpu=True)

parallel_thread_counter = 0
parallel_thread_counter_lock = threading.Lock()

ocr_objects = [ocr1, ocr2, ocr3]
ocr_parallel_count = len(ocr_objects)

# cap concurrency at the number of predictors in the pool
ocr_semaphore = threading.Semaphore(value=ocr_parallel_count)

with ocr_semaphore:
    current_thread_counter = 0
    try:
        # round-robin: pick the next predictor index under the counter lock
        parallel_thread_counter_lock.acquire()
        parallel_thread_counter += 1
        current_thread_counter = parallel_thread_counter % ocr_parallel_count
    finally:
        parallel_thread_counter_lock.release()

    selected_ocr = ocr_objects[current_thread_counter]
    # npArray is the image (numpy array) to be OCR'd
    ocr_dump = selected_ocr.ocr(npArray)

The above code will accept at most 3 concurrent requests for processing and round-robin between the 3 OCR objects.
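
The same pool can also be expressed with a queue.Queue holding the predictors; this is only a sketch of an equivalent design (same assumption of three pre-built PaddleOCR objects, placeholder image argument), and it avoids the counter, the modulo, and the extra lock:

import queue

from paddleocr import PaddleOCR

# build the pool once; queue.Queue is thread-safe, so no extra lock is needed
ocr_pool = queue.Queue()
for _ in range(3):
    ocr_pool.put(PaddleOCR(lang='en', use_gpu=True))

def run_ocr(image):
    # blocks until one of the predictors is free,
    # which also caps concurrency at the pool size
    predictor = ocr_pool.get()
    try:
        return predictor.ocr(image)
    finally:
        ocr_pool.put(predictor)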

This is a workaround rather than a fix for the original problem, but at least it will get things running. :)