WenWuZhiDao opened 1 year ago
Is there any code for this? Please share the code.
elif pdf_class_name == 'UnstructuredPaddlePDFLoader':
    doc1 = UnstructuredPaddlePDFLoader(file_path=file).load()
    handled |= len(doc1) > 0
    doc1 = [x for x in doc1 if x.page_content]
Same problem here: some images trigger this error when calling ocr_model_sys.ocr:
Has this been solved?
+1
https://github.com/PaddlePaddle/Paddle/issues/22614
This issue was opened back in 2020 and still exists. Any solution?
I also hit this bug very frequently when running large batches of PaddleOCR predictions: (PreconditionNotMet) Tensor holds no memory. Call Tensor::mutable_data firstly. [Hint: holder should not be null.] (at ../paddle/phi/core/dense_tensor_impl.cc:44)
I have the same problem, but I encountered it when using multiple threads. I think it was caused by a resource conflict, so I used "lock = threading.Lock()" to solve it. You can try wrapping the model call in "with lock"; if you are using async, try "asyncio.Lock()" instead.
ocr = PaddleOCR(lang='en', use_gpu=True)
ocr_lock = threading.Lock()

ocr_lock.acquire()
try:
    ocr_dump = ocr.ocr(npArray)
finally:
    ocr_lock.release()
If the use case is to handle multiple OCR calls simultaneously, just create multiple OCR objects and allow only that many threads to run OCR at the same time.
ocr1 = PaddleOCR(lang='en', use_gpu=True)
ocr2 = PaddleOCR(lang='en', use_gpu=True)
ocr3 = PaddleOCR(lang='en', use_gpu=True)

parallel_thread_counter = 0
parallel_thread_counter_lock = threading.Lock()

ocr_objects = [ocr1, ocr2, ocr3]
ocr_parallel_count = len(ocr_objects)
ocr_semaphore = threading.Semaphore(value=ocr_parallel_count)

with ocr_semaphore:
    current_thread_counter = 0
    try:
        parallel_thread_counter_lock.acquire()
        parallel_thread_counter += 1
        current_thread_counter = parallel_thread_counter % ocr_parallel_count
    finally:
        parallel_thread_counter_lock.release()
    selected_ocr = ocr_objects[current_thread_counter]
    ocr_dump = selected_ocr.ocr(npArray)
The code above accepts at most 3 concurrent requests and round-robins between the 3 OCR objects. This is a workaround rather than a fix for the original problem, but it will at least get things running.. :)
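The counter-plus-semaphore pattern above can also be expressed with a single queue.Queue, which acts as both the semaphore (a thread blocks until an engine is free) and the round-robin dispatcher (engines cycle through the pool). This is a sketch, not the thread's original code: FakeOCR is a hypothetical stand-in so the example is runnable without PaddleOCR; in real use each pool entry would be a PaddleOCR(lang='en', use_gpu=True) instance.

```python
import queue
import threading

# Hypothetical stand-in for PaddleOCR so this sketch runs anywhere;
# substitute PaddleOCR(lang='en', use_gpu=True) in real code.
class FakeOCR:
    def __init__(self, name: str) -> None:
        self.name = name

    def ocr(self, image: str) -> str:
        return f"{self.name}:{image}"

# The queue is both the concurrency limit and the dispatcher:
# get() blocks until one of the 3 engines is free.
ocr_pool: "queue.Queue[FakeOCR]" = queue.Queue()
for i in range(3):
    ocr_pool.put(FakeOCR(f"ocr{i}"))

def ocr_image(image: str) -> str:
    engine = ocr_pool.get()      # blocks while all engines are busy
    try:
        return engine.ocr(image)
    finally:
        ocr_pool.put(engine)     # return the engine to the pool

results: list[str] = []
results_lock = threading.Lock()

def worker(image: str) -> None:
    text = ocr_image(image)
    with results_lock:           # list.append under a lock for clarity
        results.append(text)

threads = [threading.Thread(target=worker, args=(f"img{i}",)) for i in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because each engine is exclusively owned between get() and put(), no extra lock or counter is needed around the OCR call itself.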
Describe the Bug
File "/data/mlops/Open-Assistant/inference/server/oasst_inference_server/plugins/vectors_db/loaders/data_loader.py", line 383, in path_to_doc1
    res = file_to_doc(file, base_path=None, verbose=verbose, fail_any_exception=fail_any_exception,
File "/data/mlops/Open-Assistant/inference/server/oasst_inference_server/plugins/vectors_db/loaders/data_loader.py", line 305, in file_to_doc
    doc1 = UnstructuredPaddlePDFLoader(file_path=file).load()
File "/data/miniconda/envs/oassistant/lib/python3.10/site-packages/langchain/document_loaders/unstructured.py", line 71, in load
    elements = self._get_elements()
File "/data/mlops/Open-Assistant/inference/server/oasst_inference_server/plugins/vectors_db/loaders/pdf_loader.py", line 277, in _get_elements
    txt_file_path = pdf_ocr_txt(self.file_path)
File "/data/mlops/Open-Assistant/inference/server/oasst_inference_server/plugins/vectors_db/loaders/pdf_loader.py", line 257, in pdf_ocr_txt
    result = self.ocr.ocr(img_name)
File "/data/miniconda/envs/oassistant/lib/python3.10/site-packages/paddleocr/paddleocr.py", line 645, in ocr
    dt_boxes, rec_res, _ = self.__call__(img, cls)
File "/data/miniconda/envs/oassistant/lib/python3.10/site-packages/paddleocr/tools/infer/predict_system.py", line 89, in __call__
    img_crop_list, angle_list, elapse = self.text_classifier(
File "/data/miniconda/envs/oassistant/lib/python3.10/site-packages/paddleocr/tools/infer/predict_cls.py", line 112, in __call__
    prob_out = self.output_tensors[0].copy_to_cpu()
RuntimeError:
C++ Traceback (most recent call last):
0   void paddle_infer::Tensor::CopyToCpuImpl(float*, void*, void (*)(void*), void*) const
1   float* phi::DenseTensor::data<float>()
2   void* phi::DenseTensor::data()
3   phi::DenseTensor::check_memory_size() const
4   phi::enforce::EnforceNotMet::EnforceNotMet(phi::ErrorSummary const&, char const*, int)
5   phi::enforce::GetCurrentTraceBackString[abi:cxx11]
Error Message Summary:
PreconditionNotMetError: Tensor holds no memory. Call Tensor::mutable_data firstly. [Hint: holder should not be null.] (at ../paddle/phi/core/dense_tensor_impl.cc:44)
Additional Supplementary Information
Paddle version: python3 -m pip install paddlepaddle-gpu==2.5.1.post116 -f https://www.paddlepaddle.org.cn/whl/linux/cudnnin/stable.html
NVIDIA-SMI 510.60.02 Driver Version: 510.60.02 CUDA Version: 11.6