XJF2332 / GOT-OCR-2-GUI

A GUI version of GOT-OCR that provides OCR, PDF export, batch processing, and other features, but does not provide training
Apache License 2.0
118 stars, 12 forks

Return-value problem with Render.render(model, tokenizer, image_path, convert_confirm) when processing a PDF #8

Closed. fersity closed this issue 1 month ago

fersity commented 1 month ago

%Run 'pdf2img and Render.py'
['./pdf/孙元琳-2024.pdf']
Loading config...
Importing libraries......
Loading model......
Model loaded successfully
Convert HTML to PDF? (Y/N) y
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.
Setting pad_token_id to eos_token_id:None for open-end generation.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.
The seen_tokens attribute is deprecated and will be removed in v4.41. Use the cache_position model input instead.
G:\programs2023\python311\Lib\site-packages\transformers\models\qwen2\modeling_qwen2.py:623: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\transformers\cuda\sdp_utils.cpp:555.)
  attn_output = torch.nn.functional.scaled_dot_product_attention(
==============rendering===============
String '(C)' has been replaced with '(C\)'.
Traceback (most recent call last):
  File "G:\BaiduNetdiskDownload\GOT-OCR-2-GUI\pdf2img and Render.py", line 87, in <module>
    success,res = Render.render(model, tokenizer, image_path, convert_confirm)
TypeError: cannot unpack non-iterable bool object

Environment: Windows 11, NVIDIA GeForce MX150, Python 3.11, CUDA 11.8, torch 2.4.1+cu118
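
For context, this TypeError is what Python raises when a function returns a single bool and the caller unpacks the result into two names. The sketch below reproduces that message with hypothetical stand-in functions; `render_bare_bool`, `render_tuple`, and `try_unpack` are illustrative names only, and nothing here is taken from the repository's actual `Render.render`.

```python
def render_bare_bool(ok: bool):
    """Hypothetical stand-in: returns only a bool, not a pair."""
    return ok


def render_tuple(ok: bool):
    """Hypothetical stand-in: always returns a (success, result) pair."""
    if not ok:
        return False, None
    return True, "rendered.html"


def try_unpack(func, ok: bool):
    """Unpack the return value into two names and report what happened."""
    try:
        success, res = func(ok)
    except TypeError as exc:
        # A bare bool is not iterable, so two-name unpacking fails here.
        return f"TypeError: {exc}"
    return success, res


if __name__ == "__main__":
    print(try_unpack(render_bare_bool, False))
    print(try_unpack(render_tuple, True))
```

Returning a pair on every code path, including the failure paths, is one way to keep a `success, res = ...` call site working.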

XJF2332 commented 1 month ago

The latest commit should fix this problem. Note, however, that the PDF-processing feature is still fairly primitive. Specifically:

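  1. Before you start, make sure your result and imgs folders are empty and that the pdf folder contains only the one PDF you want to process. Otherwise the script will process every input in one go instead of just the single file you intended.
  2. Each page of the original PDF is saved as a separate PDF (or HTML) file. There is no merge feature yet, so the output will be somewhat messy.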

So be careful not to push it too far when you use it. I still have a lot of work to do on the renderer.
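
The first precaution in the list above can be checked before launching the script. This is a minimal sketch, assuming the three folders are named result, imgs, and pdf and sit directly under the working directory; `check_workspace` is a hypothetical helper and is not part of the repository.

```python
from pathlib import Path


def check_workspace(root="."):
    """Return a list of problems; an empty list means the folders look ready.

    Hypothetical helper: the folder names follow the comment above, and
    placing all three directly under `root` is an assumption.
    """
    root = Path(root)
    problems = []
    # result and imgs should hold nothing left over from earlier runs.
    for name in ("result", "imgs"):
        folder = root / name
        if folder.is_dir() and any(folder.iterdir()):
            problems.append(f"{name} folder is not empty")
    # The pdf folder should hold exactly one PDF to process.
    pdf_dir = root / "pdf"
    pdfs = sorted(pdf_dir.glob("*.pdf")) if pdf_dir.is_dir() else []
    if len(pdfs) != 1:
        problems.append(
            f"pdf folder holds {len(pdfs)} PDF files, expected exactly 1"
        )
    return problems


if __name__ == "__main__":
    for problem in check_workspace():
        print("Warning:", problem)
```

The helper only reports problems and never deletes anything, so clearing old output stays a manual decision.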

fersity commented 1 month ago

OK, thanks.