XJF2332 / GOT-OCR-2-GUI

A GUI version of GOT-OCR. It provides OCR, PDF export, batch processing, and other features, but no training functionality.
Apache License 2.0
118 stars 12 forks

GOT-OCR-2-GUI installation notes (I installed it again on another of my computers; I hope the author can refine this further) #5

Closed 602387193c closed 1 month ago

602387193c commented 2 months ago

GOT-OCR-2-GUI installation notes

Environment setup

  1. Create and activate a Conda environment:

    conda create -n gotgui python=3.10
    conda activate gotgui
  2. Change to the project directory:

    cd C:\AI\GOT-OCR-2-GUI
  3. Install the dependencies:

    pip install -r requirements-noversion.txt
    conda install pytorch torchvision torchaudio pytorch-cuda=12.4 -c pytorch -c nvidia

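    Note: requirements-noversion.txt worked without problems this time as well; perhaps the author could consider making it the default? The command conda install pytorch torchvision torchaudio pytorch-cuda=12.4 -c pytorch -c nvidia is the matching build I found at https://pytorch.org/get-started/locally/ for the torch==2.4.1+cu124 pinned in the dependencies.

    Download the required files (just follow the author's original instructions here)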

  4. Edge WebDriver

    • Download the archive and extract it into the edge_driver folder
    • Make sure the file structure looks like this:
      GOT-OCR-2-GUI
      └─edge_driver
          ├─msedgedriver.exe
          └─...
  5. Model files

    • Download them and place them in the models folder
    • Make sure the file structure looks like this:
      GOT-OCR-2-GUI
      └─models
          ├─config.json
          ├─generation_config.json
          ├─got_vision_b.py
          ├─model.safetensors
          ├─modeling_GOT.py
          ├─qwen.tiktoken
          ├─render_tools.py
          ├─special_tokens_map.json
          ├─tokenization_qwen.py
          └─tokenizer_config.json
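
A quick way to confirm steps 3 to 5 before launching is a small preflight script. The following is only a sketch: the file list is copied from the two layouts above, while the function names `missing_files` and `torch_status` are made up for illustration and are not part of GOT-OCR-2-GUI. It assumes it is run from the project root.

```python
import importlib.util
from pathlib import Path

# Paths copied from the layouts above, relative to the project root.
REQUIRED_FILES = [
    "edge_driver/msedgedriver.exe",
    "models/config.json",
    "models/generation_config.json",
    "models/got_vision_b.py",
    "models/model.safetensors",
    "models/modeling_GOT.py",
    "models/qwen.tiktoken",
    "models/render_tools.py",
    "models/special_tokens_map.json",
    "models/tokenization_qwen.py",
    "models/tokenizer_config.json",
]


def missing_files(repo_root):
    """Return the required files that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in REQUIRED_FILES if not (root / rel).is_file()]


def torch_status():
    """Describe the PyTorch install without crashing when torch is absent."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    try:
        import torch
    except Exception as exc:  # a broken install can fail at import time
        return f"torch import failed: {exc}"
    return f"torch {torch.__version__}, CUDA available: {torch.cuda.is_available()}"


if __name__ == "__main__":
    for rel in missing_files("."):
        print("missing:", rel)
    print(torch_status())
```

If the script prints no "missing:" lines and reports that CUDA is available, the layout and the PyTorch install at least match what this guide describes.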

Running the program

Run the following command:

python GUI.py

If you hit an OpenMP runtime error, you can try the following fix:

  1. Example error message:

    OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized.
    OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.
  2. Fix: before running the program, set the environment variable (PowerShell syntax):

    $env:KMP_DUPLICATE_LIB_OK = "TRUE"
  3. Then run the program again:

    python GUI.py
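
The same workaround can be applied from Python instead of the shell, which avoids retyping the variable in every new terminal. This is a minimal sketch, not something the project ships: `env_with_openmp_workaround` and `launch_gui` are hypothetical helper names, and the script assumes GUI.py is in the current directory. As the OMP hint itself warns, the variable is an unsafe workaround, so it is only set for the child process.

```python
import os
import subprocess
import sys


def env_with_openmp_workaround(base_env=None):
    """Return a copy of the environment with KMP_DUPLICATE_LIB_OK set to TRUE."""
    env = dict(os.environ if base_env is None else base_env)
    env["KMP_DUPLICATE_LIB_OK"] = "TRUE"
    return env


def launch_gui(script="GUI.py"):
    """Run the GUI script in a child process that carries the workaround."""
    # sys.executable keeps the child inside the active Conda environment.
    return subprocess.call([sys.executable, script], env=env_with_openmp_workaround())
```

Calling `launch_gui()` from the project directory then plays the role of steps 2 and 3 together; the parent shell's environment is left unchanged.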

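If you follow the steps above, you should be able to install and run GOT-OCR-2-GUI successfully. If you run into other problems, consult the project documentation or ask for further help.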

XJF2332 commented 2 months ago

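Very detailed, pinned! As for changing the default dependencies, I don't have another device to pin down the problem right now, so I'll first split torch out of requirements and see how that goes; if new issues still come up, I'll change the default 👍🏻👍🏻👍🏻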

602387193c commented 2 months ago

Yeah, honestly I just jot these things down roughly myself and then hand them to Claude to fill in and tidy up, which saves a lot of effort.

vank3f3 commented 6 days ago

What is the minimum GPU required for this project?

XJF2332 commented 3 days ago

1.4 GB when idle, 4.x GB with the model loaded, 7.3 GB while running OCR. I just measured this with plain OCR.
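
To compare a given card against these figures, a rough check could look like the sketch below. It assumes the 7.3 GB reading is the peak total VRAM in use during plain OCR on the maintainer's machine (the comment does not say whether the idle 1.4 GB is included), and it treats GB and GiB as interchangeable. The names `enough_vram` and `total_vram_gb` are illustrative only.

```python
import importlib.util

# Peak usage reported in the thread for plain OCR (assumed to be total VRAM in use).
OCR_PEAK_GB = 7.3


def enough_vram(total_gb, needed_gb=OCR_PEAK_GB):
    """True when the card's total memory covers the reported OCR peak."""
    return total_gb >= needed_gb


def total_vram_gb():
    """Total memory of GPU 0 in GiB, or None when torch or CUDA is unavailable."""
    if importlib.util.find_spec("torch") is None:
        return None
    try:
        import torch
    except Exception:  # a broken install can fail at import time
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(0).total_memory / 1024**3


if __name__ == "__main__":
    total = total_vram_gb()
    if total is None:
        print("No CUDA GPU visible to PyTorch")
    else:
        print(f"GPU 0: {total:.1f} GiB, enough for plain OCR: {enough_vram(total)}")
```

Other OCR modes or larger images may use more memory than the single measurement quoted here, so a passing check is only a rough indication.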