zyddnys / manga-image-translator

Translate manga/image: one-click translation of text in all kinds of images https://cotrans.touhou.ai/
GNU General Public License v3.0
5.43k stars 558 forks

[Bug]: Tried using config but seems I can't change the Translator from sugoi to groq #761

Open BlaskKilt opened 7 hours ago

BlaskKilt commented 7 hours ago

Issue

I'm not good with this config, so I made a minor edit and tried to run it. However, I'm having a hard time getting Groq to work. Can someone tell me how to do this correctly? ctd and mocr don't seem to be running either, or is it just me? I'm fairly new to this.

Command Line Arguments

{
  "$defs": {
    "Alignment": {
      "enum": [
        "auto",
        "left",
        "center",
        "right"
      ],
      "title": "Alignment",
      "type": "string"
    },
    "Colorizer": {
      "enum": [
        "none",
        "mc2"
      ],
      "title": "Colorizer",
      "type": "string"
    },
    "ColorizerConfig": {
      "properties": {
        "colorization_size": {
          "default": 576,
          "title": "Colorization Size",
          "type": "integer"
        },
        "denoise_sigma": {
          "default": 30,
          "title": "Denoise Sigma",
          "type": "integer"
        },
        "colorizer": {
          "$ref": "#/$defs/Colorizer",
          "default": "none"
        }
      },
      "title": "ColorizerConfig",
      "type": "object"
    },
    "Detector": {
      "enum": [
        "default",
        "dbconvnext",
        "ctd",
        "craft",
        "none"
      ],
      "title": "Detector",
      "type": "string"
    },
    "DetectorConfig": {
      "properties": {
        "detector": {
          "$ref": "#/$defs/Detector",
          "default": "ctd"
        },
        "detection_size": {
          "default": 1536,
          "title": "Detection Size",
          "type": "integer"
        },
        "text_threshold": {
          "default": 0.5,
          "title": "Text Threshold",
          "type": "number"
        },
        "det_rotate": {
          "default": false,
          "title": "Det Rotate",
          "type": "boolean"
        },
        "det_auto_rotate": {
          "default": false,
          "title": "Det Auto Rotate",
          "type": "boolean"
        },
        "det_invert": {
          "default": false,
          "title": "Det Invert",
          "type": "boolean"
        },
        "det_gamma_correct": {
          "default": false,
          "title": "Det Gamma Correct",
          "type": "boolean"
        },
        "box_threshold": {
          "default": 0.7,
          "title": "Box Threshold",
          "type": "number"
        },
        "unclip_ratio": {
          "default": 2.3,
          "title": "Unclip Ratio",
          "type": "number"
        }
      },
      "title": "DetectorConfig",
      "type": "object"
    },
    "Direction": {
      "enum": [
        "auto",
        "horizontal",
        "vertical"
      ],
      "title": "Direction",
      "type": "string"
    },
    "InpaintPrecision": {
      "enum": [
        "fp32",
        "fp16",
        "bf16"
      ],
      "title": "InpaintPrecision",
      "type": "string"
    },
    "Inpainter": {
      "enum": [
        "default",
        "lama_large",
        "lama_mpe",
        "sd",
        "none",
        "original"
      ],
      "title": "Inpainter",
      "type": "string"
    },
    "InpainterConfig": {
      "properties": {
        "inpainter": {
          "$ref": "#/$defs/Inpainter",
          "default": "lama_large"
        },
        "inpainting_size": {
          "default": 2048,
          "title": "Inpainting Size",
          "type": "integer"
        },
        "inpainting_precision": {
          "$ref": "#/$defs/InpaintPrecision",
          "default": "bf16"
        }
      },
      "title": "InpainterConfig",
      "type": "object"
    },
    "Ocr": {
      "enum": [
        "32px",
        "48px",
        "48px_ctc",
        "mocr"
      ],
      "title": "Ocr",
      "type": "string"
    },
    "OcrConfig": {
      "properties": {
        "use_mocr_merge": {
          "default": false,
          "title": "Use Mocr Merge",
          "type": "boolean"
        },
        "ocr": {
          "$ref": "#/$defs/Ocr",
          "default": "mocr"
        },
        "min_text_length": {
          "default": 0,
          "title": "Min Text Length",
          "type": "integer"
        },
        "ignore_bubble": {
          "default": 0,
          "title": "Ignore Bubble",
          "type": "integer"
        }
      },
      "title": "OcrConfig",
      "type": "object"
    },
    "RenderConfig": {
      "properties": {
        "renderer": {
          "$ref": "#/$defs/Renderer",
          "default": "default"
        },
        "alignment": {
          "$ref": "#/$defs/Alignment",
          "default": "auto"
        },
        "disable_font_border": {
          "default": false,
          "title": "Disable Font Border",
          "type": "boolean"
        },
        "font_size_offset": {
          "default": 0,
          "title": "Font Size Offset",
          "type": "integer"
        },
        "font_size_minimum": {
          "default": 16,
          "title": "Font Size Minimum",
          "type": "integer"
        },
        "direction": {
          "$ref": "#/$defs/Direction",
          "default": "auto"
        },
        "uppercase": {
          "default": false,
          "title": "Uppercase",
          "type": "boolean"
        },
        "lowercase": {
          "default": false,
          "title": "Lowercase",
          "type": "boolean"
        },
        "gimp_font": {
          "default": "Sans-serif",
          "title": "Gimp Font",
          "type": "string"
        },
        "no_hyphenation": {
          "default": false,
          "title": "No Hyphenation",
          "type": "boolean"
        },
        "font_color": {
          "anyOf": [
            {
              "type": "string"
            },
            {
              "type": "null"
            }
          ],
          "default": null,
          "title": "Font Color"
        },
        "line_spacing": {
          "anyOf": [
            {
              "type": "integer"
            },
            {
              "type": "null"
            }
          ],
          "default": null,
          "title": "Line Spacing"
        },
        "font_size": {
          "anyOf": [
            {
              "type": "integer"
            },
            {
              "type": "null"
            }
          ],
          "default": null,
          "title": "Font Size"
        }
      },
      "title": "RenderConfig",
      "type": "object"
    },
    "Renderer": {
      "enum": [
        "default",
        "manga2eng",
        "none"
      ],
      "title": "Renderer",
      "type": "string"
    },
    "Translator": {
      "enum": [
        "youdao",
        "baidu",
        "deepl",
        "papago",
        "caiyun",
        "gpt3",
        "gpt3.5",
        "gpt4",
        "none",
        "original",
        "sakura",
        "deepseek",
        "groq",
        "offline",
        "nllb",
        "nllb_big",
        "sugoi",
        "jparacrawl",
        "jparacrawl_big",
        "m2m100",
        "m2m100_big",
        "mbart50",
        "qwen2",
        "qwen2_big"
      ],
      "title": "Translator",
      "type": "string"
    },
    "TranslatorConfig": {
      "properties": {
        "translator": {
          "$ref": "#/$defs/Translator",
          "default": "groq"
        },
        "target_lang": {
          "default": "ENG",
          "title": "Target Lang",
          "type": "string"
        },
        "no_text_lang_skip": {
          "default": false,
          "title": "No Text Lang Skip",
          "type": "boolean"
        },
        "skip_lang": {
          "anyOf": [
            {
              "type": "string"
            },
            {
              "type": "null"
            }
          ],
          "default": null,
          "title": "Skip Lang"
        },
        "gpt_config": {
          "anyOf": [
            {
              "type": "string"
            },
            {
              "type": "null"
            }
          ],
          "default": null,
          "title": "Gpt Config"
        },
        "translator_chain": {
          "anyOf": [
            {
              "type": "string"
            },
            {
              "type": "null"
            }
          ],
          "default": null,
          "title": "Translator Chain"
        },
        "selective_translation": {
          "anyOf": [
            {
              "type": "string"
            },
            {
              "type": "null"
            }
          ],
          "default": null,
          "title": "Selective Translation"
        }
      },
      "title": "TranslatorConfig",
      "type": "object"
    },
    "UpscaleConfig": {
      "properties": {
        "upscaler": {
          "$ref": "#/$defs/Upscaler",
          "default": "esrgan"
        },
        "revert_upscaling": {
          "default": false,
          "title": "Revert Upscaling",
          "type": "boolean"
        },
        "upscale_ratio": {
          "anyOf": [
            {
              "type": "integer"
            },
            {
              "type": "null"
            }
          ],
          "default": null,
          "title": "Upscale Ratio"
        }
      },
      "title": "UpscaleConfig",
      "type": "object"
    },
    "Upscaler": {
      "enum": [
        "waifu2x",
        "esrgan",
        "4xultrasharp"
      ],
      "title": "Upscaler",
      "type": "string"
    }
  },
  "properties": {
    "filter_text": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Filter Text"
    },
    "render": {
      "$ref": "#/$defs/RenderConfig",
      "default": {
        "renderer": "default",
        "alignment": "auto",
        "disable_font_border": false,
        "font_size_offset": 0,
        "font_size_minimum": 16,
        "direction": "auto",
        "uppercase": false,
        "lowercase": false,
        "gimp_font": "Sans-serif",
        "no_hyphenation": false,
        "font_color": null,
        "line_spacing": 0.3,
        "font_size": null
      }
    },
    "upscale": {
      "$ref": "#/$defs/UpscaleConfig",
      "default": {
        "upscaler": "esrgan",
        "revert_upscaling": false,
        "upscale_ratio": null
      }
    },
    "translator": {
      "$ref": "#/$defs/TranslatorConfig",
      "default": {
        "translator": "groq",
        "target_lang": "ENG",
        "no_text_lang_skip": false,
        "skip_lang": null,
        "gpt_config": null,
        "translator_chain": null,
        "selective_translation": null
      }
    },
    "detector": {
      "$ref": "#/$defs/DetectorConfig",
      "default": {
        "detector": "default",
        "detection_size": 1536,
        "text_threshold": 0.5,
        "det_rotate": false,
        "det_auto_rotate": false,
        "det_invert": false,
        "det_gamma_correct": false,
        "box_threshold": 0.7,
        "unclip_ratio": 2.3
      }
    },
    "colorizer": {
      "$ref": "#/$defs/ColorizerConfig",
      "default": {
        "colorization_size": 576,
        "denoise_sigma": 30,
        "colorizer": "none"
      }
    },
    "inpainter": {
      "$ref": "#/$defs/InpainterConfig",
      "default": {
        "inpainter": "lama_large",
        "inpainting_size": 2048,
        "inpainting_precision": "bf16"
      }
    },
    "ocr": {
      "$ref": "#/$defs/OcrConfig",
      "default": {
        "use_mocr_merge": false,
        "ocr": "mocr",
        "min_text_length": 0,
        "ignore_bubble": 0
      }
    },
    "kernel_size": {
      "default": 5,
      "title": "Kernel Size",
      "type": "integer"
    },
    "mask_dilation_offset": {
      "default": 5,
      "title": "Mask Dilation Offset",
      "type": "integer"
    }
  },
  "title": "Config",
  "type": "object"
}
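
One thing worth checking: the JSON above has the shape of a JSON Schema (top-level `$defs`, `properties`, `title`, `type`), with the desired values written into `default` fields, while the console log below reports `DefaultDetector`, `Model48pxOCR` and `SugoiTranslator` rather than ctd, mocr and groq. The following is a minimal sketch, assuming (this is not confirmed in the thread) that the tool expects a config file holding plain values instead of a schema. `looks_like_schema` and `plain_config` are hypothetical names for illustration; the keys and values are copied from the schema above.

```python
import json


def looks_like_schema(data):
    """Heuristic: a JSON Schema describes fields under "properties"
    (often with "$defs") instead of holding the values themselves."""
    return isinstance(data, dict) and "properties" in data and (
        "$defs" in data or data.get("type") == "object"
    )


# Hypothetical plain-value config; key names are taken from the schema above.
# Assumption: the tool accepts a partial file and fills in the rest itself.
plain_config = {
    "translator": {"translator": "groq", "target_lang": "ENG"},
    "detector": {"detector": "ctd", "detection_size": 1536},
    "ocr": {"ocr": "mocr"},
}

if __name__ == "__main__":
    # Report whether the dictionary is schema-shaped, then show it as JSON.
    print(looks_like_schema(plain_config))
    print(json.dumps(plain_config, indent=2))
```

If the assumption holds, saving the printed JSON to a file and passing that file as the config would be the thing to try; comparing it against the project's example config is a reasonable cross-check.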

Console logs

root@For-src:~# tmux attach-session -t 5
anga/12.1.24/Raw-Zip.Com-Issho_Ni_Ken_No_Shugyo_Wo_Sh_ta_v01-02
[local] Namespace(verbose=True, attempts=0, ignore_errors=True, model_dir=None, use_gpu=False, use_gpu_limited=False, font_path='', pre_dict=None, post_dict=None, kernel_size=3, mode='local', input=['/root/Manga/12.1.24/Raw-Zip.Com-Issho_Ni_Ken_No_Shugyo_Wo_Sh_ta_v01-02'], dest='', format='jpg', overwrite=False, skip_no_text=False, use_mtpe=False, save_text=False, save_text_file='', prep_manual=False, save_quality=100, config_file='config/ctdmocr1536fix.json')
[local] Running in local mode
[local] Loading models
[local] Running text detection
[W1201 02:51:03.317190359 NNPACK.cpp:61] Could not initialize NNPACK! Reason: Unsupported hardware.
[DefaultDetector] Detection resolution: 1280x1536
[local] Running ocr
[Model48pxOCR] prob: 0.9896879196166992 アイネ fg: (0, 0, 0) bg: (0, 0, 0)
[Model48pxOCR] prob: 0.9728177785873413 あ…あぁ fg: (0, 0, 0) bg: (0, 0, 0)
[Model48pxOCR] prob: 0.838661789894104 昨日の fg: (1, 0, 0) bg: (1, 0, 0)
[Model48pxOCR] prob: 0.9929481744766235 どんな奴? fg: (0, 0, 0) bg: (0, 0, 0)
[Model48pxOCR] prob: 0.9982208609580994 ねぇ 今回の fg: (0, 0, 0) bg: (0, 0, 0)
[Model48pxOCR] prob: 0.3977530002593994 )なったのかも fg: (0, 0, 0) bg: (0, 0, 0)
[Model48pxOCR] prob: 0.805426299571991 昨日の鍛錬が fg: (0, 0, 0) bg: (0, 0, 0)
[Model48pxOCR] prob: 0.999868631362915 目的の魔物って fg: (0, 0, 0) bg: (0, 0, 0)
[Model48pxOCR] prob: 0.9999195337295532 意気込んでるな fg: (0, 0, 0) bg: (0, 0, 0)
[Model48pxOCR] prob: 0.9999734163284302 名前の魔物だね fg: (0, 0, 0) bg: (0, 0, 0)
[Model48pxOCR] prob: 0.9612351059913635 グライ・ベアって fg: (0, 0, 0) bg: (0, 0, 0)
[local] No pre-translation replacements made.
[local] Running text translation
[SugoiTranslator] Translating into English
^C^C[SugoiTranslator] 0: アイネ意気込んでるな => Aine, you're really into this.
[SugoiTranslator] 1: 昨日の鍛錬が => Yesterday's training.
[SugoiTranslator] 2: )なったのかも => ) Maybe it did.
[SugoiTranslator] 3: 昨日の => Yesterday's.
[SugoiTranslator] 4: ねぇ 今回の目的の魔物ってどんな奴? => Hey, what kind of monster are we looking for this time?
[SugoiTranslator] 5: あ…あぁグライ・ベアって名前の魔物だね => Ah... Oh, grai It’s a monster with the name Bear

BlaskKilt commented 4 hours ago

I think the config is not working in local mode? I also tested the example config, and it says "Failed to load":

[local] Running in local mode
Failed to load configuration file

frederik-uni commented 38 minutes ago

Nothing is logged about why it fails (I don't know why it isn't logged, but it isn't), but the parser works fine: both TOML and JSON parse. I can't test Groq, but based on the error below I assume it should work.

❯ python -m manga_translator local -i imgs --config-file examples/config-example.json
[local] Running in local mode
[local] Loading models
Please set the GROQ_API_KEY environment variable before using the Groq translator.
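
The message above indicates that the Groq translator expects a `GROQ_API_KEY` environment variable. Here is a minimal sketch for checking it before a run. `groq_key_is_set` is a hypothetical helper, not part of the project, and how the project itself reads the variable is not shown in this thread.

```python
import os


def groq_key_is_set(environ=os.environ):
    """Return True if GROQ_API_KEY is present and not blank."""
    return bool(environ.get("GROQ_API_KEY", "").strip())


if __name__ == "__main__":
    if groq_key_is_set():
        print("GROQ_API_KEY is set")
    else:
        print("GROQ_API_KEY is missing; set it in the shell before running")
```

The variable has to be set in the same shell or tmux session that launches the translator, otherwise the process will not see it.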

I confirmed with another translator that the selection works properly. It probably goes without saying, but make sure the config file exists and is valid; malformed data will fail. One validator is https://www.jsonschemavalidator.net, but there are many others, and every text editor has an extension for this too.
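
The existence and validity check described above can be sketched as follows. This is a minimal sketch covering JSON only (not TOML); `check_config_text` and `check_config_file` are hypothetical helper names, not part of manga-image-translator, and the tool's own loader may apply further rules.

```python
import json
from pathlib import Path


def check_config_text(text):
    """Return (ok, message) for the contents of a JSON config file."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return False, f"malformed JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"
    if not isinstance(data, dict):
        return False, "top-level JSON value must be an object"
    return True, "ok"


def check_config_file(path):
    """Return (ok, message); fails early if the file does not exist."""
    p = Path(path)
    if not p.is_file():
        return False, f"config file not found: {p}"
    return check_config_text(p.read_text(encoding="utf-8"))


if __name__ == "__main__":
    import sys

    # Usage: python check_config.py path/to/config.json
    ok, message = check_config_file(sys.argv[1])
    print(message)
    sys.exit(0 if ok else 1)
```

A check like this only confirms that the file exists and parses; it does not tell you whether the keys are the ones the tool actually reads.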

frederik-uni commented 37 minutes ago

Ah, I know why: it's because of a sketchy fix for the "object has no attribute 'textlines'" error.