VoxelCubes / PanelCleaner

An AI-powered tool to clean manga panels.
GNU General Public License v3.0
202 stars, 16 forks

Some Text Still Remains #99

Closed: MrFanservice-png closed this issue 1 month ago

MrFanservice-png commented 1 month ago

I use the tool to remove all text from manga pages, but some of it still remains. I simply dragged and dropped the images and selected Start to clean them, and it should have removed all of the text. How do I fix that?

2024-07-04 05:17:53.607 | INFO | pcleaner.gui.launcher:launch:100 -
2024-07-04 05:17:53.622 | INFO | pcleaner.gui.launcher:launch:114 - Using locale en_US.
2024-07-04 05:17:53.624 | DEBUG | pcleaner.gui.launcher:launch:121 - Loaded built-in Qt translations for en_US.
2024-07-04 05:17:53.624 | DEBUG | pcleaner.gui.launcher:launch:129 - Loaded built-in Qt base translations for en_US.
2024-07-04 05:18:00.281 | INFO | manga_ocr.ocr:init:13 - Loading OCR model from kha-white/manga-ocr-base
2024-07-04 05:18:18.404 | INFO | manga_ocr.ocr:init:25 - Using CPU
2024-07-04 05:18:19.074 | INFO | manga_ocr.ocr:init:32 - OCR ready
2024-07-04 05:18:19.075 | INFO | pcleaner.gui.model_downloader_driver:check_finished:125 - Finished downloading all models.
2024-07-04 05:18:19.090 | DEBUG | pcleaner.gui.mainwindow_driver:initialize_ui:210 - Purging missing profiles.
2024-07-04 05:18:19.090 | INFO | pcleaner.gui.mainwindow_driver:initialize_profiles:863 - Found profiles: [('Default', None)]
2024-07-04 05:18:19.090 | DEBUG | pcleaner.config:load_profile:1152 - Loading profile None...
2024-07-04 05:18:19.092 | DEBUG | pcleaner.config:load_profile:1159 - Loading builtin default profile
2024-07-04 05:18:19.219 | DEBUG | pcleaner.gui.mainwindow_driver:load_current_profile:1021 - Loading current profile.
2024-07-04 05:18:19.219 | DEBUG | pcleaner.gui.profile_parser:set_profile_values:432 - Setting profile values
2024-07-04 05:18:19.226 | DEBUG | pcleaner.gui.mainwindow_driver:initialize_analytics_view:630 - Loading included font from C:\Users\\Documents\PanelCleaner_internal\pcleaner\data\NotoMono-Regular.ttf
2024-07-04 05:18:19.232 | DEBUG | pcleaner.gui.mainwindow_driver:initialize_analytics_view:633 - Loaded included font
2024-07-04 05:18:19.234 | DEBUG | pcleaner.gui.mainwindow_driver:save_default_palette:129 - Placeholder color: #000000
2024-07-04 05:18:19.242 | INFO | pcleaner.gui.mainwindow_driver:set_theme:148 - Using system theme.
2024-07-04 05:18:19.243 | INFO | pcleaner.gui.mainwindow_driver:changeEvent:198 - Theme is dark: False
2024-07-04 05:18:19.246 | INFO | pcleaner.gui.mainwindow_driver:set_theme:169 - Theme is dark: False
2024-07-04 05:18:19.278 | INFO | pcleaner.gui.mainwindow_driver:changeEvent:198 - Theme is dark: False
2024-07-04 05:18:19.288 | DEBUG | pcleaner.gui.mainwindow_driver:post_init:412 - Char width: 6, columns: 74, required width: 444
2024-07-04 05:18:19.291 | DEBUG | pcleaner.gui.mainwindow_driver:post_init:443 - Splitter sizes: [375, 423, 463]
2024-07-04 05:18:19.291 | DEBUG | pcleaner.gui.mainwindow_driver:start_initialization_worker:561 - Worker Thread cleaning cache
2024-07-04 05:18:19.292 | DEBUG | pcleaner.gui.mainwindow_driver:start_initialization_worker:568 - Worker Thread loading OCR model.
2024-07-04 05:18:19.293 | ERROR | pcleaner.ocr.ocr_tesseract:available_langs:24 - Error checking Tesseract available language data: tesseract is not installed or it's not in your PATH. See README file for more information.
2024-07-04 05:18:19.294 | INFO | pcleaner.ocr.ocr_mangaocr:new:15 - Creating the MangaOcr instance
2024-07-04 05:18:19.294 | INFO | pcleaner.ocr.ocr_mangaocr:new:15 - Creating the MangaOcr instance
2024-07-04 05:18:19.294 | INFO | manga_ocr.ocr:init:13 - Loading OCR model from kha-white/manga-ocr-base
2024-07-04 05:18:20.994 | INFO | manga_ocr.ocr:init:25 - Using CPU
2024-07-04 05:18:21.641 | INFO | manga_ocr.ocr:init:32 - OCR ready
2024-07-04 05:18:21.643 | INFO | pcleaner.gui.mainwindow_driver:load_ocr_model:585 - Loaded OCR model (2.35s)
2024-07-04 05:18:37.331 | DEBUG | pcleaner.gui.file_table:handleDrop:197 - Dropped C:/Users//Downloads/Keijo!!!!!!!!/Keijo!!!!!!!! v03 [Akito]/187.png
2024-07-04 05:18:37.332 | DEBUG | pcleaner.gui.file_table:add_file:246 - Requesting to add "C:\Users\\Downloads\Keijo!!!!!!!!\Keijo!!!!!!!! v03 [Akito]\187.png"
2024-07-04 05:18:37.332 | DEBUG | pcleaner.gui.file_table:repopulate_table:303 - Repopulating table
2024-07-04 05:18:37.334 | INFO | pcleaner.gui.file_table:lazy_load_images:619 - Dispatching image loading workers
2024-07-04 05:18:37.335 | DEBUG | pcleaner.gui.file_table:lazy_load_images:625 - Worker Thread loading image C:\Users\\Downloads\Keijo!!!!!!!!\Keijo!!!!!!!! v03 [Akito]\187.png
2024-07-04 05:18:42.637 | INFO | pcleaner.gui.mainwindow_driver:start_cleaning:1317 - Requested outputs: [<Output.denoised_output: 19>, <Output.write_output: 22>]
2024-07-04 05:18:42.638 | INFO | pcleaner.gui.processing:generate_output:176 - Running text detection AI model for 1 images...
2024-07-04 05:18:44.622 | DEBUG | pcleaner.ctd_interface:process_image:195 - Saving json file to C:\Users\\AppData\Roaming\pcleaner\cache\cleaner\8dddcb48-8a25-4913-a15a-5f1e1336f240_187#raw.json
2024-07-04 05:18:44.629 | DEBUG | pcleaner.image_ops:visualize_raw_boxes:894 - Loading included font from C:\Users\\Documents\PanelCleaner_internal\pcleaner\data\LiberationSans-Regular.ttf
2024-07-04 05:18:44.843 | INFO | pcleaner.gui.processing:generate_output:228 - Running preprocessing for 1 images...
2024-07-04 05:18:44.844 | DEBUG | pcleaner.preprocessor:prep_json_file:120 - Processing json file: C:\Users\\AppData\Roaming\pcleaner\cache\cleaner\8dddcb48-8a25-4913-a15a-5f1e1336f240_187#raw.json
2024-07-04 05:18:44.845 | DEBUG | pcleaner.preprocessor:prep_json_file:161 - Detected lang: ja
2024-07-04 05:18:44.864 | INFO | manga_ocr.ocr:init:13 - Loading OCR model from kha-white/manga-ocr-base
2024-07-04 05:18:46.821 | INFO | manga_ocr.ocr:init:25 - Using CPU
2024-07-04 05:18:47.451 | INFO | manga_ocr.ocr:init:32 - OCR ready
2024-07-04 05:18:49.348 | INFO | pcleaner.gui.processing:generate_output:290 - Running masker for 1 images...
2024-07-04 05:18:49.349 | INFO | pcleaner.gui.mainwindow_driver:show_current_progress:1529 - Showing ocr analytics...
2024-07-04 05:18:51.480 | INFO | pcleaner.gui.processing:generate_output:404 - Running denoiser for 1 images...
2024-07-04 05:18:51.480 | INFO | pcleaner.gui.mainwindow_driver:show_current_progress:1538 - Showing masker analytics...
2024-07-04 05:18:52.062 | INFO | pcleaner.gui.processing:generate_output:602 - Finished processing 1 images.
2024-07-04 05:18:52.062 | INFO | pcleaner.gui.mainwindow_driver:show_current_progress:1548 - Showing denoiser analytics...
2024-07-04 05:18:52.098 | DEBUG | pcleaner.image_ops:save_optimized:858 - Saving image 187_clean.png with kwargs: {'optimize': True, 'compress_level': 9, 'dpi': (72.009, 72.009)}
2024-07-04 05:19:27.833 | INFO | pcleaner.gui.mainwindow_driver:output_worker_result:1461 - Output worker finished.
2024-07-04 05:19:28.661 | DEBUG | pcleaner.gui.image_details_driver:init:127 - Opening details tab for C:\Users\\Downloads\Keijo!!!!!!!!\Keijo!!!!!!!! v03 [Akito]\187.png
2024-07-04 05:19:28.675 | DEBUG | pcleaner.gui.image_details_driver:init_sidebar:360 - Setting scroll area width to 281
2024-07-04 05:19:30.520 | DEBUG | pcleaner.gui.image_tab:tab_close:88 - Closing tab at index 1.
2024-07-04 05:19:32.284 | DEBUG | pcleaner.gui.file_table:remove_selected_file:285 - Removing selected file. Auto-selected row: False
2024-07-04 05:19:36.261 | DEBUG | pcleaner.gui.file_table:handleDrop:197 - Dropped C:/Users//Downloads/Keijo!!!!!!!! v01-18 (2013-2017) (Digital SD) (KG Manga)/Keijo!!!!!!!! v03 (2014) (Digital SD) (KG Manga)/page0187.jpeg
2024-07-04 05:19:36.261 | DEBUG | pcleaner.gui.file_table:add_file:246 - Requesting to add "C:\Users\\Downloads\Keijo!!!!!!!! v01-18 (2013-2017) (Digital SD) (KG Manga)\Keijo!!!!!!!! v03 (2014) (Digital SD) (KG Manga)\page0187.jpeg"
2024-07-04 05:19:36.262 | DEBUG | pcleaner.gui.file_table:repopulate_table:303 - Repopulating table
2024-07-04 05:19:36.265 | INFO | pcleaner.gui.file_table:lazy_load_images:619 - Dispatching image loading workers
2024-07-04 05:19:36.265 | DEBUG | pcleaner.gui.file_table:lazy_load_images:625 - Worker Thread loading image C:\Users\\Downloads\Keijo!!!!!!!! v01-18 (2013-2017) (Digital SD) (KG Manga)\Keijo!!!!!!!! v03 (2014) (Digital SD) (KG Manga)\page0187.jpeg
2024-07-04 05:19:41.921 | INFO | pcleaner.gui.mainwindow_driver:start_cleaning:1317 - Requested outputs: [<Output.denoised_output: 19>, <Output.write_output: 22>]
2024-07-04 05:19:41.922 | INFO | pcleaner.gui.processing:generate_output:176 - Running text detection AI model for 1 images...
2024-07-04 05:19:43.466 | DEBUG | pcleaner.ctd_interface:process_image:195 - Saving json file to C:\Users\\AppData\Roaming\pcleaner\cache\cleaner\0084da9d-03b9-43bb-b3bb-c714aa804a28_page0187#raw.json
2024-07-04 05:19:43.472 | DEBUG | pcleaner.image_ops:visualize_raw_boxes:894 - Loading included font from C:\Users\\Documents\PanelCleaner_internal\pcleaner\data\LiberationSans-Regular.ttf
2024-07-04 05:19:43.669 | INFO | pcleaner.gui.processing:generate_output:228 - Running preprocessing for 1 images...
2024-07-04 05:19:43.670 | DEBUG | pcleaner.preprocessor:prep_json_file:120 - Processing json file: C:\Users\\AppData\Roaming\pcleaner\cache\cleaner\0084da9d-03b9-43bb-b3bb-c714aa804a28_page0187#raw.json
2024-07-04 05:19:43.671 | DEBUG | pcleaner.preprocessor:prep_json_file:161 - Detected lang: ja
2024-07-04 05:19:44.123 | INFO | pcleaner.gui.processing:generate_output:290 - Running masker for 1 images...
2024-07-04 05:19:44.123 | INFO | pcleaner.gui.mainwindow_driver:show_current_progress:1529 - Showing ocr analytics...
2024-07-04 05:19:44.555 | INFO | pcleaner.gui.processing:generate_output:404 - Running denoiser for 1 images...
2024-07-04 05:19:44.555 | INFO | pcleaner.gui.mainwindow_driver:show_current_progress:1538 - Showing masker analytics...
2024-07-04 05:19:45.148 | INFO | pcleaner.gui.processing:generate_output:602 - Finished processing 1 images.
2024-07-04 05:19:45.148 | INFO | pcleaner.gui.mainwindow_driver:show_current_progress:1548 - Showing denoiser analytics...
2024-07-04 05:19:45.166 | DEBUG | pcleaner.image_ops:save_optimized:858 - Saving image page0187_clean.jpeg with kwargs: {'optimize': True, 'quality': 95, 'progressive': True, 'dpi': (72, 72)}
2024-07-04 05:19:47.056 | INFO | pcleaner.gui.mainwindow_driver:output_worker_result:1461 - Output worker finished.
2024-07-04 05:25:00.590 | DEBUG | pcleaner.gui.mainwindow_driver:open_issue_reporter:760 - Opening issue reporter.
2024-07-04 05:25:41.072 | DEBUG | pcleaner.gui.gui_utils:open_file:115 - Opening file C:\Users\\AppData\Roaming\pcleaner\pcleanerconfig.ini
2024-07-04 05:25:45.651 | DEBUG | pcleaner.gui.mainwindow_driver:open_issue_reporter:760 - Opening issue reporter.

[Image: page0187_clean]

[Image: 187_clean]

VoxelCubes commented 1 month ago

Panel Cleaner doesn't clean everything. It can only clean as well as its machine learning models perform. These aren't perfect and sometimes make mistakes, so Panel Cleaner errs on the conservative side to avoid cleaning up things that aren't text. The text in that lower left bubble is a bit too close to the other bubble. Look at this debug output from the details section, which you get by clicking on one of the images you dragged into Panel Cleaner:

[Image: 2024-07-04_17-53]

Those are the masks it attempted to use, but even the smallest one still overlaps with the other bubble, so instead of taking a chunk out of that other bubble's border, it just skips it, leaving that for a human who has more precision. If instead just a little bite were taken out of the other bubble, that would probably be harder to notice and fix, so Panel Cleaner doesn't do that by default.
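The skip-or-clean decision described above can be sketched roughly as follows. This is a simplified illustration, not Panel Cleaner's actual code: it assumes each candidate mask is judged by how much the pixel brightness varies along its outer edge (an edge that crosses a neighboring bubble's border varies a lot), and the function name `pick_mask` and its parameters are made up for the example.

```python
from statistics import pstdev


def pick_mask(border_samples, max_std_dev):
    """Return the index of the best candidate mask, or None to skip the bubble.

    border_samples: one non-empty list of grayscale values (0-255) per
    candidate mask, sampled along that mask's outer edge (hypothetical input).
    max_std_dev: the largest edge variation that is still accepted.
    """
    best_index = None
    best_dev = None
    for index, samples in enumerate(border_samples):
        dev = pstdev(samples)
        if dev > max_std_dev:
            # The edge crosses something dark, e.g. another bubble's border.
            continue
        if best_dev is None or dev < best_dev:
            best_index, best_dev = index, dev
    return best_index


# Every candidate edge touches a dark border: nothing is cleaned.
print(pick_mask([[255, 255, 0, 0], [255, 0, 255, 0]], max_std_dev=15))

# A very permissive threshold accepts the least bad candidate instead.
print(pick_mask([[255, 255, 0, 0], [255, 255, 255, 0]], max_std_dev=200))
```

With a strict threshold the bubble is left for a human to fix; with a permissive one, a mask that bites into the neighboring border gets used anyway.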

If you look at the profile settings, mainly in the masker section, you can tweak the mask max standard deviation, which is the maximum overlap at which it gives up. Just set it really high if you'd rather have more cleaning, at the risk of little chunks being taken out of nearby sections. Be sure to hit apply for the profile before running the cleaning again.
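If you'd rather change the value in a saved profile file than in the GUI, something like this sketch could work. The section name `Masker` and the key `mask_max_standard_deviation` are assumptions based on the setting's label, and the profile text here is a made-up stand-in, so check your own saved profile for the real names before relying on it. Note also that Python's `configparser` drops comments when it writes a file back out.

```python
import configparser
import io

# Hypothetical profile contents; a real saved profile may use different names.
PROFILE_TEXT = """\
[Masker]
mask_max_standard_deviation = 15
"""


def raise_mask_threshold(profile_text, new_value):
    """Return the profile text with the masker threshold set to new_value."""
    parser = configparser.ConfigParser()
    parser.read_string(profile_text)
    # Assumed section and key names, see the note above.
    parser["Masker"]["mask_max_standard_deviation"] = str(new_value)
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


updated = raise_mask_threshold(PROFILE_TEXT, 100)
print(updated)
```

Setting it in the GUI and hitting apply is still the simpler route; this is only useful if you want to script the change.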

You can tweak other settings too, save the profile, and even make that the default. Hope that helps!