bajuwa / ComicCompiler

Scripts that help combine/splice webtoon images into a smaller collection of pages
GNU General Public License v3.0
15 stars 5 forks source link

Improve the breakpoint search time for the Mode 1 algorithm #8

Open bajuwa opened 4 years ago

bajuwa commented 4 years ago

Problem The current BDM#1 ("Breakpoint Detection Mode #1") uses a brute force O(n) search method when trying to find a breakpoint. This can be slow if the images are long and there's not many opportunities to find a breakpoint.

Resolving this ticket may involve creating a new breakpoint mode instead of altering the existing mode 1.

Testing Conditions

bajuwa commented 4 years ago

Proposed Solution After doing a simple 'does image start with breakpoint', proceed to use a binary search style algorithm that roughly follows these steps when trying to find breakpoints:

  1. Define 'chunk to check' as a text value using the imagemagick sample format "[WxH+WO+HO]"
  2. Retain a FILO list of 'chunks to check'
  3. Retain a single 'current chunk to check'
  4. Perform a 'vertical breakpoint detection' call on the whole image as a single chunk a. If the chunk contains a potential breakpoint, split the chunk in half, keep the first chunk as 'current chunk to check' and add the other to the 'chunks to check' list b. If the chunk does not contain a potential breakpoint, pop the last item (ie most recently added) off the 'chunks to check' list as the new 'current chunk to check'
  5. Repeat step 3 until either: a. The most recently checked chunk is $[Break Point Row Check Increments] pixels high or less. Check the top and middle rows of the chunk for explicit horizontal breakpoint. Skip checking bottom because that's essentially the same as checking the top of the next chunk. If check succeeds, return row number. If they all fail, treat the same as 3a b. both the 'current chunk to check' and the list are empty; in this case no breakpoint was found in the image

Variable can be removed: $BREAK_POINT_ROW_CHECK_BATCH_MULTIPLIER

I'm an idiot. This will provide a O(n^2) worst case run time instead of the existing O(n), and the likely frequency of the worst case happening is roughly equal between brute force and binary.

New proposed solution: just add 1 vertical full-height check for whitespace before proceeding with the existing brute force algorithm

bajuwa commented 4 years ago

That still didn't really help with the problematic tests/breakpoint-detection-mode/ test case. Deferring this ticket until later.