gotson / komga

Media server for comics/mangas/BDs/magazines/eBooks with API, OPDS and Kobo Sync support
https://komga.org
MIT License

[Bug] Book analysis is slow for webp images #279

Closed ghost closed 3 years ago

ghost commented 4 years ago

Komga environment

Describe the bug

After noticing that a library scan on a fresh installation of Komga had been stuck on counting the number of pages inside my CBZ files for several hours earlier today, I aborted the scan and set up a new installation of Komga for this bug report. I continued by creating a new library containing only a single 10MB/35 pages CBZ file, started a library scan and observed that it took Komga almost two minutes to complete the "AnalyzeBook" task. The thumbnail generation task, meanwhile, finished in just a couple of seconds.

Steps to reproduce

  1. Create a fresh installation of Komga v0.57.0.
  2. Download this .zip file, rename it to .cbz and create a new folder for it: Locke & Key - Welcome to Lovecraft v01.zip
  3. Create a new library containing only this CBZ file.
  4. Wait a few minutes for Komga to complete the library scan.
  5. Check the log files.

Expected behavior

Komga only takes a fraction of a second to finish counting the number of pages inside the CBZ file.

Actual behavior

Komga takes almost two minutes to finish counting the number of pages inside the CBZ file. While testing different CBZ files in my collection, I observed that this time increases linearly with the number of pages. For example, it took Komga almost 20 minutes to finish scanning a 70MB/221 pages CBZ file.

Additional context

I think this issue is at least tangentially related to #278. I've been using Komga with the same media collection on the same hardware since about December 2019 and never had any issues with these files. In older versions, it only took Komga a fraction of a second each to finish the "AnalyzeBook" task for the CBZ files in my collection.

Edit: It looks like this issue does not affect PDF files. Komga only takes 270ms to finish scanning a 50+ pages PDF file for me.

Log file

I've noticed that Komga prints the error message ERROR 30813 --- [DefaultMessageListenerContainer-1] unknown.jul.logger: TODO exactly once for each corresponding page in the CBZ file, for a total of 35 times.


```
2020-08-20 15:03:37.867 INFO 30813 --- [http-nio-7264-exec-7] o.g.k.domain.service.LibraryLifecycle : Adding new library: Comics with root folder: file:/mnt/storage/media/literature/Comics/
2020-08-20 15:03:37.899 INFO 30813 --- [http-nio-7264-exec-7] o.g.k.application.tasks.TaskReceiver : Sending task: ScanLibrary(libraryId=02AV46FH6BCM3)
2020-08-20 15:03:37.995 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.komga.application.tasks.TaskHandler : Executing task: ScanLibrary(libraryId=02AV46FH6BCM3)
2020-08-20 15:03:38.008 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.komga.domain.service.LibraryScanner : Updating library: Library(name=Comics, root=file:/mnt/storage/media/literature/Comics/, importComicInfoBook=true, importComicInfoSeries=true, importComicInfoCollection=true, importComicInfoReadList=true, importEpubBook=true, importEpubSeries=true, importLocalArtwork=true, scanForceModifiedTime=false, scanDeep=false, id=02AV46FH6BCM3, createdDate=2020-08-20T15:03:37, lastModifiedDate=2020-08-20T15:03:37)
2020-08-20 15:03:38.009 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.k.domain.service.FileSystemScanner : Scanning folder: /mnt/storage/media/literature/Comics
2020-08-20 15:03:38.010 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.k.domain.service.FileSystemScanner : Supported extensions: [cbz, zip, cbr, rar, pdf, epub]
2020-08-20 15:03:38.012 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.k.domain.service.FileSystemScanner : Excluded patterns: [#recycle, @eaDir]
2020-08-20 15:03:38.012 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.k.domain.service.FileSystemScanner : Force directory modified time: false
2020-08-20 15:03:38.038 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.k.domain.service.FileSystemScanner : Scanned 1 series and 1 books in 20.9ms
2020-08-20 15:03:38.044 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.komga.domain.service.LibraryScanner : Adding new series: Series(name=Locke & Key, url=file:/mnt/storage/media/literature/Comics/Locke%20&%20Key/, fileLastModified=2020-08-20T15:01:09.700904, id=02AV46G66B9VF, libraryId=02AV46FH6BCM3, createdDate=2020-08-20T15:03:38.033306, lastModifiedDate=2020-08-20T15:03:38.033311)
2020-08-20 15:03:38.133 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.k.application.tasks.TaskReceiver : Sending task: RefreshBookMetadata(bookId=02AV46G62B30H)
2020-08-20 15:03:38.139 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.komga.domain.service.LibraryScanner : Library updated in 130ms
2020-08-20 15:03:38.144 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.k.application.tasks.TaskReceiver : Sending task: AnalyzeBook(bookId=02AV46G62B30H)
2020-08-20 15:03:38.148 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.komga.application.tasks.TaskHandler : Task ScanLibrary(libraryId=02AV46FH6BCM3) executed in 147ms
2020-08-20 15:03:38.159 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.komga.application.tasks.TaskHandler : Executing task: RefreshBookMetadata(bookId=02AV46G62B30H)
2020-08-20 15:03:38.162 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.k.domain.service.MetadataLifecycle : Refresh metadata for book: Book(name=Locke & Key - Welcome to Lovecraft v01, url=file:/mnt/storage/media/literature/Comics/Locke%20&%20Key/Locke%20&%20Key%20-%20Welcome%20to%20Lovecraft%20v01.cbz, fileLastModified=2020-03-23T10:36:42, fileSize=10309532, number=1, id=02AV46G62B30H, seriesId=02AV46G66B9VF, libraryId=02AV46FH6BCM3, createdDate=2020-08-20T15:03:38, lastModifiedDate=2020-08-20T15:03:38.118)
2020-08-20 15:03:38.176 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.k.i.m.l.LocalArtworkProvider : Looking for local thumbnails for book: Book(name=Locke & Key - Welcome to Lovecraft v01, url=file:/mnt/storage/media/literature/Comics/Locke%20&%20Key/Locke%20&%20Key%20-%20Welcome%20to%20Lovecraft%20v01.cbz, fileLastModified=2020-03-23T10:36:42, fileSize=10309532, number=1, id=02AV46G62B30H, seriesId=02AV46G66B9VF, libraryId=02AV46FH6BCM3, createdDate=2020-08-20T15:03:38, lastModifiedDate=2020-08-20T15:03:38.118)
2020-08-20 15:03:38.183 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.k.application.tasks.TaskReceiver : Sending task: RefreshSeriesMetadata(seriesId=02AV46G66B9VF)
2020-08-20 15:03:38.186 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.komga.application.tasks.TaskHandler : Task RefreshBookMetadata(bookId=02AV46G62B30H) executed in 26.9ms
2020-08-20 15:03:38.191 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.komga.application.tasks.TaskHandler : Executing task: AnalyzeBook(bookId=02AV46G62B30H)
2020-08-20 15:03:38.195 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.komga.domain.service.BookLifecycle : Analyze and persist book: Book(name=Locke & Key - Welcome to Lovecraft v01, url=file:/mnt/storage/media/literature/Comics/Locke%20&%20Key/Locke%20&%20Key%20-%20Welcome%20to%20Lovecraft%20v01.cbz, fileLastModified=2020-03-23T10:36:42, fileSize=10309532, number=1, id=02AV46G62B30H, seriesId=02AV46G66B9VF, libraryId=02AV46FH6BCM3, createdDate=2020-08-20T15:03:38, lastModifiedDate=2020-08-20T15:03:38.118)
2020-08-20 15:03:38.196 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.komga.domain.service.BookAnalyzer : Trying to analyze book: Book(name=Locke & Key - Welcome to Lovecraft v01, url=file:/mnt/storage/media/literature/Comics/Locke%20&%20Key/Locke%20&%20Key%20-%20Welcome%20to%20Lovecraft%20v01.cbz, fileLastModified=2020-03-23T10:36:42, fileSize=10309532, number=1, id=02AV46G62B30H, seriesId=02AV46G66B9VF, libraryId=02AV46FH6BCM3, createdDate=2020-08-20T15:03:38, lastModifiedDate=2020-08-20T15:03:38.118)
2020-08-20 15:03:38.252 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.komga.domain.service.BookAnalyzer : Detected media type: application/zip
2020-08-20 15:03:38.581 ERROR 30813 --- [DefaultMessageListenerContainer-1] unknown.jul.logger : TODO [note: this error message is printed exactly 35 times]
2020-08-20 15:05:30.078 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.komga.domain.service.BookAnalyzer : Book has 35 pages
2020-08-20 15:05:30.095 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.k.application.tasks.TaskReceiver : Sending task: GenerateBookThumbnail(bookId=02AV46G62B30H)
2020-08-20 15:05:30.102 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.k.application.tasks.TaskReceiver : Sending task: RefreshBookMetadata(bookId=02AV46G62B30H)
2020-08-20 15:05:30.106 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.komga.application.tasks.TaskHandler : Task AnalyzeBook(bookId=02AV46G62B30H) executed in 112s
2020-08-20 15:05:30.122 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.komga.application.tasks.TaskHandler : Executing task: RefreshSeriesMetadata(seriesId=02AV46G66B9VF)
2020-08-20 15:05:30.124 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.k.domain.service.MetadataLifecycle : Refresh metadata for series: Series(name=Locke & Key, url=file:/mnt/storage/media/literature/Comics/Locke%20&%20Key/, fileLastModified=2020-08-20T15:01:09.700, id=02AV46G66B9VF, libraryId=02AV46FH6BCM3, createdDate=2020-08-20T15:03:38, lastModifiedDate=2020-08-20T15:03:38)
2020-08-20 15:05:30.171 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.k.i.m.l.LocalArtworkProvider : Looking for local thumbnails for series: Series(name=Locke & Key, url=file:/mnt/storage/media/literature/Comics/Locke%20&%20Key/, fileLastModified=2020-08-20T15:01:09.700, id=02AV46G66B9VF, libraryId=02AV46FH6BCM3, createdDate=2020-08-20T15:03:38, lastModifiedDate=2020-08-20T15:03:38)
2020-08-20 15:05:30.176 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.komga.application.tasks.TaskHandler : Task RefreshSeriesMetadata(seriesId=02AV46G66B9VF) executed in 53.8ms
2020-08-20 15:05:30.180 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.komga.application.tasks.TaskHandler : Executing task: GenerateBookThumbnail(bookId=02AV46G62B30H)
2020-08-20 15:05:30.182 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.komga.domain.service.BookLifecycle : Generate thumbnail and persist for book: Book(name=Locke & Key - Welcome to Lovecraft v01, url=file:/mnt/storage/media/literature/Comics/Locke%20&%20Key/Locke%20&%20Key%20-%20Welcome%20to%20Lovecraft%20v01.cbz, fileLastModified=2020-03-23T10:36:42, fileSize=10309532, number=1, id=02AV46G62B30H, seriesId=02AV46G66B9VF, libraryId=02AV46FH6BCM3, createdDate=2020-08-20T15:03:38, lastModifiedDate=2020-08-20T15:03:38.118)
2020-08-20 15:05:30.183 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.komga.domain.service.BookAnalyzer : Generate thumbnail for book: Book(name=Locke & Key - Welcome to Lovecraft v01, url=file:/mnt/storage/media/literature/Comics/Locke%20&%20Key/Locke%20&%20Key%20-%20Welcome%20to%20Lovecraft%20v01.cbz, fileLastModified=2020-03-23T10:36:42, fileSize=10309532, number=1, id=02AV46G62B30H, seriesId=02AV46G66B9VF, libraryId=02AV46FH6BCM3, createdDate=2020-08-20T15:03:38, lastModifiedDate=2020-08-20T15:03:38.118)
2020-08-20 15:05:30.259 ERROR 30813 --- [DefaultMessageListenerContainer-1] unknown.jul.logger : TODO
2020-08-20 15:05:32.613 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.komga.domain.service.BookLifecycle : House keeping thumbnails for book: 02AV46G62B30H
2020-08-20 15:05:32.616 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.komga.domain.service.BookLifecycle : Book has bo selected thumbnail, choosing one automatically
2020-08-20 15:05:32.620 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.komga.application.tasks.TaskHandler : Task GenerateBookThumbnail(bookId=02AV46G62B30H) executed in 2.44s
2020-08-20 15:05:32.623 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.komga.application.tasks.TaskHandler : Executing task: RefreshBookMetadata(bookId=02AV46G62B30H)
2020-08-20 15:05:32.625 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.k.domain.service.MetadataLifecycle : Refresh metadata for book: Book(name=Locke & Key - Welcome to Lovecraft v01, url=file:/mnt/storage/media/literature/Comics/Locke%20&%20Key/Locke%20&%20Key%20-%20Welcome%20to%20Lovecraft%20v01.cbz, fileLastModified=2020-03-23T10:36:42, fileSize=10309532, number=1, id=02AV46G62B30H, seriesId=02AV46G66B9VF, libraryId=02AV46FH6BCM3, createdDate=2020-08-20T15:03:38, lastModifiedDate=2020-08-20T15:03:38.118)
2020-08-20 15:05:32.632 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.k.i.m.l.LocalArtworkProvider : Looking for local thumbnails for book: Book(name=Locke & Key - Welcome to Lovecraft v01, url=file:/mnt/storage/media/literature/Comics/Locke%20&%20Key/Locke%20&%20Key%20-%20Welcome%20to%20Lovecraft%20v01.cbz, fileLastModified=2020-03-23T10:36:42, fileSize=10309532, number=1, id=02AV46G62B30H, seriesId=02AV46G66B9VF, libraryId=02AV46FH6BCM3, createdDate=2020-08-20T15:03:38, lastModifiedDate=2020-08-20T15:03:38.118)
2020-08-20 15:05:32.632 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.k.application.tasks.TaskReceiver : Sending task: RefreshSeriesMetadata(seriesId=02AV46G66B9VF)
2020-08-20 15:05:32.633 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.komga.application.tasks.TaskHandler : Task RefreshBookMetadata(bookId=02AV46G62B30H) executed in 9.73ms
2020-08-20 15:05:32.638 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.komga.application.tasks.TaskHandler : Executing task: RefreshSeriesMetadata(seriesId=02AV46G66B9VF)
2020-08-20 15:05:32.639 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.k.domain.service.MetadataLifecycle : Refresh metadata for series: Series(name=Locke & Key, url=file:/mnt/storage/media/literature/Comics/Locke%20&%20Key/, fileLastModified=2020-08-20T15:01:09.700, id=02AV46G66B9VF, libraryId=02AV46FH6BCM3, createdDate=2020-08-20T15:03:38, lastModifiedDate=2020-08-20T15:03:38)
2020-08-20 15:05:32.663 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.k.i.m.l.LocalArtworkProvider : Looking for local thumbnails for series: Series(name=Locke & Key, url=file:/mnt/storage/media/literature/Comics/Locke%20&%20Key/, fileLastModified=2020-08-20T15:01:09.700, id=02AV46G66B9VF, libraryId=02AV46FH6BCM3, createdDate=2020-08-20T15:03:38, lastModifiedDate=2020-08-20T15:03:38)
2020-08-20 15:05:32.664 INFO 30813 --- [DefaultMessageListenerContainer-1] o.g.komga.application.tasks.TaskHandler : Task RefreshSeriesMetadata(seriesId=02AV46G66B9VF) executed in 25.6ms
```

gotson commented 4 years ago

Komga doesn't just count pages during the analysis; it does much more than that.

The analysis for PDF is faster because the pages are always rendered as JPEG at the same dimensions, so Komga doesn't need to inspect each page to detect its media type and size.

gotson commented 4 years ago

The message ERROR 30813 --- [DefaultMessageListenerContainer-1] unknown.jul.logger: TODO is just a log from the webp library.

The webp library used in Komga is still quite beta and not very fast, which would explain why your book takes so long to analyze.

Analysis time will also depend on your hardware (CPU) and on the disk or network speed when accessing the books.

ghost commented 4 years ago

@gotson It only took Komga a fraction of the time to scan the identical files in previous versions, though. Just for reference, this is a colocated server with NVMe SSDs, two 12c/24t CPUs and 128GB of RAM, so I don't think a lack of processing power is causing Komga to take 20 minutes to scan a single file.

gotson commented 4 years ago

It's doing more in recent versions than it did before.

Where are your files located and how are they accessed?

ghost commented 4 years ago

@gotson The files are stored on the same server; they're not being accessed via NFS or another network file system. I run Komga as a regular systemd service, so it's accessing the files like any other application would.

Are you able to reproduce the issue with my test file? Does it also take several minutes for Komga to finish scanning it on your machine?

I have about 2,200 files in my collection, most of which are tankobons. Based on my tests, it takes Komga around three seconds to process one page. Assuming an average page count of 100 pages, that would work out to an initial scan time of 7 days, 15 hours and 20 minutes, not including thumbnail generation.
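For anyone who wants to plug in their own numbers, the estimate above can be reproduced with a few lines of Bash. This is only a back-of-the-envelope sketch: the three inputs are the figures quoted in this comment, and the 100-page average is an assumption, not a measurement.

```bash
#!/bin/bash
# Rough scan-time estimate from the figures quoted in the comment above.
books=2200            # files in the collection
pages_per_book=100    # assumed average page count
secs_per_page=3       # observed analysis time per page

total=$(( books * pages_per_book * secs_per_page ))
days=$(( total / 86400 ))
hours=$(( total % 86400 / 3600 ))
minutes=$(( total % 3600 / 60 ))

echo "${total}s = ${days}d ${hours}h ${minutes}m"
# prints: 660000s = 7d 15h 20m
```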

Even taking the fact that the "AnalyzeBook" task does more than it did in previous versions into account, that figure seems extremely high to me. Again, just for reference, the initial library scan in previous versions only took a few hours to finish and that's including the thumbnail generation process (which accounted for the vast majority of that time).

gotson commented 4 years ago

I'm on my phone so cannot test now. A cbz with 100 pages is analyzed in 5s on my system.

gotson commented 4 years ago

I managed to reproduce it; it only happens with webp images. The latest versions of Komga perform an image analysis on each image to get the dimensions. This takes around 3s per image on my system for a webp.

There are no other webp libraries for Java. There's an open issue at TwelveMonkeys (a Java image library) to add a proper one, but I discussed it with the author and the performance would most likely be similar.

Another solution would be to use native libraries, and to that end I did submit a PR some time ago on a project to properly package native libraries for multiple architectures, but it's still pending (https://github.com/sejda-pdf/webp-imageio/pull/6).

To come back to your issue, it's only gonna impact webp files. I don't know how many you have; I found them to be pretty rare in western comics and European BDs, and only saw a few for mangas.

ghost commented 4 years ago

@gotson Thanks for the explanation. I sadly converted my entire library to WEBP months ago due to the significant space savings and improved page loading times compared to JPEG. Oh well, at least I have confirmation that this is expected behavior now. 😅

gotson commented 4 years ago

It is really not wise to convert from a lossy format like JPEG to another format. You should only ever convert from lossless to lossy, never from lossy to anything else!

ghost commented 4 years ago

@gotson I would agree if we were talking about music, but lossless sources for almost all other types of media simply don't exist. I personally think the almost imperceptible quality loss is well worth the significant space and bandwidth savings.

gotson commented 4 years ago

I think you are wrong, but you do what you want.

ghost commented 4 years ago

@gotson I didn't say that lossless image formats don't exist, I said that lossless sources don't exist for almost any type of media other than music. A lot of people rip CDs directly to FLAC files and release those on the Internet, but not even official digital manga publishers like Viz release chapters or volumes in a lossless image format.

gotson commented 4 years ago

A scanner gives you a RAW. But that's not the debate here.

blkjack410 commented 4 years ago

I just encountered this recently as well, where WebP pages take much longer to process. If the library is the issue, would it be possible to brute force the problem with really crude multi-threading? I have no Java/Kotlin experience, but maybe you could give each page to a different core, or even a different book to each core? My unraid server has 16 cores but there's only one core struggling with the webp.

This may be more trouble than it's worth but it could help for people with a lot of webp books if it's doable.

gotson commented 4 years ago

The current library is the issue, but I plan to switch to a native library once https://github.com/sejda-pdf/webp-imageio/pull/6 is merged. The current version doesn't support all the architectures that Komga supports.

I just did some tests with a 236-page book with webp images; the native library performs the analysis in 4s.
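To put that number in perspective, here is a rough comparison in Bash. It assumes the roughly 3s per image reported earlier in this thread for the Java decoder would also apply to this 236-page book, which was not measured directly, so treat the result as an order of magnitude only.

```bash
#!/bin/bash
# Rough speedup estimate; the per-page figure for the Java decoder is an
# assumption carried over from an earlier comment, not a measurement.
pages=236
slow_secs_per_page=3     # Java decoder, approximate
native_total_secs=4      # native library, as reported above

slow_total_secs=$(( pages * slow_secs_per_page ))
speedup=$(( slow_total_secs / native_total_secs ))

echo "Java decoder: ~${slow_total_secs}s, native: ${native_total_secs}s (~${speedup}x faster)"
# prints: Java decoder: ~708s, native: 4s (~177x faster)
```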

mihailim commented 4 years ago

@whalehub @blkjack410 - until the upstream webp-imageio issue is resolved, here are some summary instructions on how to construct a Komga JAR with the fast decoder library included.

Three massive caveats, though:

To modify the komga JAR:

  1. Fetch the proper decoding lib from https://search.maven.org/ -- to be more specific, search for a:webp-imageio there, i.e. https://search.maven.org/search?q=a:webp-imageio . In this case, it's currently webp-imageio-0.1.6.jar . Click the download icon on the right (kind of hard to see.)
  2. Fetch the komga JAR into a new directory.
  3. Unzip the komga JAR.
  4. Copy the decoding lib JAR you fetched in step 1 to BOOT-INF/lib/
  5. Remove the file BOOT-INF/lib/webp-imageio-decoder-plugin-0.2.jar
  6. Edit BOOT-INF/classpath.idx with the text editor of your choice, search for "webp-imageio", and replace the line - "webp-imageio-decoder-plugin-0.2.jar" with - "webp-imageio-0.1.6.jar". Save it, and remove any backup file your text editor might have created there.
  7. Zip the jar back together. Add the archive members in this exact order: META-INF first, then org and BOOT-INF. Note that you must use the STORE method for everything in the new JAR archive instead of compressing the files, otherwise the JRE will crash with something along the lines of "java.lang.IllegalStateException: Failed to get nested archive for entry BOOT-INF/". For example: zip -Z store -r komga-0.62.5-MODIFIED.jar META-INF org BOOT-INF
  8. Use this JAR instead. Woohoo, speeed.
ghost commented 4 years ago

@mihailim I could kiss you right now. Komga is analyzing my library at breakneck speeds with this build. 😁

@blkjack410 Here's a little Bash script I wrote to automate the build process for new Komga releases.

It is to be executed like this: sudo ./build-komga.sh 0.62.5

#!/bin/bash
# Usage: sudo ./build-komga.sh <komga-version>   (e.g. 0.62.5)
# Downloads the release JAR, swaps the WebP decoder library and repacks
# the archive uncompressed (-0), as required for the nested JARs to load.

VERSION="$1"
BUILD_PATH="/tmp/komga-build"

# The URLs are quoted so the shell never treats "?" as a glob character.
mkdir "${BUILD_PATH:?}" &&
  cd "${BUILD_PATH:?}" &&
  curl -sSL -o komga-tmp.jar "https://github.com/gotson/komga/releases/download/v${VERSION:?}/komga-${VERSION:?}.jar" &&
  unzip -q komga-tmp.jar &&
  curl -sSL -o BOOT-INF/lib/webp-imageio-0.1.6.jar "https://search.maven.org/remotecontent?filepath=org/sejda/imageio/webp-imageio/0.1.6/webp-imageio-0.1.6.jar" &&
  rm BOOT-INF/lib/webp-imageio-decoder-plugin-0.2.jar &&
  sed -i 's|webp-imageio-decoder-plugin-0.2.jar|webp-imageio-0.1.6.jar|g' BOOT-INF/classpath.idx &&
  zip -q -0 -r komga.jar META-INF org BOOT-INF &&
  mv komga.jar /usr/local/bin/komga.jar &&
  rm -rf "${BUILD_PATH:?}"
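
If you want to see what the classpath.idx substitution does before running the full build, the sketch below applies the same replacement to a throwaway sample file. The helper name `swap_webp_entry` and the extra sample entry are made up for illustration; the entry format is the one described in step 6 of the manual instructions. It avoids `sed -i` so it also runs on systems whose sed does not accept GNU's in-place syntax.

```bash
#!/bin/bash
# swap_webp_entry FILE: rewrite the WebP decoder entry in a classpath.idx-style
# file and print the resulting webp-imageio line(s). Hypothetical helper name.
swap_webp_entry() {
  local idx="$1" tmp
  tmp="$(mktemp)" || return 1
  sed 's|webp-imageio-decoder-plugin-0\.2\.jar|webp-imageio-0.1.6.jar|g' "$idx" > "$tmp" &&
    cat "$tmp" > "$idx" &&
    rm -f "$tmp" &&
    grep 'webp-imageio' "$idx"
}

# Demo on a temporary sample file; "some-other-lib-1.0.jar" is a made-up entry.
sample="$(mktemp)"
printf '%s\n' '- "some-other-lib-1.0.jar"' '- "webp-imageio-decoder-plugin-0.2.jar"' > "$sample"
result="$(swap_webp_entry "$sample")"
echo "$result"
# prints: - "webp-imageio-0.1.6.jar"
rm -f "$sample"
```

Running the helper against the real BOOT-INF/classpath.idx after unzipping should print exactly one line naming webp-imageio-0.1.6.jar; if it prints nothing, the old entry was not found and the file layout may differ from what the instructions assume.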
Kiwi-13-plo commented 4 years ago

@mihailim Thanks soooo much!! It's so fast now and I finally can upgrade komga to the last version.

chelming commented 3 years ago

Any idea on how to do this for the docker container? I tried tossing webp-imageio-0.1.6.jar in app/BOOT-INF/lib, modifying classpath.idx, and restarting the container but it doesn't seem to have made a difference. Perhaps because I wasn't able to remove webp-imageio-decoder-plugin-0.2.jar?

ghost commented 3 years ago

@cwhits Since containers are ephemeral by nature, all changes that are made inside of them while they are running are discarded when they are restarted. To make use of the faster WEBP library, you have to either build a custom Docker image or map the modified JAR file from the host to the container using Docker's volume feature.

chelming commented 3 years ago

@whalehub

> @cwhits Since containers are ephemeral by nature, all changes that are made inside of them while they are running are discarded when they are restarted.

that's only true for recreating a container, not restarting a container. regardless of how many times I restart the container, any changes I made will remain until I recreate the container.

it also doesn't appear that the jar exists inside the container, or, if it does it's the unzipped version of the jar. I don't know anything about Java so I'm at a loss there. I guess I could map BOOT-INF, META-INF, and org as volumes but that seems a lil crazy.

chelming commented 3 years ago

well it's gross but it worked. unzipped the jar (added/removed the webp libraries, edited the classpath) and mounted it to /app. 🤷‍♀️

gotson commented 3 years ago

For Docker you can leverage bind mounts to change only a single file.

In your docker-compose.yml add the following in your volumes:

- type: bind
  source: /path/to/this/webp-imageio-0.1.6.jar
  target: /app/BOOT-INF/lib/webp-imageio-decoder-plugin-0.2.jar
  read_only: true

Recreate the container. That's it!

ghost commented 3 years ago

@blkjack410 @mihailim @Kiwi-13-plo @cwhits Just wanted to share a discovery I made earlier today with my fellow WEBP adopters. I always took it for granted that WEBP would be universally better at compressing manga/comics compared to the alternatives, but it turns out that it depends largely on the encoder you use.

I've been using WEBP with ImageMagick @ 80% for my conversions up until now and tried out the JPG encoder MozJPEG @ 80% today, only to discover that the output images produced by the latter are not only smaller, but also either identical or higher in quality compared to their WEBP counterparts.

YMMV, but check out the results of a test I did using a very detailed and high quality page from Berserk.

JPG (Original): 7.59MB WEBP (ImageMagick): 4.65MB JPG (MozJPEG): 3.78MB

Original vs WEBP Original vs MozJPEG

I'm probably gonna redownload all my stuff and encode it with MozJPEG this time around because once JPEG XL becomes standardized and widely supported, it will be possible to compress these images a further 20% or so "while allowing the original JPEG to be recovered byte-by-byte" (see brunsli).

mihailim commented 3 years ago

That heavily depends on the input file, @whalehub -- I've run tests over hundreds of heterogeneous images using combinations between MozJPEG, jpegoptim, jpeg-recompress from jpeg-archive (my preferred JPEG tool, which internally uses MozJPEG and manages to keep closer to the original while usually producing smaller images when using MS-SSIM or SmallFry), libwebp etc. Sometimes one wins over the other, but it's hard to compare apples to apples, since the quality metric does not necessarily mean the same thing for all of them. In my experience, webp@q80 still beats jpeg-recompress@q80 by 18-25% overall in file size while retaining better color fidelity.

That being said, it still makes the best sense to keep around the original files as long as enough space is available. I use webp for remote disaster recovery backups, where using webp versus jpg means I can cut my storage costs in half; I wouldn't even consider going a lossy-to-lossy re-encoding route if I didn't archive the original files.

gotson commented 3 years ago

Since the author of https://github.com/sejda-pdf/webp-imageio did not reply to my PR, I have forked the repo and published my own version onto JCenter.

Komga will use the native library if possible, and fall back on the (slower) Java implementation if the native library is not available or cannot be loaded.

Currently the following OS/Arch are supported:

gotson commented 3 years ago

:tada: This issue has been resolved in version 0.64.2 :tada:

The release is available on:

Your semantic-release bot :package::rocket: