szeweq / mc-repack

A Minecraft mod repacking tool to optimize size and loading speed of mods.
https://szeweq.xyz/mc-repack
MIT License

Compression Improvements #2

Open solonovamax opened 7 months ago

solonovamax commented 7 months ago

Hi, just found your project and this looks rather interesting.

Here's a couple of improvements/changes that would be amazing if they could be added:

Based on a quick little command I ran, I recorded the frequencies of different file types present in jar files. The command used was:

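mkdir ./tmp && cd ./tmp
# copy a bunch of jars into this temporary directory
for i in *.jar; do
    # extract each jar into its own directory so nested jars are visible to find below
    unzip -q "$i" -d "${i%.jar}"
done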

find . -type f \( -iname '*.jar' -o -iname '*.zip' \) -print0 \
    | xargs -0 -n1 unzip -qqql \
    | perl -0777 -C -pe 's/.*?\/?(.*\.(.*))?/\2/g' \
    | sed '/^$/d' \
    | sort \
    | uniq -c \
    | sort -n \
    | awk '{s+=$1; print $0} END {print s}'
# note: the last line is the total count of all files. this isn't by any means perfect, but whatever.

Here are the results from that:

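| % of files in jar | File extension |
|-------------------|----------------|
| 51.13% | .class |
| 31.34% | .json |
| 10.01% | .png |
| 2.5% | .nbt |
| 1.69% | .ogg |
| 0.51% | .MF |
| 0.4% | .jar |
| 0.28% | .mcmeta |
| 0.23% | .xdelta (I have no clue what this file format is) |
| 0.23% | .md5 |
| 0.21% | .properties |
| 0.14% | .at |
| 0.1% | .txt |
| 0.1% | .accesswidener |
| 0.09% | .xml |
| 0.07% | .md |
| 0.06% | .js |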

Based on this, I think it might be reasonable to consider adding optimization processes for the following files:

It's probably not worth it to consider additional compressors for files not in that table as they appear so infrequently it just won't make much of a difference.

Do note, however, that this is based on the number of files in the jars, not on their size. I may make something basic to calculate this using the size later.
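A size-based tally like the one mentioned above could look roughly like the following. This is only a minimal sketch in Python using the standard `zipfile` module; the function name `tally_extensions` and the demo jar are made up for illustration and are not part of mc-repack or of the command above. It counts both the number of files and the uncompressed bytes per extension.

```python
import io
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def tally_extensions(jar_bytes: bytes):
    """Return (counts, sizes) keyed by file extension for one jar/zip."""
    counts, sizes = Counter(), Counter()
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        for info in jar.infolist():
            if info.is_dir():
                continue  # directory entries carry no data
            ext = PurePosixPath(info.filename).suffix or "(none)"
            counts[ext] += 1
            sizes[ext] += info.file_size  # uncompressed size in bytes
    return counts, sizes


def make_demo_jar() -> bytes:
    """Build a tiny in-memory jar (hypothetical contents) for the demo."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as jar:
        jar.writestr("a/B.class", b"x" * 10)
        jar.writestr("a/C.class", b"x" * 20)
        jar.writestr("data/big.json", b"{}" * 500)
    return buf.getvalue()


counts, sizes = tally_extensions(make_demo_jar())
# Print extensions ordered by total uncompressed size, largest first.
for ext in sorted(sizes, key=sizes.get, reverse=True):
    print(f"{ext}: {counts[ext]} files, {sizes[ext]} bytes")
```

In the demo, `.class` wins by count but `.json` wins by size, which is exactly the kind of difference a count-only table hides.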

solonovamax commented 7 months ago

To elaborate on these two points:

  • Explode nested jars and recursively repack them. However, in those nested jars only ever STORE files, then add DEFLATE compression at the final level.
  • Sort files by their extension. When files of the same type are beside each other in a nested jar file, they can be compressed better. (Based on some rudimentary testing, this, combined with the previous item, can drastically improve the compression at times.)

here are a few jars which had significant gains from this (using Detonater) (this was after having been processed with mc-repack):

| Mod | Version | % saved | Size after mc-repack | Size after detonater |
|-----|---------|---------|----------------------|----------------------|
| Fabric Language Kotlin | 1.10.10+kotlin.1.9.10 | 34.83% | 6.40M | 4.17M |
| Farsight | 1.20.1-4.1 | 33.84% | 361.15K | 238.92K |
| Boat Item View | 1.20.1-0.0.5 | 33.57% | 997.18K | 662.36K |
| Quartz Elevator | 2.2.5+1.20 | 32.78% | 1019.87K | 685.46K |
| Appleskin | mc1.20.1-2.5.1 | 32.35% | 1.02M | 699.90K |
| Cardinal Components API | 5.2.2 | 31.71% | 215.76K | 147.34K |
| Dawn | 5.0.0 | 29.11% | 1.34M | 966.34K |
| Fabric API | 0.90.7+1.20.1 | 28.13% | 1.96M | 1.41M |
| BeaconOverhaul | 1.8.4+1.20 | 27.11% | 355.19K | 258.89K |
| Blur | 3.1.0 | 26.30% | 155.12K | 114.32K |
| Highlight | 1.20-2.0.1 | 25.92% | 274.70K | 203.48K |
| Graves | 3.0.0+1.20.1 | 25.80% | 1.76M | 1.31M |
| Create | 0.5.1-d+mc1.20.1 (Prominent fork) | 9.13% | 22.01M | 20.00M |
| LibZ | 1.0.2 | 21.56% | 2.00M | 1.57M |
| Industrial Revolution | 1.16.5-BETA | 7.73% | 4.65M | 4.29M |
| CC Tweaked | 1.20.1-fabric-1.108.3 | 10.85% | 3.14M | 2.80M |
| Tom's Simple Storage | 1.20-1.6.5 | 23.58% | 1.38M | 1.06M |
| Zenith Attributes | 0.0.6 | 24.44% | 6.61M | 4.99M |

so, there are definitely significant savings to be had here by doing this. and, I didn't even run this for all my mods, just a smaller subset, as detonater is kinda slow lol

the mods that primarily benefit from this change are the ones which bundle many libraries in them.
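The "sort files by their extension" idea from the list above can be sketched in a few lines. This is an illustration only, written in Python with the standard `zipfile` module; it is not how mc-repack or Detonater implement it, and the names `extension_sort_key` and `sort_jar_by_extension` are hypothetical. Keeping `META-INF/MANIFEST.MF` first is a precaution added in this sketch (`java.util.jar.JarInputStream` only looks for the manifest at the start of the archive), not something discussed in the thread.

```python
import io
import zipfile
from pathlib import PurePosixPath

MANIFEST = "META-INF/MANIFEST.MF"


def extension_sort_key(name: str):
    """Group entries by extension, then by path; the manifest stays first."""
    if name == MANIFEST:
        return (0, "", name)
    return (1, PurePosixPath(name).suffix.lower(), name)


def sort_jar_by_extension(jar_bytes: bytes) -> bytes:
    """Rewrite a jar so entries with the same extension sit next to each other."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as src, \
         zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for name in sorted(src.namelist(), key=extension_sort_key):
            # Entries are re-created by name, so original timestamps are not kept.
            dst.writestr(name, src.read(name))
    return out.getvalue()


def make_demo_jar() -> bytes:
    """Build a small in-memory jar with entries in a mixed order."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as jar:
        for name in ["b.png", "a/X.class", MANIFEST, "a.json", "a/Y.class", "c.png"]:
            jar.writestr(name, b"demo")
    return buf.getvalue()


with zipfile.ZipFile(io.BytesIO(sort_jar_by_extension(make_demo_jar()))) as result:
    sorted_names = result.namelist()
print(sorted_names)
```

Sorting on its own does not change what a single DEFLATE entry contains. The benefit only shows up when the sorted entries are STOREd inside a nested jar that the outer level then compresses as one stream.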

szeweq commented 7 months ago

Thanks for the details! Your proposal will definitely help reduce mod(pack) sizes.

I will work on optionally skipping directory entries in ZIP/JAR files. Removing them will already make the mods smaller. This would make things simpler because the library (mc-repack-core) also works with a file system, where directories must be created before saving a minified file.
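For illustration, skipping directory entries while copying an archive could look like the sketch below. It is written in Python with the standard `zipfile` module and is not taken from mc-repack-core; the function name `strip_directory_entries` is hypothetical. It relies on the fact that file paths inside a ZIP already carry their full directory prefix, so readers can recreate directories without explicit entries.

```python
import io
import zipfile


def strip_directory_entries(jar_bytes: bytes) -> bytes:
    """Copy a jar, leaving out explicit directory entries such as 'assets/'."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as src, \
         zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            if info.is_dir():
                continue  # the path of each file already implies its directories
            dst.writestr(info.filename, src.read(info))
    return out.getvalue()


# Demo jar with two directory entries and one file (hypothetical contents).
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as jar:
    jar.writestr("assets/", b"")
    jar.writestr("assets/demo/", b"")
    jar.writestr("assets/demo/a.json", b'{"k": 1}')

stripped = strip_directory_entries(demo.getvalue())
with zipfile.ZipFile(io.BytesIO(stripped)) as result:
    remaining = result.namelist()
    payload = result.read("assets/demo/a.json")
print(remaining)
```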

The new file types you mentioned can be easily added for minification or recompression. I was testing mainly on Forge mods, so I may have overlooked file types like .at, .accesswidener or .md. There are a lot of mods to check, and I must determine the most used file formats to be supported.

The jar-in-jar repacking situation is very tricky. It may need a completely new minifier with customizable options. I will try to make it possible.

New separate issues will be made. Thanks again for using MC-Repack!

solonovamax commented 7 months ago

The new file types you mentioned can be easily added for minification or recompression. I was testing mainly on Forge mods, so I may have overlooked file types like .at, .accesswidener or .md. There are a lot of mods to check, and I must determine the most used file formats to be supported.

the mods I used to generate that list were mainly fabric mods, as I just used the modpacks on my laptop. I have a few more instances on my desktop, so I can re-run it on there when I get back home.


also, I don't think that .jpeg/.jpg, .webp, or .avif files are particularly common in mods, as they mostly stick to pngs, however you could possibly add a compressor for those, doing smth similar to what rimage is doing. Assuming it's not a large amount of work and doesn't bloat the binary too much. tbh, if it's anything more than like 30 mins or anything past like 500kb-1mb, it's not worth it imo lol (going from 1.9mb to like 5mb for some compressors that will be rarely used isn't particularly worth it, tbh. just useless bloat that almost never gets run)


The jar-in-jar repacking situation is very tricky. It may need a completely new minifier with customizable options. I will try to make it possible.

you could do smth similar to the following: have a method for jars called smth like repack_jar, which takes an optional parameter, deflate, defaulting to true. then, if you're already in a jar, just invoke it with false lol. that's similar to what detonater did (when not in a jar, when in a jar), though unsure how feasible this is given your code structure. may need to pass some kind of a context around indicating if you're in a jar or in an fs.

but for the sorting, you could just sort all of them.
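The `repack_jar` idea above, with a `deflate` flag that defaults to true and is passed as false once you are already inside a jar, can be sketched as follows. This is a rough Python illustration using the standard `zipfile` module, not mc-repack's or Detonater's actual code; only the name `repack_jar` and the `deflate` parameter come from the comment, and everything else (the `make_jar` helper, the demo contents) is made up. Sorting and error handling for entries that are named `.jar` but are not valid archives are left out.

```python
import io
import zipfile


def repack_jar(jar_bytes: bytes, deflate: bool = True) -> bytes:
    """Rewrite a jar; nested jars are always rewritten with deflate=False."""
    method = zipfile.ZIP_DEFLATED if deflate else zipfile.ZIP_STORED
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as src, \
         zipfile.ZipFile(out, "w", method) as dst:
        for info in src.infolist():
            data = src.read(info)
            if info.filename.lower().endswith(".jar"):
                # Already inside a jar: only STORE here, the top level compresses.
                data = repack_jar(data, deflate=False)
            dst.writestr(info.filename, data)
    return out.getvalue()


def make_jar(entries: dict, method: int = zipfile.ZIP_DEFLATED) -> bytes:
    """Helper for the demo: build an in-memory jar from {name: bytes}."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", method) as jar:
        for name, data in entries.items():
            jar.writestr(name, data)
    return buf.getvalue()


class_bytes = b"\xca\xfe\xba\xbe" * 64
inner = make_jar({"lib/A.class": class_bytes, "lib/B.class": class_bytes})
outer = make_jar({"fabric.mod.json": b'{"id": "demo"}', "META-INF/jars/lib.jar": inner})

repacked = repack_jar(outer)
with zipfile.ZipFile(io.BytesIO(repacked)) as top:
    outer_methods = {i.filename: i.compress_type for i in top.infolist()}
    with zipfile.ZipFile(io.BytesIO(top.read("META-INF/jars/lib.jar"))) as nested:
        inner_methods = {i.filename: i.compress_type for i in nested.infolist()}
        inner_data = nested.read("lib/A.class")
print(outer_methods, inner_methods)
```

Because the flag is only ever passed down as false, jars nested several levels deep are also stored, and the single DEFLATE pass at the top level sees all of their raw contents.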


Also, I'm interested in using this as part of a gradle plugin I'm making (I was originally planning to do all these things myself, and was looking at making an oxipng kotlin/java jni wrapper, but then I found this and this is honestly basically what I need). Would you be willing to work with me to make a jni wrapper for this (not asking you to do it all on your own lol, bc that's cringe), so that instead of having to bundle the cli & invoke the cli, I could just use it via jni?

solonovamax commented 7 months ago

yo, unsure if you saw this or not bc it was a weekend, so imma bump it lol

szeweq commented 7 months ago

Sorry for the late response.

I would rather not make a library for a specific environment. I will keep maintaining CLI and core library. There is another project I am working on and mc-repack may not receive a release for a couple of weeks.

solonovamax commented 7 months ago

Sorry for the late response.

I would rather not make a library for a specific environment. I will keep maintaining CLI and core library. There is another project I am working on and mc-repack may not receive a release for a couple of weeks.

wdym a library for a specific environment?

also, no worries, take your time 👍

szeweq commented 7 months ago

I meant JNI "glue" for JVM.

solonovamax commented 7 months ago

I meant JNI "glue" for JVM.

yeah, I just mainly want to make a jni interface bc it's a lot more convenient to have this applied as a gradle plugin rather than a cli library