jpeddicord / askalono

A tool & library to detect open source licenses from texts
Apache License 2.0

Proposal to use zstd instead of gzip #43

Closed Jake-Shadle closed 5 years ago

Jake-Shadle commented 5 years ago

Before I spend time cleaning up our fork and creating a PR for this, I wanted to check whether this is something you would actually be interested in taking back at all.

Basically, I just want to use zstd compression instead of (or in parallel with?) the current gzip compression. The primary reason is a much smaller cache and a smaller overall binary.

.rw-rw-r--@ 1,422,606 jake 29 May  8:56 embedded-cache.bin.gz
.rw-rw-r--@   967,297 jake 20 Jun  9:22 embedded-cache.bin.zstd

.rwxrwxr-x@  8,034,936 jake 28 May 11:57 askalono
.rwxrwxr-x@  7,130,792 jake 20 Jun  9:22 askalono

There is also a slight (~20ms on my machine) improvement in decompression time, but it's probably not going to be noticeable in practice.
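
To put the listing above in relative terms, here is a minimal sketch that computes the size reduction from the byte counts shown. It assumes the first `askalono` entry is the gzip build and the second is the zstd build, which the dates suggest but the listing does not state outright. The helper name `reduction_percent` is made up for illustration.

```rust
/// Percentage by which `after` is smaller than `before`.
/// Hypothetical helper, not part of askalono.
fn reduction_percent(before: u64, after: u64) -> f64 {
    (before as f64 - after as f64) / before as f64 * 100.0
}

fn main() {
    // Byte counts copied from the listing in this comment.
    let cache = reduction_percent(1_422_606, 967_297);
    // Assumption: 8,034,936 is the gzip build, 7,130,792 the zstd build.
    let binary = reduction_percent(8_034_936, 7_130_792);

    println!("embedded cache: {:.1}% smaller", cache);
    println!("askalono binary: {:.1}% smaller", binary);
}
```

That works out to roughly a 32% smaller embedded cache and roughly an 11% smaller binary.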

jpeddicord commented 5 years ago

Ah! I had almost this exact thought the other day. That would be super cool -- a substantial size reduction and a speedup is a win in my book. Would love to see a PR. :)

Jake-Shadle commented 5 years ago

Cool, good to hear, will hopefully have something next week. Do you want it to support both, or just do a straight replacement?

jpeddicord commented 5 years ago

Replacement is totally fine; no need for both when the format itself is specific to this tool.

Jake-Shadle commented 5 years ago

Awesome, that makes it a lot easier. ☺️

jpeddicord commented 5 years ago

Closing based on your PR (thanks again!)