Closed tigt closed 9 years ago
Excellent question! It is whatever the default level is, but I can’t find that in the Node documentation either. According to the zlib documentation (http://www.zlib.net/manual.html), the default level they use is 6, so if whoever wrote the Node interface did not change it, that should be it.
Are you suggesting 9 based on personal experience? After ten seconds of research, I have found a benchmark showing that at level 9 the compression time gets multiplied by 6 but the output size changes by just a few percentage points (http://tukaani.org/lzma/benchmarks.html).
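To get a feel for the size/time trade-off on your own content, a quick measurement along these lines may help. This is only a sketch: the sample input is made up, and timings will vary by machine and Node version, so treat the numbers as indicative rather than as a benchmark.

```javascript
const zlib = require('zlib');

// Hypothetical sample input: repetitive markup, loosely resembling a built page.
const input = Buffer.from('<li class="item">metalsmith-gzip</li>\n'.repeat(2000));

// Compress the same buffer at three levels, recording output size and elapsed time.
const results = [1, 6, 9].map((level) => {
  const start = process.hrtime();
  const compressed = zlib.gzipSync(input, { level });
  const [sec, nano] = process.hrtime(start);
  return { level, bytes: compressed.length, ms: sec * 1e3 + nano / 1e6 };
});

console.log('input bytes:', input.length);
results.forEach((r) => console.log(r));
```

Running it against a few real files from your site should show whether level 9 buys enough to justify the extra build time.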
Since somebody else probably wants to tweak compression levels, I would suggest we choose a sensible default and let people override it. Most of the work is switching to the Gzip class and using streams directly. Would you like to write the patch yourself (and enter the illustrious hall of fame of metalsmith-gzip contributors, enhancing your CV in the eyes of every Fortune 500 HR department out there)?
P.S. I like the goblin wearing pants to fit into society
If I were anywhere close to being a good enough programmer, I'd try it, but I finally understood object-oriented like, yesterday. I could try, but don't hold your breath on me accomplishing anything.
Code could be shared from beatgammit/gzip-js or jstuckey/gulp-gzip maybe? The official Node zlib documentation does have an `options` object which allegedly takes `level`, but doesn't seem keen on an example.
Also, yeah; level 9 compared to even level 8 is a big increase in encode time, but I'm not sure if that's a dealbreaker for me if I get to multiply those tiny savings across viewers. (NearlyFreeSpeech.NET & not much budget.)
Also, thanks! I didn't think people would be looking at that URL; I really should finish that theme so it doesn't look broken.
Ok, I thought you were a developer (at least you’re technical enough to care about compression levels). The best thing is to make them configurable. Looking at the Node documentation, we need to create a `Gzip` object with `zlib.createGzip(options)` and then use that to perform the compression.
The problem is that it seems to work with streams, and in metalsmith we have the file contents as a buffer. https://github.com/beatgammit/gzip-js does not help because it reimplements the gzip algorithm from scratch in pure JS. It should not be exceedingly difficult to implement the changes. I can do it myself, but if you are learning and feel tempted to try, I can review your attempt and guide you.
Well, I'm terrified, but maybe that's a good sign. I have a fair understanding of JavaScript in-browser, but I'm pretty unfamiliar with the Node environment.
If I understand your suggestion, you mean something like:
var Gzip = zlib.createGzip(options);
Where `options` is an object defaulting to:
var options = {
  flush: zlib.Z_NO_FLUSH,
  chunkSize: 16 * 1024,
  windowBits: 15, /* not sure about this one */
  level: 6,
  memLevel: 8,
  strategy: zlib.Z_DEFAULT_STRATEGY
};
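One common way to combine defaults like these with whatever the user passes in is to merge the two objects, letting user values win. The sketch below assumes a hypothetical helper named `resolveGzipOptions` and a trimmed-down defaults object; neither comes from metalsmith-gzip itself.

```javascript
const zlib = require('zlib');

// Assumed defaults for illustration; only a subset of the fields listed above.
const DEFAULT_GZIP_OPTIONS = { level: 6, memLevel: 8 };

// Hypothetical helper: user-supplied fields override the defaults.
function resolveGzipOptions(userOptions) {
  return Object.assign({}, DEFAULT_GZIP_OPTIONS, userOptions || {});
}

const opts = resolveGzipOptions({ level: 9 });
const compressed = zlib.gzipSync(Buffer.from('hello hello hello'), opts);
console.log(opts, compressed.length);
```

Copying into a fresh object keeps the defaults untouched, so one caller's overrides cannot leak into the next call.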
I looked up streams vs. buffers and it seems like streams are what node.js uses if left to its own devices (some sort of string that doesn't mind having multiple things done to it at once), and buffers are when you specify it's some glob of binary data. The documentation mentions `zlib.gzip(buf, callback)` where I can only assume `buf` is the location of said buffer, and `callback` is some function that handles whatever this method returns. (Like, if `zlib.gzip` returned `true` it would log a happy message, if something went wrong it would return an error object?)
I'm currently reading over your existing code and trying to figure out where to integrate this, but this is going to take some cross-referencing before I have an idea (for starters, I had to look up `!!whatever` just now). It looks like maybe I would just attach some of the configurables to the existing `options` object? Is that a separate file?
Hey! Great that you’re taking up the challenge! I seriously expected you would tell me to do it myself. You’re correct about using `var Gzip = zlib.createGzip(options)`. From the documentation, it looks like we should then obtain a stream representing the file to compress and do `stream.pipe(Gzip).pipe(out)` for a suitable value of `out`.
This should replace the call to `zlib.gzip(data.contents, function(err, buffer))`. You’re correct about `callback`; but what `gzip` ‘returns’ is the compressed output, which the callback accesses in `buffer`.
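To make the callback convention concrete, here is a minimal sketch of the buffer-based call. The `contents` buffer and the `logResult` name are made up for illustration; the `(err, buffer)` argument order is the usual Node callback style.

```javascript
const zlib = require('zlib');

// Stand-in for data.contents, which metalsmith would normally provide.
const contents = Buffer.from('Hello, metalsmith! '.repeat(100));

// Node-style callback: the first argument is an error (or null),
// the second is the compressed output as a Buffer.
function logResult(err, buffer) {
  if (err) {
    console.error('gzip failed:', err.message);
    return;
  }
  console.log('compressed', contents.length, 'bytes down to', buffer.length);
}

zlib.gzip(contents, logResult);
```

Current Node.js versions also accept an options object as an optional second argument, as in `zlib.gzip(contents, { level: 9 }, logResult)`; whether that was available in the Node release this thread targeted is not something the thread establishes.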
The user defines the contents of the `options` object; they are not configured inside the plugin. There is no way to enforce at the language level that the options have certain fields defined, so what authors do is run checks before trying to access a certain property. Here we have to
Writing a plugin can be confusing because the framework (metalsmith) is doing most of the work. They say ‘you call a library, a framework calls you’. So you have function arguments like `options`, `file`, `metalsmith`, `done` that seem to come from nowhere. But in fact they have been created by metalsmith, so you need to look at the metalsmith documentation to see what they contain.
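A stripped-down plugin shows where those arguments come from. This is a hypothetical plugin written for illustration, not the metalsmith-gzip source; it assumes Metalsmith's documented plugin shape, where the outer function receives the user's options and the inner function is called by Metalsmith with a `files` object, the Metalsmith instance, and a `done` callback. The `suffix` option is invented.

```javascript
// Hypothetical plugin: copies every file under a second name.
function examplePlugin(options) {
  options = options || {};
  const suffix = options.suffix || '.bak'; // invented option, checked before use

  // Metalsmith, not the plugin author, calls this and supplies all three arguments.
  return function (files, metalsmith, done) {
    Object.keys(files).forEach((name) => {
      files[name + suffix] = { contents: Buffer.from(files[name].contents) };
    });
    done(); // tell Metalsmith this plugin has finished
  };
}

// Simulating the call Metalsmith would make during a build.
const files = { 'index.html': { contents: Buffer.from('<h1>hi</h1>') } };
examplePlugin({ suffix: '.copy' })(files, null, () => {
  console.log(Object.keys(files));
});
```

Calling the inner function by hand like this is also a cheap way to test a plugin without running a full build.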
The problem with streams vs buffers is that you cannot pass a buffer to a function requiring a stream, and all that metalsmith is handing down to us is a buffer. So we need to somehow get a stream from the buffer, then turn the compressed stream back into a buffer (so that metalsmith and the other plugins down the chain can do something with it). You could look at the code inside zlib (since zlib.gzip is probably doing just that conversion) or use the second answer here: http://stackoverflow.com/questions/16038705/how-to-wrap-a-buffer-as-a-stream2-readable-stream
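One simple way to wrap a buffer as a readable stream is to push it through a `PassThrough` stream from Node's `stream` module. This is a sketch of that idea, not necessarily what the linked answer or the plugin does; `bufferToStream` is a made-up helper name.

```javascript
const { PassThrough } = require('stream');

// Hypothetical helper: returns a stream that emits the buffer and then ends.
function bufferToStream(buffer) {
  const stream = new PassThrough();
  stream.end(buffer); // write the whole buffer, then signal end-of-stream
  return stream;
}

const source = bufferToStream(Buffer.from('file contents from metalsmith'));
source.on('data', (chunk) => console.log('chunk of', chunk.length, 'bytes'));
source.on('end', () => console.log('stream finished'));
```

The returned stream can be piped into anything that expects a readable stream, for example `source.pipe(zlib.createGzip())`.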
I suggest you start by replacing line 24 in index.js and try to get the compressed contents into the `buffer` variable by using streams, starting from the `data` buffer.
Nothing to be terrified about! You can’t break anything. As you probably noticed, writing this sort of code is more a question of philology than pure puzzle-solving ability, as you spend most of your time trawling through the documentation. For non-public code, the documentation is often much more lacking; in that case you spend hours inspecting the existing code to extract its intent.
I had been sending the email alerts to spam by accident. Sorry about that.
It turns out most of the suggestions to be found on the web on how to handle streams are probably inaccurate, so I had to rewrite it from scratch. We could have salvaged your original code, but through no fault of yours it was totally the wrong approach.
If I'm seeing it right, the way to convert a stream into a buffer is to have the output stream from the zlib extension write to an array, then when it fires the `end` event you concatenate everything and "clone" it to the object you return to metalsmith?
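The approach described above can be sketched roughly as follows. This is an illustration under assumptions, not the code that shipped in the plugin: `gzipBuffer` is a made-up name, and the input is written straight into the gzip stream with `end()` rather than piped from a separate source stream.

```javascript
const zlib = require('zlib');

// Hypothetical helper: compress a buffer via the stream API and hand back a buffer.
function gzipBuffer(input, options, callback) {
  const gzip = zlib.createGzip(options);
  const chunks = [];

  gzip.on('data', (chunk) => chunks.push(chunk));              // collect output pieces
  gzip.on('end', () => callback(null, Buffer.concat(chunks))); // join them at the end
  gzip.on('error', callback);                                  // report failures Node-style

  gzip.end(input); // feed the whole input, then signal there is no more
}

const original = Buffer.from('some page content '.repeat(500));
gzipBuffer(original, { level: 9 }, (err, compressed) => {
  if (err) throw err;
  console.log(original.length, 'bytes ->', compressed.length, 'bytes');
});
```

Inside a plugin, the callback is where the compressed buffer would be assigned to the file's `contents` before calling `done`.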
Yes. Do you want to try whether it works for you? I think you can use this branch with npm instead of a regular release.
Release 0.3 has been pushed to npm. Compression levels can be set via the `gzip` key in the options object. For example: `compress({ gzip: { level: 6 }})`.
Sorry about my tardiness.
As you can probably see from the comparison above, this definitely ended up being worth it for larger files. Thank you so much for your patience and time! I'll have to come back after a few months of learning JS better and see if I can help contribute to something else of yours.
Excellent! I am glad that it works. I try to test it, but without others confirming with their own setups, it’s hard to know whether things work fine.
You’re welcome to try again, although I am not sure I have that many projects to contribute to.
What gzip compression level does this plugin operate at? I tried checking zlib's page for any information, but I'm not sure I found the right one.
If I'm going to be zipping files only once, I might as well gzip at level 9 for a few extra percentages in savings / decode effort.