xiaoyin0208 / lz4

Automatically exported from code.google.com/p/lz4

lz4c has inconsistent command line options #83

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
I apologize if this is a little whiny, but I'd like to have upstream's opinion 
on this documented, regardless of the outcome.

Many ubiquitous compression utilities (gzip, bzip2, xz, lzma, lzop) all 
support a small number of common flags which can be counted on to act the same. 
Examples:

-d decompress
-c write output to stdout
-1..-9 level of compression

This makes for a consistent interface that authors of command line utilities 
can count on, letting them support many flavors of compression by simply 
changing the binary name.
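
As a rough illustration of that point, here is a minimal Python sketch of how a wrapper might build a command line for any tool that honors the three flags listed above. The helper name `build_command` and its parameters are hypothetical and exist only for this example; the function only constructs an argument list and does not run anything.

```python
def build_command(tool, path, decompress=False, to_stdout=False, level=None):
    """Build an argv list for a compressor that follows the common flags.

    Assumes `tool` accepts -d, -c and -1..-9 the way gzip, bzip2, xz,
    lzma and lzop do. Hypothetical helper, not part of any of those tools.
    """
    argv = [tool]
    if decompress:
        argv.append("-d")
    if to_stdout:
        argv.append("-c")
    if level is not None:
        if not 1 <= level <= 9:
            raise ValueError("compression level must be between 1 and 9")
        argv.append(f"-{level}")
    argv.append(path)
    return argv
```

With this shape, `build_command("gzip", "image.img", decompress=True, to_stdout=True)` and the same call with `"xz"` differ only in the first element of the returned list, which is the property being asked for here.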

While lz4c manages to support the simple task of compressing a stream on 
standard input and writing it to standard output without any flags, I suspect 
that this is only by coincidence (as it's simply good UI). Any non-trivial 
compression or decompression operation will fail this test. Please 
consider adopting the above "standard", which has existed for decades. I 
understand that this may be a difficult thing to do for a pre-existing utility.

For some background, I'm the maintainer of Arch Linux's initramfs creation 
utility, mkinitcpio. The project includes a tool called lsinitcpio which needs 
to be able to decompress images. With the release of linux 3.11, I looked into 
adding support for lz4 compression to mkinitcpio and was saddened to find this 
inconsistency in lz4c. Of course, I can code around this with some special 
casing, but I think a consistent interface would be in the best interests of 
shell scripters everywhere.

Original issue reported on code.google.com by d...@falconindy.com on 2 Sep 2013 at 9:28

GoogleCodeExporter commented 8 years ago
This issue is known.
A little bit of history: lz4c is the continuation of lz4demo, which was 
initially created just as a demo of how to use the lz4 algorithm. It wasn't 
expected to become mainstream.
Nonetheless, over time, a few command line arguments got defined.

lz4c is still evolving to support an increasing number of gzip commands.
For example, the next version will support the -f (overwrite output) command. It 
will also support -z to force compression (although, after verification, 
it's not a gzip option, but an lzma one).

lz4c can also most probably be adapted to support the following simple commands:
-L     display software license
-q     suppress all warnings
-V     display version number

Beyond that point, a few commands are difficult to integrate into lz4c without 
breaking compatibility for existing users and scripts. I've already heard that 
"previous users are not really that important". But I feel there is great value 
in preserving behavior for existing users.

Getting into the details, here are the situations where I foresee some 
difficulties:
=> for lz4c, -c0 is fast compression, and -c1 is high compression. These are 
probably the oldest commands defined.
I could support "-c" alone to mean "force output to stdout, even if it is a 
console", and then "-1" to "-9" compression level (keeping in mind lz4 has only 
2 compression levels currently). But the combination of both commands needs to 
support existing -c0 and -c1 settings, which makes them a bit "strange". Maybe 
it's not really an issue for gzip users, since gzip does not support aggregated 
commands, and therefore -c0 and -c1 do not exist for them.

=> lz4c compresses just one stream. It does not compress "several files" or 
"directories", and it stores neither filenames nor timestamps (tar is expected 
to be used for this role). All options related to files, directories and 
timestamps can't be supported (-l, -n, -N).

=> lz4c preserves original file by default. gzip deletes it by default. xz & 
lzma introduce an option '-k' to preserve input files.

=> The -f option is more than "overwrite output". It also allows "compress 
console input", for instance (but it's unclear to me what happens to the 
output in this example).
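
To make the first difficulty above more concrete, here is a minimal Python sketch of one possible compatibility scheme in which the legacy -c0 and -c1 settings coexist with a standalone -c and gzip-style -1 to -9 levels. The function `parse_flags` and its option names are hypothetical, chosen for illustration; this is not lz4c's actual argument parser.

```python
def parse_flags(args):
    """Interpret a list of flags under one possible compatibility scheme.

    Assumed rules (illustrative, not lz4c's actual behavior):
      -c0 / -c1  legacy: fast / high compression
      -c         alone: write output to stdout
      -1 .. -9   gzip-style compression level
    """
    opts = {"stdout": False, "level": None, "legacy_mode": None}
    for arg in args:
        if arg == "-c0":
            opts["legacy_mode"] = "fast"
        elif arg == "-c1":
            opts["legacy_mode"] = "high"
        elif arg == "-c":
            opts["stdout"] = True
        elif len(arg) == 2 and arg[0] == "-" and arg[1] in "123456789":
            opts["level"] = int(arg[1])
        else:
            raise ValueError(f"unrecognized flag: {arg}")
    return opts
```

Under these rules `parse_flags(["-c1"])` selects high compression, whereas a parser that treats `-c1` as the aggregation of `-c` and `-1` would read it as "write to stdout at level 1". That mismatch is the "strange" part: the two readings can only coexist if `-c0` and `-c1` are special-cased ahead of the general rules.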

So, to sum up, 
maybe it's possible to support a "sufficient" level of commands, without 
aiming to be 100% identical to gzip.

Original comment by yann.col...@gmail.com on 3 Sep 2013 at 10:11

GoogleCodeExporter commented 8 years ago
I have a large set of scripts that auto-detect which decompression program to 
use by the file suffix and swap out a variable for the program name to match. 
All supported decompression tools (xz, bzip2, gzip, lzop) use the "program_name 
-dc" format to decompress to stdout. lz4 should support the basic command-line 
switch standards that every other data stream compression tool supports, else 
it becomes a special-case program and an "odd one out" as one can no longer use 
'sed' to produce a new script that uses lz4 instead of lzop, for example.
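
The kind of suffix-based dispatch described above might look like the following minimal Python sketch. The table `DECOMPRESSORS` and the function `decompress_command` are hypothetical names for this example, and the `.lz4` entry assumes that an lz4 tool accepts `-dc` like the others, which is exactly what is being requested here.

```python
import os

# Suffix-to-program table. The first four tools are the ones named above;
# the ".lz4" entry is an assumption that lz4 would accept "-dc" as well.
DECOMPRESSORS = {
    ".xz": "xz",
    ".bz2": "bzip2",
    ".gz": "gzip",
    ".lzo": "lzop",
    ".lz4": "lz4",
}

def decompress_command(path):
    """Return the argv that decompresses `path` to stdout."""
    suffix = os.path.splitext(path)[1]
    try:
        program = DECOMPRESSORS[suffix]
    except KeyError:
        raise ValueError(f"no known decompressor for suffix {suffix!r}")
    # Every tool in the table is assumed to share the same "-dc" switches,
    # so only the program name varies.
    return [program, "-dc", path]
```

If one tool needs different switches, the single `return` line has to turn into a per-program special case, which is the "odd one out" cost being described.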

It may break existing scripts, but it's better to correct this error now than 
to perpetuate it for the sake of supporting the relatively small base of 
software that touches lz4 command-line tools directly. I would also suggest 
that a compromise could be made at build time: add a --with-legacy-options 
flag to the configure process that will compile the software with these old, 
non-standard switch functions, and anyone who desperately needs such switches 
can recompile a custom non-standard build of lz4 to accommodate their old 
scripts for a while longer.

Original comment by nctrit...@gmail.com on 5 Sep 2013 at 4:51

GoogleCodeExporter commented 8 years ago
Here is a current WIP of the next version of lz4, which tries to address this 
issue.

The CLI lz4c supports an extended range of gzip parameters; it also supports 
legacy lz4 commands.
Perhaps more importantly for this thread, a new CLI is defined, called lz4 
(notice the missing final 'c'), which gets rid of the conflicting legacy lz4 
arguments and replaces them with gzip/xz compatible ones.

It's not the final version yet, but it seems to be in good enough shape for testing now.

Regards

Original comment by yann.col...@gmail.com on 6 Sep 2013 at 11:52

GoogleCodeExporter commented 8 years ago
With some minor corrections

Original comment by yann.col...@gmail.com on 6 Sep 2013 at 4:53


GoogleCodeExporter commented 8 years ago
Awesome! Thanks so much for taking this on. I guess it was a bit arrogant to 
think that I'd be the first person to bring this up.

Any chance that the final version of lz4 will land with a -c flag?

Original comment by d...@falconindy.com on 7 Sep 2013 at 4:30

GoogleCodeExporter commented 8 years ago
The lz4 version supports the -c flag.

Original comment by yann.col...@gmail.com on 7 Sep 2013 at 8:18

GoogleCodeExporter commented 8 years ago
So sorry -- I see it now. Thanks again.

Original comment by d...@falconindy.com on 7 Sep 2013 at 1:56

GoogleCodeExporter commented 8 years ago

Original comment by yann.col...@gmail.com on 7 Sep 2013 at 7:28

GoogleCodeExporter commented 8 years ago

Original comment by yann.col...@gmail.com on 9 Sep 2013 at 9:10

GoogleCodeExporter commented 8 years ago
Completed in r103.

Original comment by yann.col...@gmail.com on 9 Sep 2013 at 9:10