Closed Sanmayce closed 8 years ago
Thanks for reporting that. I have only tested under Linux and have not tried Windows yet. It seems the unpacked files were written in text mode instead of binary mode. The probable cause is #define O_BINARY 0; I think that is fixed in my latest commit (not tested yet).
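For readers hitting the same text-mode problem, here is a minimal sketch of one common portable pattern. It is not necessarily what the commit does: the fallback definition of O_BINARY is guarded so that it only applies on systems that lack the flag, and the helper name write_file_binary is hypothetical. Including unistd.h also declares close(), which would address the implicit-declaration warning in the build log.

```c
#include <fcntl.h>   /* open and the O_* flags */
#include <unistd.h>  /* read, write, close, unlink */

/* Windows toolchains such as MinGW define O_BINARY; POSIX systems do not.
   Define it as 0 ONLY when it is missing, so Windows keeps the real flag. */
#ifndef O_BINARY
#define O_BINARY 0
#endif

/* Hypothetical helper (not from the LZOMA sources): writes len bytes to path
   with no newline translation. Returns 0 on success, -1 on failure. */
int write_file_binary(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, 0644);
    if (fd < 0)
        return -1;

    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);  /* write may be partial, so loop */
        if (n <= 0) {
            close(fd);
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return close(fd);
}
```

An unconditional #define O_BINARY 0 would silently drop the flag on Windows, so every '\n' written would be expanded to "\r\n" and the unpacked file would no longer match the original.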
As for compression options, the current settings for level -9 are the practical maximum. It is possible to compress better by a few bytes by increasing the values in the pack.c source, but compression quickly becomes extremely slow while the compression ratio stays practically the same.
It seems you fixed it already:
D:\LZOMA___\lzoma-master>gcc -O2 -pipe pack.c divsufsort.c -o pack
pack.c: In function 'main':
pack.c:937:3: warning: implicit declaration of function 'close' [-Wimplicit-function-declaration]
close(ifd);
^
D:\LZOMA___\lzoma-master>gcc -Os -fomit-frame-pointer -std=c99 -Os -pipe unpack.c -o unpack
D:\LZOMA___\lzoma-master>dir *.exe
Volume in drive D is S640_Vol5
Volume Serial Number is 5861-9E6C
Directory of D:\LZOMA___\lzoma-master
01/15/2016 07:36 PM 102,475 pack.exe
01/15/2016 07:36 PM 51,050 unpack.exe
2 File(s) 153,525 bytes
0 Dir(s) 85,609,353,216 bytes free
D:\LZOMA___\lzoma-master>cd..
D:\LZOMA___>dir
Volume in drive D is S640_Vol5
Volume Serial Number is 5861-9E6C
Directory of D:\LZOMA___
01/15/2016 07:36 PM <DIR> .
01/15/2016 07:36 PM <DIR> ..
01/15/2016 07:36 PM <DIR> lzoma-master
01/15/2016 07:34 PM 51,390 lzoma-master_2016-Jan-15_19h35m.zip
01/15/2016 04:50 PM 2,091,543 The_Secret_Teachings_of_all_Ages_-_Manly_Palmer_Hall.epub.txt
01/15/2016 05:07 PM 3,265,536 University_of_Canterbury_The_Calgary_Corpus.tar
3 File(s) 5,408,469 bytes
3 Dir(s) 85,603,987,456 bytes free
D:\LZOMA___>lzoma-master\pack.exe -9 The_Secret_Teachings_of_all_Ages_-_Manly_Palmer_Hall.epub.txt The_Secret_Teachings_of_all_Ages_-_Manly_Palmer_Hall.epub.txt.lzoma
got 2091543 bytes, packing The_Secret_Teachings_of_all_Ages_-_Manly_Palmer_Hall.epub.txt into The_Secret_Teachings_of_all_Ages_-_Manly_Palmer_Hall.epub.txt.lzoma...
stats noe8 2064343 e8 2064343
reverted e8
init done.
4095 left
res=4944198
res bytes=618025
out bytes=618024
closing files let=21327 lz=205920 olz=5750
bits lzlit=232997 let=170616 olz=18192 match=3487236 len=1035120
D:\LZOMA___>lzoma-master\unpack.exe The_Secret_Teachings_of_all_Ages_-_Manly_Palmer_Hall.epub.txt.lzoma The_Secret_Teachings_of_all_Ages_-_Manly_Palmer_Hall.epub.txt.unpack
D:\LZOMA___>dir the*
Volume in drive D is S640_Vol5
Volume Serial Number is 5861-9E6C
Directory of D:\LZOMA___
01/15/2016 04:50 PM 2,091,543 The_Secret_Teachings_of_all_Ages_-_Manly_Palmer_Hall.epub.txt
01/15/2016 07:40 PM 618,033 The_Secret_Teachings_of_all_Ages_-_Manly_Palmer_Hall.epub.txt.lzoma
01/15/2016 07:40 PM 2,091,543 The_Secret_Teachings_of_all_Ages_-_Manly_Palmer_Hall.epub.txt.unpack
3 File(s) 4,801,119 bytes
0 Dir(s) 85,601,275,904 bytes free
D:\LZOMA___>fc The_Secret_Teachings_of_all_Ages_-_Manly_Palmer_Hall.epub.txt The_Secret_Teachings_of_all_Ages_-_Manly_Palmer_Hall.epub.txt.unpack /b
Comparing files The_Secret_Teachings_of_all_Ages_-_Manly_Palmer_Hall.epub.txt and THE_SECRET_TEACHINGS_OF_ALL_AGES_-_MANLY_PALMER_HALL.EPUB.TXT.UNPACK
FC: no differences encountered
D:\LZOMA___>
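The fc /b check above is Windows-only. As a rough portable equivalent for verifying a pack/unpack round trip, a byte-for-byte comparison could be sketched as follows. The function name files_identical is hypothetical and not part of LZOMA; both files are opened with "rb" so that no newline translation hides or creates differences.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if both files contain identical bytes,
   0 if they differ (in content or length), -1 if either cannot be opened.
   A read error is treated like end of file in this sketch. */
int files_identical(const char *path_a, const char *path_b)
{
    FILE *a = fopen(path_a, "rb");
    FILE *b = fopen(path_b, "rb");
    int result = 1;

    if (!a || !b) {
        result = -1;
    } else {
        int ca, cb;
        do {
            ca = fgetc(a);
            cb = fgetc(b);
            if (ca != cb) {      /* also catches one file ending early */
                result = 0;
                break;
            }
        } while (ca != EOF);     /* both reached EOF together: identical */
    }

    if (a) fclose(a);
    if (b) fclose(b);
    return result;
}
```

Called on the original text and the .unpack file, a result of 1 corresponds to "FC: no differences encountered".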
> Thanks for reporting that.
Oh, LZOMA kicked my ass, it is so cool. Hope you refine it to the point it becomes a paragonic performer.
> It is possible to compress better by a few bytes, by increasing values in pack.c source, but compression speed can quickly become extremely slow and compression ratio is practically the same.
Okay, I thought that replacing
{3,100,1000}
with higher values would tighten the ratio. Also, what is the size of your "window", and have you thought of making it 28-bit (256MB), like Nakamichi's?
The window size is currently set to 16MB. It is possible to make it larger (this requires tuning some variables in lzoma.h; so far I have tried 64MB). Compressor memory usage is currently too large, at 33 times the window size. I have some ideas for reducing it, and will probably implement an option to set the window size after that.
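To put the 33x figure in perspective, here is a small arithmetic sketch. It assumes that MB means 2^20 bytes and that the factor stays constant as the window grows; neither is confirmed in the thread, and the names MEMORY_FACTOR and compressor_memory_bytes are hypothetical.

```c
#include <stdint.h>

/* Factor quoted in the thread: compressor memory is about 33x the window. */
#define MEMORY_FACTOR 33u

/* Hypothetical helper: estimated compressor memory in bytes for a window of
   2^window_bits bytes (24 bits = 16MB, 26 bits = 64MB, 28 bits = 256MB). */
uint64_t compressor_memory_bytes(unsigned window_bits)
{
    return (uint64_t)MEMORY_FACTOR * ((uint64_t)1 << window_bits);
}
```

Under these assumptions, the current 16MB window needs about 528MB, a 64MB window about 2112MB, and a 28-bit (256MB) window about 8448MB, roughly 8.25GB, which is why reducing the factor comes before making the window size configurable.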
Hi alef, I wanted to quickly compare your "binary" tight compressor with my "textual" semi-tight one, so here is some feedback:
Is the problem in pack or unpack? I hope you are going to fix it, since your approach is so tight and promising.
Tried two Russian texts as well:
Oh, and could you give (in comments) the best options/values for textual data?