nemequ closed this issue 9 years ago
Hi Evan,
Unfortunately, I haven't gotten around to making the code work on Linux. I will download an Ubuntu VM and do this eventually! I'll let you know when it works on Ubuntu.
Regards, Mathieu
On Thu, Aug 22, 2013 at 11:14 PM, Evan Nemerson notifications@github.com wrote:
I would like to add a Squash (http://quixdb.github.io/squash/) plugin for MCM (issue tracked at quixdb/squash#27, https://github.com/quixdb/squash/issues/27), but it seems to be Windows-only at the moment.
After changing the typedefs at Util.hpp:57-58 to use int64_t and uint64_t instead of __int64 and unsigned __int64 (since there is no __int64) and commenting out the #include <Windows.h> in Memory.cpp, I still get http://paste.fedoraproject.org/34196/13772378/ on Fedora 19 (with g++ 4.8.1).
First off, here is a patch to provide alternatives for the Windows-specific functionality: http://pastebin.com/kEFSm7EU. I don't think this should break anything on Windows, but I don't actually have a Windows dev environment to test with, so please do so before pushing.
I'm not really clear on what is going on with MemMap::zero. I get that you're telling Windows that you're temporarily not interested in the memory, but I don't know what is supposed to happen when you become interested again. I looked for calls to the MemMap::zero method to try to figure it out but there weren't any, so I just removed it.
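One possible reading of the question above, sketched for Linux only: if the point of MemMap::zero is to hand pages back to the OS and get zeroed pages the next time they are touched, then `madvise` with `MADV_DONTNEED` on a private anonymous mapping behaves that way. This is an assumption about the intent, not a description of MCM's code, and the helper names `anonMap`, `anonZero`, `anonUnmap` and `roundTrip` are hypothetical.

```cpp
#include <sys/mman.h>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not MCM's MemMap): map `size` bytes of zero-filled
// anonymous memory. Returns nullptr on failure.
void* anonMap(std::size_t size) {
    void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
}

// Tell the kernel the current contents are no longer needed. On Linux, a
// private anonymous mapping reads back as zeros after MADV_DONTNEED; other
// systems may treat the advice differently.
bool anonZero(void* p, std::size_t size) {
    return madvise(p, size, MADV_DONTNEED) == 0;
}

void anonUnmap(void* p, std::size_t size) {
    munmap(p, size);
}

// Dirty the mapping, discard it, and check that it reads back as zeros.
bool roundTrip() {
    const std::size_t size = 1 << 16;  // 64 KiB
    unsigned char* p = static_cast<unsigned char*>(anonMap(size));
    if (!p) return false;
    std::memset(p, 0xAB, size);
    bool ok = anonZero(p, size) && p[0] == 0 && p[size - 1] == 0;
    anonUnmap(p, size);
    return ok;
}
```

The mapping stays valid after `anonZero`, so becoming "interested again" needs no extra call; the pages are simply faulted back in on the next access.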
As for the rest...
In file included from CM.hpp:33:0, from MCM.cpp:30:
Huffman.hpp: In member function ‘size_t Huffman::Tree ::getAlphabet() const’:
Huffman.hpp:55:11: error: ‘code’ was not declared in this scope
return code;
I don't see where "code" is declared, so I don't know how this could work anywhere. Nothing actually uses this function (according to grep this is the only occurrence of "getAlphabet" anywhere in the project).
Huffman.hpp: In member function ‘bool HuffmanComp::DeCompress(TOut&, TIn&)’:
Huffman.hpp:449:3: error: ‘DeCode’ is not a member of ‘Huffman’
Huffman::DeCode decoder;
Huffman.hpp:449:31: error: ‘decoder’ was not declared in this scope
Huffman::DeCode decoder;
Again, not sure how this could work anywhere. This is the only occurrence of "DeCode" anywhere in the project.
In file included from MCM.cpp:30:0:
CM.hpp: In member function ‘void CM ::BuildHuffCodes(Huffman::Tree *)’:
CM.hpp:563:19: error: ‘huff_codes’ was not declared in this scope
tree->getCodes(&huff_codes[0]);
Again, this is the only occurrence of "huff_codes" anywhere, so I don't see how this could work.
With those functions removed, compression /seems/ to work, but decompression segfaults. I'm not sure whether that's a result of me removing HuffmanComp::DeCompress. The other two seem pretty safe to remove.
The main compiler I use is VS, which doesn't thoroughly check that code in templates makes sense if it is not referenced anywhere. That is probably why you are getting the Huffman errors. As for the MemMap stuff, I'm going to use mmap + MAP_ANONYMOUS. I should have these issues fixed later tonight (PST).
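The difference between the two compilers is usually described as two-phase name lookup. The sketch below is a hypothetical stand-in, not the real Huffman::Tree: it shows an unused member function that g++ accepts because the bad name depends on the template parameter, with the kind of body g++ rejects left in a comment so the block still compiles.

```cpp
#include <cstddef>

// Hypothetical stand-in for a class template such as Huffman::Tree.
template <typename T>
class Tree {
public:
    explicit Tree(std::size_t alphabet) : alphabet_(alphabet) {}

    // Never called anywhere. T::no_such_member depends on T, so a
    // conforming compiler defers the check until this function is
    // instantiated. Nothing instantiates it, so no error is reported.
    std::size_t unusedDependent() const { return T::no_such_member; }

    // By contrast, `code` below does not depend on T. g++ looks it up when
    // the template is defined and rejects it even if the function is never
    // called; a compiler that postpones all checks to instantiation time
    // would stay silent:
    //
    //   std::size_t getAlphabet() const { return code; }

    // A version that refers only to a declared member compiles everywhere.
    std::size_t getAlphabet() const { return alphabet_; }

private:
    std::size_t alphabet_;
};

// Instantiates Tree<int> and getAlphabet(), but not unusedDependent().
std::size_t alphabetOf() {
    Tree<int> t(256);
    return t.getAlphabet();
}
```

Under this reading, removing the unused functions or fixing the undeclared names are both valid ways to get the code past g++.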
I fixed the g++ compile on Windows and tested it; it seems to work fine for compression and decompression. I haven't tested it on Linux yet, however.
I can compile now, so technically this issue is fixed, but decompression still segfaults. Here is the backtrace:
0 buildFromCodeLengths (freqs=0x0, max_depth=16, count=256, lengths=0x6397f0) at Huffman.hpp:321
1 Huffman::readTree<Range7, BufferedStream<4096ul> > (ent=..., stream=..., alphabet_size=alphabet_size@entry=256, max_length=max_length@entry=16) at Huffman.hpp:374
2 0x000000000040dd49 in CM<6ul>::DeCompress<IdentityFilter<BufferedStream<4096ul> >, BufferedStream<4096ul> > (this=this@entry=0x6167c0 , sout=..., sin=...) at CM.hpp:751
3 0x0000000000402d23 in DeCompress<BufferedStream<4096ul>, BufferedStream<4096ul> > (sin=..., sout=..., this=0x6167c0 ) at Filter.hpp:70
4 main (argc=4, argv= ) at MCM.cpp:210
This is with enwik8. Compressed file: http://bitshare.com/files/kx0c01xt/enwik8.mcm.html (no options, just ./mcm_gcc enwik8 enwik8.mcm to compress and ./mcm_gcc -d enwik8.mcm enwik8.unmcm to decompress).
Do you want me to open up a new issue for the segfault?
The only way that can segfault is if the Huffman tree that is read back isn't correct. I'll need to investigate and see if I can figure out why.
Fixed
I would like to add a Squash plugin for MCM (issue tracked at https://github.com/quixdb/squash/issues/27), but it seems to be Windows-only at the moment.
After changing the typedefs at Util.hpp:57-58 to use int64_t and uint64_t instead of __int64 and unsigned __int64 (since there is no __int64) and commenting out the #include <Windows.h> in Memory.cpp, I still get http://paste.fedoraproject.org/34196/13772378/ on Fedora 19 (with g++ 4.8.1).
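For reference, the typedef change described above can be sketched as follows. The alias names `int64` and `uint64` are a guess at what Util.hpp:57-58 declares; only the underlying types come from this report.

```cpp
#include <stdint.h>  // standard fixed-width integer types

// Hypothetical alias names; the real ones at Util.hpp:57-58 may differ.
typedef int64_t int64;    // was: __int64 (compiler-specific)
typedef uint64_t uint64;  // was: unsigned __int64

// Size check that also works before C++11: the array size becomes -1,
// which is ill-formed, if either type is not 8 bytes wide.
typedef char fixed_width_check[(sizeof(int64) == 8 && sizeof(uint64) == 8) ? 1 : -1];
```

`<stdint.h>` is used here rather than `<cstdint>` because g++ 4.8 only accepts the latter when C++11 mode is enabled.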