miurahr / pyppmd

pyppmd provides classes and functions for compressing and decompressing text data, using the PPM (Prediction by Partial Matching) compression algorithm, variations H and I.2. It provides an API similar to Python's zlib/bz2/lzma modules.
https://pyppmd.readthedocs.io/en/latest/
GNU Lesser General Public License v2.1

The result of Ppmd8_DecodeSymbol should be stored in an int, not an unsigned char, and decoding should stop if it is <0 #32

Closed cielavenir closed 3 years ago

cielavenir commented 3 years ago

Describe the bug

At https://github.com/miurahr/pyppmd/blob/v0.15.2/src/ext/_ppmdmodule.c#L1451 the return value of Ppmd8_DecodeSymbol should be stored in an int, not an unsigned char.

If Ppmd8_DecodeSymbol returns a value <0, decoding should terminate immediately. A char -1 and an int -1 are different values (this is similar to how the result of fgetc() should be stored in an int, not a char).
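To illustrate the point, here is a minimal sketch of the two ways of receiving the result. It does not use pyppmd or the real Ppmd8 API: `toy_decoder`, `toy_decode_symbol`, `decode_narrowed` and `decode_checked` are hypothetical names invented for this example, and the only assumption carried over from the issue is that the decode function returns 0..255 for a byte and a negative int when decoding should stop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a symbol decoder (NOT the real Ppmd8 API):
 * returns 0..255 for a decoded byte, or -1 when the stream has ended. */
typedef struct {
    const unsigned char *data;
    size_t len;
    size_t pos;
} toy_decoder;

static int toy_decode_symbol(toy_decoder *d)
{
    if (d->pos >= d->len)
        return -1;                /* end of stream */
    return d->data[d->pos++];     /* a real byte, 0..255 */
}

/* Problematic pattern: the result is narrowed to unsigned char, so -1
 * becomes 0xFF and looks exactly like a legitimate 0xFF byte. The loop
 * has no way to notice the end and always fills the whole buffer. */
static size_t decode_narrowed(toy_decoder *d, unsigned char *out, size_t cap)
{
    size_t n = 0;
    assert(out != NULL);
    while (n < cap) {
        unsigned char c = (unsigned char)toy_decode_symbol(d);
        out[n++] = c;             /* "c < 0" could never be true here */
    }
    return n;
}

/* Suggested pattern: keep the result in an int, stop as soon as it is
 * negative, and narrow only after the check. */
static size_t decode_checked(toy_decoder *d, unsigned char *out, size_t cap)
{
    size_t n = 0;
    assert(out != NULL);
    while (n < cap) {
        int sym = toy_decode_symbol(d);
        if (sym < 0)
            break;                /* terminate immediately */
        out[n++] = (unsigned char)sym;
    }
    return n;
}
```

With a 3-byte input containing a real 0xFF byte and an 8-byte output buffer, `decode_checked` stops after 3 bytes, while `decode_narrowed` keeps writing 0xFF until the buffer is full.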

it will help:

Additional context

Additional context 2

I actually wrote a PPMd handler for zipfile, but the additional end marker creates an incompatible stream.

miurahr commented 3 years ago

Could you submit your proposed change on top of #33 (once it is merged)? It is OK to drop the "end marker".

cielavenir commented 3 years ago

It is in https://github.com/cielavenir/pyppmd/commits/UseEndmarkProperly2 , but see https://github.com/miurahr/pyppmd/pull/33#issuecomment-894676975 first.

miurahr commented 3 years ago

Note: the "end mark" implementation is partially compatible with what unrar does.

miurahr commented 3 years ago

#33 has been changed to do this (without touching the end mark code).

cielavenir commented 3 years ago

@miurahr

First, both 7z and rar use PPMdH, and zip uses PPMdI.

Also, PPMd has quite a few dialects: although 7z and rar both use PPMdH, their range coders are different. Both also differ from the original PPMd used to unpack the PPMd archive format described at http://www.compression.ru/ds/ .

So endmark handling is free to differ [edit: across software].

Adding an option to select the endmark handling might be a good idea.

miurahr commented 3 years ago

You can now pass an endmark option to the encoder and decoder.

ref: #39

cielavenir commented 3 years ago

@miurahr thank you

cielavenir commented 3 years ago

@miurahr I confirmed compatibility with the current pyppmd and was able to release https://pypi.org/project/zipfile-ppmd/ as a zipfile module patcher, which works the same way as zipfile-zstd. Thank you.