samuelcolvin / xdelta3-python

Fast delta encoding in python using xdelta3

Disabling secondary compression #1

Closed zonque closed 6 years ago

zonque commented 6 years ago

I need to disable secondary compression for delta files because open-vcdiff does not support it. However, I can't convince xdelta3 to skip secondary compression. The command line tool keeps adding that level of compression even with no option argument to -S, and this python wrapper seems to have a similar issue.

In my code, I currently have

delta = xdelta3.encode(other, image, xdelta3.Flags.SEC_NOALL)

which gives me the following error at run-time:

Traceback (most recent call last):
  File "./pack-image.py", line 15, in <module>
    main()
  File "./pack-image.py", line 11, in main
    delta = xdelta3.encode(other, image, xdelta3.Flags.SEC_NOALL)
  File "/usr/lib64/python3.6/site-packages/xdelta3/main.py", line 89, in encode
    return _xdelta3.execute(new_value, original, flags, 0)
xdelta3.XDeltaError: Error occur executing xdelta3: XD3_INTERNAL

Without xdelta3.Flags.SEC_NOALL, the code works, but the image features secondary compression.

Any idea?

samuelcolvin commented 6 years ago

Afraid not off the top of my head, sounds like it could be an upstream issue in xdelta3 if you get the same with the CLI.

I would submit an issue on that project and see if you get an answer.

This project just uses the standard C API to xdelta3.

I got very little apparent effect from changing the compression setting when I was testing.

It's possible there's a compile flag which switches it on.

zonque commented 6 years ago

Okay, I'll file an issue there and link this one. Can you keep it open here until we have confirmation that it's an upstream issue?

samuelcolvin commented 6 years ago

Fine, but check existing issues there: a few seem related.

samuelcolvin commented 6 years ago

Ok, I've had a closer look and this would appear to be the problem: XD3_SEC flags require a secondary compressor type.

Or to put it another way: if you want no secondary compression you need to set a (primary?) compression flag. I'm no expert on C or xdelta3, so I might be wrong.
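To make that reading concrete, here is a rough Python model of such a check. It is only a sketch: the bit values are assumed for illustration (check xdelta3.h for the real ones), and check_sec_flags is a hypothetical helper, not part of xdelta3 or this wrapper.

```python
# Assumed bit values for illustration only; the real ones live in xdelta3.h.
SEC_DJW = 1 << 5
SEC_FGK = 1 << 6
SEC_NODATA = 1 << 7
SEC_NOINST = 1 << 8
SEC_NOADDR = 1 << 9
SEC_LZMA = 1 << 24

# Composite masks: "any compressor type" and "skip every section".
SEC_TYPE = SEC_DJW | SEC_FGK | SEC_LZMA
SEC_NOALL = SEC_NODATA | SEC_NOINST | SEC_NOADDR


def check_sec_flags(flags):
    """Hypothetical model of the check: return an error message or None.

    A SEC_NO* flag without any compressor type flag is rejected.
    """
    has_type = (flags & SEC_TYPE) != 0
    has_no_flags = (flags & SEC_NOALL) != 0
    if has_no_flags and not has_type:
        return "XD3_SEC flags require a secondary compressor type"
    return None


print(check_sec_flags(SEC_NOALL))             # rejected: no compressor type set
print(check_sec_flags(SEC_NOALL | SEC_LZMA))  # accepted: a type flag is present
print(check_sec_flags(0))                     # accepted: no SEC flags at all
```

Under this model, SEC_NOALL on its own is rejected, which would match the XD3_INTERNAL error in the traceback above.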

Anyway, the following avoids the error above:

delta = xdelta3.encode(other, image, xdelta3.Flags.SEC_NOALL & xdelta3.Flags.SEC_LZMA)

You'll need to check with open-vcdiff to see if it works or give details on how to run it so I can check.

zonque commented 6 years ago

delta = xdelta3.encode(other, image, xdelta3.Flags.SEC_NOALL & xdelta3.Flags.SEC_LZMA)

That & can't be right :) The statement basically computes to 0, which is then handled as special case again in xdelta. There's something really broken here, but it's almost certainly not in the python code.
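A quick sketch of why the & collapses to 0, using plain integers. The bit positions are assumptions for illustration (the real values are in xdelta3.h); the point only depends on the two flags occupying different bits.

```python
# Assumed bit values for illustration only; the real ones live in xdelta3.h.
SEC_NODATA = 1 << 7
SEC_NOINST = 1 << 8
SEC_NOADDR = 1 << 9
SEC_LZMA = 1 << 24

SEC_NOALL = SEC_NODATA | SEC_NOINST | SEC_NOADDR

# Bitwise AND keeps only the bits set in BOTH operands. The flags share
# no bits, so nothing survives and the result is the "no flags" value.
anded = SEC_NOALL & SEC_LZMA

# Bitwise OR keeps the bits set in EITHER operand, which is how flags
# are normally combined.
ored = SEC_NOALL | SEC_LZMA

print(anded)     # 0
print(hex(ored)) # 0x1000380
```

So passing the & expression is the same as passing no flags at all, whereas | would pass both the compressor type and the three "skip" bits. Whether the | combination produces output that open-vcdiff accepts is not something this sketch can show.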

Btw - you can check the image generated with the above statement with xdelta3 printhdr, and it will show the lib fell back to its lzma default.

samuelcolvin commented 6 years ago

:1st_place_medal: to me.

Ok, but stream->msg does get set to "XD3_SEC flags require a secondary compressor type". What does that mean?

zonque commented 6 years ago

Yes, that check makes no sense to me. I'm confused.

IMO, that check should read if ((stream->flags & XD3_SEC_NOALL) != XD3_SEC_NOALL).

samuelcolvin commented 6 years ago

Looks from https://github.com/jmacd/xdelta/issues/234 like the problem is particular to xdelta3-python but I'm not exactly sure how to solve it.

I've just pushed a commit with a few commented out c macros. 939a45b29155a8c7b028ab93a75b91c698551c9b

If you uncomment XD3_DEBUG in setup.py and build, then run your code you'll get a fairly verbose output from xdelta3 which might help you debug.

Also (as far as I understand) lzma was always disabled until now with xdelta3, indeed none of the secondary compression algorithms appear to have been working. If you uncomment the two lzma lines in setup.py and build, lzma would appear to work although it's not that simple to confirm.

Let me know if you work out the problem or if there's anything I can do.

zonque commented 6 years ago

Nah, it's all good. I trusted the output of printhdr but it turns out that's the thing that's broken currently. I guess everything is fine apart from that - a file generated with python and the code above looks fine when inspected by printhdr of v3.0.9.

Sorry for the noise :)

zonque commented 6 years ago

And thanks for your help!

samuelcolvin commented 6 years ago

no problem.