openzim / python-libzim

Libzim binding for Python: read/write ZIM files in Python
https://pypi.org/project/libzim/
GNU General Public License v3.0
63 stars 22 forks source link

Ability to choose compression algoritm #75

Closed rgaudin closed 4 years ago

rgaudin commented 4 years ago

libzim 6.1.8 added the ability to choose the compression type on creation. This adds support for it in the Creator and other required parts.

On user API, the main change is a compression argument to the Creator. This can receive either a string (default, none, zip, lzma, zstd) or a member of the new Compression enum (members named after strings above)

This enum converts to zim::CompressionType and that is forwarded down to the libzim.

Breaking change

I chose to add the new compression argument to the Creator before the min_chunk_size as it seems there might better chances people would customize it than the chunks size. I might be wrong or it might just not be wanted. Let me know and I'd change that. Because of this change, two tests using positional arguments were changed.

--

Note: this assumes https://github.com/openzim/libzim/issues/371 is fixed.

codecov[bot] commented 4 years ago

Codecov Report

Merging #75 into master will not change coverage. The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff            @@
##            master       #75   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            2         2           
  Lines           96        99    +3     
=========================================
+ Hits            96        99    +3     
Impacted Files Coverage Δ
libzim/writer.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 08bed59...5806c96. Read the comment docs.

mgautierfr commented 4 years ago

We already discuss a bit in https://github.com/openzim/libzim/issues/371 but I think that the wrapper should simply don't provide the "zimCompDefault" value. (The next libzim api will probably change in the same time than libzim2).

On the min_chunk_size, very few (if not none) user will set this. We could move in a "configuration" method on the creator instead of having it in the constructor (as it is made on libzim side).

rgaudin commented 4 years ago

@mgautierfr, I've removed zimcompDefault and now defaults to lzma.

As per min_chunk_size, the Creator now has a configuration property returning a dict with only min_chunk_size at the moment. Let me know if that's how you want it.

mgautierfr commented 4 years ago

I'm not sure a user can change the min_chunk_size the way you've done it.

I was thinking about a simple set_min_chunk_size method (or a propery) on the Creator, calling the cpp setMinChunkSize. But if it is called after startZimCreation it is useless.

It appears that the simpler is to keep the min_chunk_size argument in the __init__ after all.

rgaudin commented 4 years ago

Ah ok, didn't realize you meant post-init. OK, dropping that commit then.

rgaudin commented 4 years ago

ready