stephan-hof / pyrocksdb

Python bindings for RocksDB
BSD 3-Clause "New" or "Revised" License
150 stars 170 forks source link

Support unstable ZSTD #36

Open Downchuck opened 9 years ago

Downchuck commented 9 years ago

I don't know if you want to bring this in yet, but the current RocksDB has a compression stub now for ZSTD as kZSTDNotFinalCompression.

zstd is quite a bit faster than zlib with a similar compression ratio. It's not yet stable/mature. Same author as LZ4.

stephan-hof commented 9 years ago

Yeah I have seen this. Do they already have a official release supporting that ? Or is it still in master only ? I'm reluctant to add stuff which is only in rocksdb master, because it could change. However if its inside a 'release' I would be happy to add it.

Downchuck commented 9 years ago

They seem to have more of a continuous release cycle. Shows it in the 4.0.0 release: https://github.com/facebook/rocksdb/blob/master/HISTORY.md#400-992015

stephan-hof commented 9 years ago

I see, but they still say its experimental. I mean adding it to pyrocksdb should be very trivial, its more a question about using/expose experimental stuff. They say the format can change => On an upgrade your entire DB is not usable anymore. I'm a fan of only supporting the stable stuff, so users of the library cannot trap themselves. Do you really need it now ? Would it be OK for you to wait until the rocksdb guys say: "Its stable" ?

Downchuck commented 9 years ago

It's the ZSTD guy that's controlling stability. I think it's reasonable to add once you're comfortable releasing a build that targets/supports 4.0.x. It'd certainly be "zstd_experimental_compression" as the option. For my own use cases, it's completely fine to lose compatibility if the standard changes.

Downchuck commented 9 years ago

My recommendation is to cut a 3.x branch from master, then cut a 4.x branch, and keep master up with rocksdb as they make releases, with 4.x following master when appropriate. As for updating PyPi -- I'd wait until both Ubuntu and Homebrew show some sign of accepting a new release before pushing to PyPi.

(that's literally "3.x" as the branch name")

Downchuck commented 9 years ago
diff --git a/docs/api/options.rst b/docs/api/options.rst
index a5e9d07..81b596d 100644
--- a/docs/api/options.rst
+++ b/docs/api/options.rst
@@ -739,6 +739,7 @@ CompressionTypes
     .. py:attribute:: bzip2_compression
     .. py:attribute:: lz4_compression
     .. py:attribute:: lz4hc_compression
+    .. py:attribute:: zstdnotfinal_compression

 BytewiseComparator
 ==================
diff --git a/rocksdb/_rocksdb.pyx b/rocksdb/_rocksdb.pyx
index 95422fc..a412ced 100644
--- a/rocksdb/_rocksdb.pyx
+++ b/rocksdb/_rocksdb.pyx
@@ -700,6 +700,7 @@ cdef class CompressionType(object):
     bzip2_compression = u'bzip2_compression'
     lz4_compression = u'lz4_compression'
     lz4hc_compression = u'lz4hc_compression'
+    zstdnotfinal_compression = u'stdnotfinal_compression'

 cdef class Options(object):
     cdef options.Options* opts
@@ -789,6 +790,8 @@ cdef class Options(object):
                 return CompressionType.lz4_compression
             elif self.opts.compression == options.kLZ4HCCompression:
                 return CompressionType.lz4hc_compression
+            elif self.opts.compression == options.kZSTDNotFinalCompression:
+                return CompressionType.zstdnotfinal_compression
             else:
                 raise Exception("Unknonw type: %s" % self.opts.compression)

@@ -805,6 +808,8 @@ cdef class Options(object):
                 self.opts.compression = options.kLZ4Compression
             elif value == CompressionType.lz4hc_compression:
                 self.opts.compression = options.kLZ4HCCompression
+            elif value == CompressionType.zstdnotfinal_compression:
+                self.opts.compression = options.kZSTDNotFinalCompression
             else:
                 raise TypeError("Unknown compression: %s" % value)

diff --git a/rocksdb/options.pxd b/rocksdb/options.pxd
index aa404d1..3e38a8a 100644
--- a/rocksdb/options.pxd
+++ b/rocksdb/options.pxd
@@ -27,6 +27,7 @@ cdef extern from "rocksdb/options.h" namespace "rocksdb":
         kBZip2Compression
         kLZ4Compression
         kLZ4HCCompression
+        kZSTDNotFinalCompression

     ctypedef enum ReadTier:
         kReadAllTier
Downchuck commented 9 years ago

setup.py is a little strange with the -lz flags, it doesn't have -lz4, not sure if it should or not. something to be aware of. Similar to -lzstd.

Downchuck commented 9 years ago

From ZSTD github: '"Stable Format" is projected sometimes early 2016.'

In the meantime, I do need that ZSTD patch, but I can maintain it in a separate branch. It's just so much faster than zlib.

Downchuck commented 8 years ago

Stable.