DaanVanVugt / h5pickle

Wrapper for h5py with pickle capabilities
MIT License
25 stars 8 forks source link

Python 2.7 and 3.4 compatibility #4

Closed djhoese closed 4 years ago

djhoese commented 6 years ago

I found this package when looking at https://github.com/dask/dask/issues/922. I added it to one of my packages and noticed that travis CI was failing for python 2.7 and 3.4 due to some syntax that isn't valid in those versions. I'm willing to make a PR to fix the issues but wanted to check if that was something you were interested in supporting first. Thoughts?

djhoese commented 6 years ago
Best match: h5pickle 0.3
Processing h5pickle-0.3.tar.gz
Writing /tmp/easy_install-j6f7vfw4/h5pickle-0.3/setup.cfg
Running h5pickle-0.3/setup.py -q bdist_egg --dist-dir /tmp/easy_install-j6f7vfw4/h5pickle-0.3/egg-dist-tmp-0bo6df4a
  File "build/bdist.linux-x86_64/egg/h5pickle/__init__.py", line 137
    return self.init_args, {**self.init_kwargs, 'skip_cache': self.skip_cache}
                             ^
SyntaxError: invalid syntax

And:

Best match: h5pickle 0.3
Processing h5pickle-0.3.tar.gz
Writing /tmp/easy_install-eL11Lb/h5pickle-0.3/setup.cfg
Running h5pickle-0.3/setup.py -q bdist_egg --dist-dir /tmp/easy_install-eL11Lb/h5pickle-0.3/egg-dist-tmp-SWJ_Y6
  File "build/bdist.linux-x86_64/egg/h5pickle/__init__.py", line 105
    def __new__(cls, *args, skip_cache=False, **kwargs):
                                     ^
SyntaxError: invalid syntax
DaanVanVugt commented 6 years ago

Hi David,

I would not mind supporting that, feel free to make a PR!

Daan

Le mar. 10 avr. 2018 16:20, David Hoese notifications@github.com a écrit :

Best match: h5pickle 0.3 Processing h5pickle-0.3.tar.gz Writing /tmp/easy_install-j6f7vfw4/h5pickle-0.3/setup.cfg Running h5pickle-0.3/setup.py -q bdist_egg --dist-dir /tmp/easy_install-j6f7vfw4/h5pickle-0.3/egg-dist-tmp-0bo6df4a File "build/bdist.linux-x86_64/egg/h5pickle/init.py", line 137 return self.init_args, {**self.init_kwargs, 'skip_cache': self.skip_cache} ^ SyntaxError: invalid syntax

And:

Best match: h5pickle 0.3 Processing h5pickle-0.3.tar.gz Writing /tmp/easy_install-eL11Lb/h5pickle-0.3/setup.cfg Running h5pickle-0.3/setup.py -q bdist_egg --dist-dir /tmp/easy_install-eL11Lb/h5pickle-0.3/egg-dist-tmp-SWJ_Y6 File "build/bdist.linux-x86_64/egg/h5pickle/init.py", line 105 def new(cls, *args, skip_cache=False, **kwargs): ^ SyntaxError: invalid syntax

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/Exteris/h5pickle/issues/4#issuecomment-380116837, or mute the thread https://github.com/notifications/unsubscribe-auth/AACMaNPH-8n-u1g8Ch3jil02kOy10_ovks5tnL-hgaJpZM4TOWca .

djhoese commented 6 years ago

Hm python 2.7 support might not be completely possible given the use of __getnewargs_ex__ which was added in Python 3.4 (https://www.python.org/dev/peps/pep-3154/#calling-new-with-keyword-arguments). I'll fix both syntax errors and see what happens.

DaanVanVugt commented 6 years ago

I have set up CircleCI tests for python 3.4. We only have one very simple test for now. On python 3.4 we get File "/home/circleci/repo/tests/test_pickling_single_process.py", line 21, in test_readonly_skip_cache g = pickle.loads(pickle.dumps(f)) ValueError: must use protocol 4 or greater to copy this object; since __getnewargs_ex__ returned keyword arguments. I haven't set up 2.7 yet

djhoese commented 6 years ago

@Exteris My understanding is that that is because the default protocol is protocol 3 (https://docs.python.org/3/library/pickle.html#data-stream-format). I'd actually be surprised if it worked with any current version of python, but obviously it does because you've used it before.

djhoese commented 6 years ago

It doesn't make sense to me why __getnewargs_ex__ is being called in python 3.4. The pickle docs say that is only called in python 3.6 for protocol 2 and 3...I think. I suppose it was backported: https://docs.python.org/3.6/library/pickle.html#object.__getnewargs_ex__

Either way I don't see changes between 3.4 and 3.6 that would make it work with one and not the other.

djhoese commented 6 years ago

Just to prove a point I got tests to pass with python 3.4 and 3.5 by doing something super hacky that I don't believe is good code what so ever and should not be considered as something that should be added to this package and won't work if the h5py class is passed any kwargs:

diff --git a/h5pickle/__init__.py b/h5pickle/__init__.py
index c29c13c..c1062a2 100644
--- a/h5pickle/__init__.py
+++ b/h5pickle/__init__.py
@@ -106,7 +106,11 @@ class File(h5py.File):
         """Create a new File object with the h5open function, which memoizes
         the file creation. Test if it is still valid and otherwise create a new one.
         """
-        skip_cache = kwargs.pop('skip_cache', False)
+        if isinstance(args[-1], tuple) and args[-1][0] == 'skip_cache':
+            skip_cache = args[-1][1]
+            args = args[:-1]
+        else:
+            skip_cache = kwargs.pop('skip_cache', False)
         hsh = arghash(*args, **kwargs)
         if skip_cache or hsh not in cache:
             self = object.__new__(cls)
@@ -136,8 +140,9 @@ class File(h5py.File):

     def __getnewargs_ex__(self):
         kwargs = self.init_kwargs.copy()
-        kwargs['skip_cache'] = self.skip_cache
-        return self.init_args, kwargs
+        #kwargs['skip_cache'] = self.skip_cache
+        args = self.init_args + (('skip_cache', self.skip_cache),)
+        return args, kwargs

     def close(self):
         """Override the close function to remove the file also from the cache"""

So...the issue is that Python 3.4 and 3.5 don't support __getnewargs_ex__ completely with protocol 3. It was added in python 3.6: https://bugs.python.org/issue24164. So this package can only support 3.6+, at least that's what my findings show.

djhoese commented 6 years ago

If you added 3.5 to the CIs, I would expect them to fail even before my changes.

DaanVanVugt commented 6 years ago

If we remove the caching functionality we can go back to the old setstate and getstate methods, which should work in 2.7 also. This might not be so important in many cases, but for my use case with dask it had a great impact.

Or, we can remove support for keyword arguments (which will cause an incompatibility with h5py driver keyword arguments... http://docs.h5py.org/en/latest/high/file.html, but those are rarely used I think)

DaanVanVugt commented 4 years ago

I think we will just not support 3.5 and below for now. If anyone has a good solution of course it is welcome.