datreant / datreant.data

convenient data storage and retrieval in HDF5 for Treants
http://datreant.org/
BSD 3-Clause "New" or "Revised" License

add python2/3 compatible pickling #23

Closed: kain88-de closed this 6 years ago

kain88-de commented 6 years ago

Fixes: #22

I'm not sure how to best test this though. I noticed from my own experience in the last two days that these settings seem to be pretty stable. We should also document this behavior somewhere.
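The settings themselves are not quoted in this thread. As a rough sketch only, a common way to make pickles readable from both Python 2 and Python 3 is to write with protocol 2 and, on Python 3, read with `encoding="latin1"`. The helper names `dump_compat` and `load_compat` are hypothetical and are not taken from the PR:

```python
import pickle
import sys

# Protocol 2 is the highest pickle protocol that Python 2 can read.
COMPAT_PROTOCOL = 2


def dump_compat(obj, path):
    """Write obj using a protocol that both Python 2 and 3 understand."""
    with open(path, "wb") as f:
        pickle.dump(obj, f, protocol=COMPAT_PROTOCOL)


def load_compat(path):
    """Read a pickle that may have been written by either major version."""
    with open(path, "rb") as f:
        if sys.version_info[0] >= 3:
            # 'latin1' lets Python 3 decode byte strings pickled by Python 2.
            return pickle.load(f, encoding="latin1")
        return pickle.load(f)
```

A round-trip test that dumps an object and loads it again in the same interpreter would at least cover both code paths, even though it cannot prove cross-version compatibility.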

codecov-io commented 6 years ago

Codecov Report

Merging #23 into develop will increase coverage by 0.27%. The diff coverage is 100%.


@@             Coverage Diff             @@
##           develop      #23      +/-   ##
===========================================
+ Coverage     70.7%   70.97%   +0.27%     
===========================================
  Files            9        9              
  Lines          314      317       +3     
  Branches        47       47              
===========================================
+ Hits           222      225       +3     
- Misses          75       76       +1     
+ Partials        17       16       -1
Impacted Files                 Coverage Δ
src/datreant/data/pydata.py    100% <100%> (ø)
src/datreant/data/pddata.py    78.78% <0%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 89f8937...6d133c6.

kain88-de commented 6 years ago

Full Python 2/3 compatibility will be hard to achieve for pickles, partly because of the internal changes from Python 2 to 3. One option would be to store the Python version a pickle was created with and issue a user warning when the pickle is read with a different Python version than the one that wrote it.

orbeckst commented 6 years ago

Is pickle support simply broken in Python, at least as far as 2/3 goes?

If datreant were able to make seamless pickle 2/3 work (e.g., by storing additional information, which would be easy in the datreant framework) then that would be very helpful and convenient. I would see this as an advantage of the datreant.data abstraction layer – it would take care of all the things that I don't want to deal with.

However, if you decide that there is no sane way to do it, then restrict pickling to just, say, version 3. And add a big fat warning that it is absolutely not recommended to pickle anything because pickle obviously sucks as a portable archival format.

Perhaps one could use JSON serialization instead?

kain88-de commented 6 years ago

> Is pickle support simply broken in Python, at least as far as 2/3 goes?

For custom types, to some extent, yes. But we can work around that a little bit.

> If datreant were able to make seamless pickle 2/3 work (e.g., by storing additional information, which would be easy in the datreant framework) then that would be very helpful and convenient. I would see this as an advantage of the datreant.data abstraction layer – it would take care of all the things that I don't want to deal with.

Yes, we can store extra data. The first thing I would like to store is the Python version used to write the pickle. Then we can add a warning if the Python versions don't match and reference docs that tell users what they can do to ensure they can still read the pickle.
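A minimal sketch of that idea, assuming the version is simply wrapped together with the data inside the pickle. The function names and the `python_version` key are hypothetical, and comparing only the major version is a choice made for this sketch; datreant.data may store the information elsewhere:

```python
import pickle
import sys
import warnings


def dumps_with_version(obj):
    """Pickle obj together with the (major, minor) version that wrote it."""
    payload = {
        "python_version": tuple(sys.version_info[:2]),
        "data": obj,
    }
    return pickle.dumps(payload, protocol=2)


def loads_with_version(blob, current=None):
    """Unpickle blob and warn if it was written by another major version.

    `current` defaults to the running interpreter; it can be overridden
    to simulate reading from a different Python version.
    """
    if current is None:
        current = tuple(sys.version_info[:2])
    payload = pickle.loads(blob)
    written = tuple(payload["python_version"])
    if written[0] != current[0]:
        warnings.warn(
            "pickle was written with Python %d.%d but is being read with "
            "Python %d.%d; custom types may not load correctly"
            % (written[0], written[1], current[0], current[1]),
            UserWarning,
        )
    return payload["data"]
```

The warning text is the natural place to point users at the documentation mentioned above.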

> However, if you decide that there is no sane way to do it, then restrict pickling to just, say, version 3. And add a big fat warning that it is absolutely not recommended to pickle anything because pickle obviously sucks as a portable archival format.

I don't want to do this. There are still a lot of people on Python 2 who should be able to use all of datreant.data. A user also can't really choose when we pickle.

> Perhaps one could use JSON serialization instead?

User-defined classes aren't JSON serializable by default. Pickle still seems to be a good option here.
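To illustrate the difference, here is a small sketch with a hypothetical `Point` class (not part of datreant). `json.dumps` raises `TypeError` for it unless a custom encoder is supplied, while pickle round-trips the instance without extra code:

```python
import json
import pickle


class Point(object):
    """Hypothetical user-defined class used only for this illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)

# json has no default encoding for arbitrary classes.
try:
    json.dumps(p)
    json_ok = True
except TypeError:
    json_ok = False

# pickle stores the instance's class reference and attribute dict.
restored = pickle.loads(pickle.dumps(p, protocol=2))
```

JSON would need a per-class encoder and decoder to get the same result, which is hard to provide for arbitrary user types.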