datajoint / mym

MySQL API for MATLAB with support for BLOB objects

added 32bitflag for data inserted from old myms #88

Closed (Alvalunasan closed this 2 years ago)

Alvalunasan commented 2 years ago

Addition of 32bit flag and functions for old mym support

dimitri-yatsenko commented 2 years ago

I believe that the current version of mym will serialize blobs the same way whether or not the platform is 64-bit or 32-bit.

Do you know which versions of mym were sensitive to the architecture?

Alvalunasan commented 2 years ago

> I believe that the current version of mym will serialize blobs the same way whether or not the platform is 64-bit or 32-bit.
>
> Do you know which versions of mym were sensitive to the architecture?

I know Brodylab has been using mym since before 64-bit MATLAB existed. I don't know about specific versions, but I think they maintain a 32-bit-only version with a few additions.

dimitri-yatsenko commented 2 years ago

Discussed this further. mym serializes data differently depending on whether it was compiled by a 64-bit or a 32-bit compiler. Things like array dimensions and blob lengths are encoded as default integers, whose size differs between the compilers. We haven't run into this issue because we have been using 64-bit compiled versions all along. The Python version of the blob serialization does not have this issue because we explicitly specify the size of each integer.
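To make the width problem concrete, here is a minimal Python sketch. It assumes a simplified layout (a dimension count followed by each dimension), which is an illustration only and not mym's actual blob format; `encode_dims` is a hypothetical helper. It shows how the same dimensions produce different bytes at 32 and 64 bits, and how reading one layout with the other width yields nonsense.

```python
import struct


def encode_dims(dims, int_format):
    """Pack the dimension count and each dimension using one integer width.

    Hypothetical helper with a simplified layout; not mym's real format.
    """
    # "<" forces little-endian with no padding, so the width is explicit.
    return struct.pack(f"<{len(dims) + 1}{int_format}", len(dims), *dims)


dims = (3, 4)
as_32bit = encode_dims(dims, "I")  # 4-byte unsigned integers
as_64bit = encode_dims(dims, "Q")  # 8-byte unsigned integers

print(len(as_32bit), len(as_64bit))  # 12 24

# Reading the 32-bit layout as a 64-bit field merges two neighbouring values.
misread = struct.unpack("<Q", as_32bit[:8])[0]
print(misread)  # 12884901890, i.e. 2 + 3 * 2**32, instead of 2
```

Fixing the width in the format string, as the Python serializer is described as doing, removes the dependence on the compiler.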

So the question arises of how to deal with blobs encoded by the 32-bit compiled mym. In DataJoint Python, we added a switch that enables 32-bit decoding of blobs: https://github.com/datajoint/datajoint-python/pull/995 The switch is never used for encoding.
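A decode-only switch of this kind could look roughly like the sketch below. The names `USE_32BIT_DIMS`, `pack_dims`, and `unpack_dims` are hypothetical and the layout is simplified; this is not the code from the linked pull request, only an illustration of the idea that the flag changes how bytes are read and never how they are written.

```python
import struct

# Hypothetical module-level switch; it affects decoding only.
USE_32BIT_DIMS = False


def pack_dims(dims):
    """Always encode with explicit 64-bit integers, whatever the switch says."""
    return struct.pack(f"<{len(dims) + 1}Q", len(dims), *dims)


def unpack_dims(buffer, use_32bit=None):
    """Decode dimensions, optionally assuming the legacy 32-bit width."""
    if use_32bit is None:
        use_32bit = USE_32BIT_DIMS
    fmt, width = ("I", 4) if use_32bit else ("Q", 8)
    # The first integer is the number of dimensions that follow.
    (ndims,) = struct.unpack_from(f"<{fmt}", buffer, 0)
    return struct.unpack_from(f"<{ndims}{fmt}", buffer, width)


# A blob header as a legacy 32-bit writer might have produced it (assumed layout).
legacy = struct.pack("<3I", 2, 3, 4)

print(unpack_dims(legacy, use_32bit=True))  # (3, 4)
print(unpack_dims(pack_dims((3, 4))))       # (3, 4)
```

Keeping the encoder fixed at 64 bits means newly written blobs stay in a single format, while old 32-bit blobs remain readable when the user opts in.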

Alvalunasan commented 2 years ago

I think the solution should be similar to the python one. The user would need to set a 32bitflag variable and:

dimitri-yatsenko commented 2 years ago

FYI: Depending on the version of mym, some integers may be explicitly set to 64 bits regardless of the compiler. In DataJoint's current version of mym, I think there is only one spot where sizeof(int) is used for encoding. The rest of the code has been switched to explicit bit sizes, sizeof(_uint64).

Alvalunasan commented 2 years ago

Yes, I think the origin of the problem is that the Brodylab mym version mimics the old array dimension encoding (with 32 bits) so that blobs keep being stored "the same way" they were before 64-bit MATLAB existed. So this mym version has diverged from "official" mym in that regard. They explain the situation here: https://github.com/Brody-Lab/mym-debugging#what-was-the-problem-and-whats-the-solution Also, in case the Brodylab mym code needs to be checked: https://github.com/Brody-Lab/ExperPort/blob/master/MySQLUtility/mym.cpp

guzman-raphael commented 2 years ago

@Alvalunasan Thanks for sharing this. Now that we've merged and released #90 (2.8.5), I think we can safely say that it has been addressed. Please see here for an example of how you can use the feature.

Closing this but feel free to file a new issue if you experience further issues.