emanuele / convert_matlab73_hdf5

Convert Matlab v7.3 '.mat' files (i.e. HDF5 file format) into Python's pickle or numpy format.

Smarter string heuristic #1

Open pao opened 12 years ago

pao commented 12 years ago

Just a drive-by; I'm reverse engineering the MATLAB HDF5 myself. The uint16 character strings all appear to have the HDF5 attribute MATLAB_class = char, so you can probably key on that.
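As a rough illustration of keying on that attribute, here is a minimal sketch. The helper names `matlab_class` and `is_char_array` are hypothetical and not part of this repository. It assumes the attributes are available as a mapping, which is how h5py exposes `dataset.attrs`, and that the value may arrive as bytes or as str.

```python
def matlab_class(attrs):
    """Return the MATLAB_class attribute as a str, or None if it is absent.

    `attrs` is any mapping of HDF5 attribute names to values, for example
    the `.attrs` of an h5py dataset (assumption: h5py is used for reading).
    """
    value = attrs.get("MATLAB_class")
    if value is None:
        return None
    # The attribute is often returned as bytes, so normalise it to str.
    if isinstance(value, bytes):
        return value.decode("ascii")
    return str(value)


def is_char_array(attrs):
    """True if the attributes mark the dataset as a MATLAB char array."""
    return matlab_class(attrs) == "char"


# Possible usage with h5py (not executed here; the path and variable
# name are placeholders):
#   import h5py
#   with h5py.File("data.mat", "r") as f:
#       print(is_char_array(f["some_variable"].attrs))
```

Working on a plain mapping keeps the check easy to try out with a dict before pointing it at a real file.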

emanuele commented 12 years ago

Hi Pao,

Sorry for the late reply, and thank you for the hint. Please let me know if you make progress on the reverse engineering. As far as I can see, the problem is being robust enough for the various uses (and misuses? ;)) of that format by users.

Was this preliminary attempt of mine of any use for your case?

pao commented 12 years ago

Unfortunately no on both counts. I really couldn't justify spending any more time on it, and there's some crazy stuff going on with object serialization. The data is all there, but I couldn't connect the dots from the entry point in the data hierarchy.

jim-rafferty commented 10 years ago

Hi Emanuele and Pao,

I wrote up a tool to do what this project does before I was aware it existed, and I believe I have a solution for the string heuristic that works for the mat files I've tested it on:

If you check f.attrs.keys() for the presence of the string 'MATLAB_int_decode', that will tell you whether the variable is a string or not. If the variable is a genuine uint16, this key is not present at all.

Best,

Jim.
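The check Jim describes, plus turning the stored uint16 values back into text, might look roughly like the sketch below. The function names `looks_like_string` and `decode_utf16_codes` are hypothetical. The sketch assumes the characters are stored as UTF-16 code units and that the values have already been flattened into a one-dimensional sequence. The heuristic has only been reported to work on the files tested in this thread, so it is worth checking whether other variable types also carry the attribute before relying on it alone.

```python
import struct


def looks_like_string(attrs):
    """Heuristic from this thread: the key is present for strings and
    absent for plain uint16 data. `attrs` is any mapping of attribute
    names, such as the `.attrs` of an h5py dataset (assumption)."""
    return "MATLAB_int_decode" in attrs


def decode_utf16_codes(codes):
    """Convert a flat sequence of uint16 code units into a Python str.

    Assumes UTF-16 code units; multi-row char arrays would need extra
    handling of their layout, which is not attempted here.
    """
    codes = [int(c) for c in codes]
    # Pack the values as little-endian unsigned 16-bit integers,
    # then decode the resulting bytes as UTF-16.
    raw = struct.pack("<%dH" % len(codes), *codes)
    return raw.decode("utf-16-le")


# Possible usage with h5py (not executed here; names are placeholders):
#   dset = f["some_variable"]
#   if looks_like_string(dset.attrs):
#       text = decode_utf16_codes(dset[()].flatten())
```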

emanuele commented 10 years ago

Hi Jim,

Thanks a lot for the hint! I haven't worked on this project for a while. You are welcome to submit a pull request to fix the issue. If you don't, I'll have a look at the issue and try to fix it myself. By the way, do you have code from your tool that you could share?

Best,

Emanuele

jim-rafferty commented 10 years ago

No problem :) I've tried submitting a pull request, but I'm not 100% certain I've done it correctly as I am new to GitHub. In any case, I've committed the proposed change to a fork of your repo.

J.