JuliaIO / MAT.jl

Julia module for reading MATLAB files
MIT License
278 stars 71 forks source link

Updated for Julia v1.0 #101

Closed el-oso closed 5 years ago

el-oso commented 5 years ago

I updated MAT.jl for it to work with Julia v1.0. Have been using it for a few weeks and seems to be ok.

yuyichao commented 5 years ago

Dup of other ones.

el-oso commented 5 years ago

I understand, it's a duplicate. But vanilla MAT.jl doesn't work with v1.0. Therefore, please fix it or pull the fixes from someone else.

yuyichao commented 5 years ago

No one has a complete fix yet. And as I mentioned in https://github.com/JuliaIO/MAT.jl/pull/100 that is now merged as https://github.com/JuliaIO/MAT.jl/pull/102, I have no problem merging partial fix as long as they don't break anything else (and also don't introduce any additional copying).

el-oso commented 5 years ago

Well, I just pushed last changes to my repository clone in github. It now passes all tests now in 0.7 and 1.0.1. (see below)

If you are interested let me know and I will send you a pull request. There is a conflict with #100 due to style differences. But you can cherry pick if you like.

Regards, Jorge

Testing MAT Resolving package versions... Status /tmp/tmp7S1hCT/Manifest.toml [9e28174c] BinDeps v0.8.10 [b99e7846] BinaryProvider v0.5.2 [a74b3585] Blosc v0.5.1 [e1450e63] BufferedStreams v1.0.0 [631607c0] CMake v1.1.0 [d5fb7624] CMakeWrapper v0.2.2 [34da2185] Compat v1.3.0 [864edb3b] DataStructures v0.14.0 [f67ccb44] HDF5 v0.10.2 [0862f596] HTTPClient v0.2.1 [d9be37ee] Homebrew v0.7.0 [682c06a0] JSON v0.19.0 [b27032c2] LibCURL v0.4.1 [522f3ed2] LibExpat v0.5.0 [2ec943e9] Libz v1.0.0 [23992714] MAT v0.4.0+ [~/.julia/dev/MAT] [bac558e1] OrderedCollections v1.0.2 [d96e819e] Parameters v0.10.1 [30578b45] URIParser v0.4.0 [c17dfb99] WinRPM v0.4.2 [2a0f44e3] Base64 [@stdlib/Base64] [ade2ca70] Dates [@stdlib/Dates] [8bb1440f] DelimitedFiles [@stdlib/DelimitedFiles] [8ba89e20] Distributed [@stdlib/Distributed] [b77e0a4c] InteractiveUtils [@stdlib/InteractiveUtils] [76f85450] LibGit2 [@stdlib/LibGit2] [8f399da3] Libdl [@stdlib/Libdl] [37e2e46d] LinearAlgebra [@stdlib/LinearAlgebra] [56ddb016] Logging [@stdlib/Logging] [d6f4376e] Markdown [@stdlib/Markdown] [a63ad114] Mmap [@stdlib/Mmap] [44cfe95a] Pkg [@stdlib/Pkg] [de0858da] Printf [@stdlib/Printf] [3fa0cd96] REPL [@stdlib/REPL] [9a3f8284] Random [@stdlib/Random] [ea8e919c] SHA [@stdlib/SHA] [9e88b42a] Serialization [@stdlib/Serialization] [1a1011a3] SharedArrays [@stdlib/SharedArrays] [6462fe0b] Sockets [@stdlib/Sockets] [2f01184e] SparseArrays [@stdlib/SparseArrays] [10745b16] Statistics [@stdlib/Statistics] [8dfed614] Test [@stdlib/Test] [cf7118a7] UUIDs [@stdlib/UUIDs] [4ec0a83e] Unicode [@stdlib/Unicode] Checking reading v6 format Checking reading v7 format Checking reading v7.3 format Testing MAT tests passed

On Tue, Nov 6, 2018 at 11:35 PM Yichao Yu notifications@github.com wrote:

No one has a complete fix yet. And as I mentioned in #100 https://github.com/JuliaIO/MAT.jl/pull/100 that is now merged as #102 https://github.com/JuliaIO/MAT.jl/pull/102, I have no problem merging partial fix as long as they don't break anything else (and also don't introduce any additional copying).

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/JuliaIO/MAT.jl/pull/101#issuecomment-436434308, or mute the thread https://github.com/notifications/unsubscribe-auth/AOki7w0KuIHuvQmu9ucTPoynlL3zu1EFks5usg6lgaJpZM4YPVCK .

yuyichao commented 5 years ago

It has all the same issues as https://github.com/JuliaIO/MAT.jl/pull/86 .

el-oso commented 5 years ago

Hmmm... Forgive me but I don't see the resemblance. All the re-interpreted arrays were converted to Array. I didn't use unsafe_wrap, change it to codeunits. I didn't touch reference data except to update it to newer functions like speye to SparseCSC constructor.

On Thu, Nov 8, 2018 at 10:58 PM Yichao Yu notifications@github.com wrote:

It has all the same issues as #86 https://github.com/JuliaIO/MAT.jl/pull/86 .

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/JuliaIO/MAT.jl/pull/101#issuecomment-437171152, or mute the thread https://github.com/notifications/unsubscribe-auth/AOki7zAaEBhOpdAKBDjzFnoDhWXQCnZAks5utKkbgaJpZM4YPVCK .

yuyichao commented 5 years ago

The absense of reinterpreted array (edit: I mean the way you get rid of them) is exactly the issue. It's the copying that I'm talking about and what's what I mentioned above too.

el-oso commented 5 years ago

ok, I see what you mean now. The performance is highly impacted because convert() makes a copy (see below)

julia> @btime reinterpret(UInt8, x);
  57.207 ns (1 allocation: 32 bytes)

julia> @btime convert(Array, reinterpret(UInt8, x));
  1.325 μs (2 allocations: 208 bytes)

And it's not a small hit, in this example is an order of magnitude ( ~23X ).

Although, I disagree with your categorical denial of accepting any changes for 1.0.x, now I understand your concern. Seems that you know more about this, but will see if I can fix it.

I think that a slow package is better than a broken package though.

yuyichao commented 5 years ago

I think that a slow package is better than a broken package though.

I can accept a guarantee that the slowness to be fixed within a certain deadline together with a list of places to fix. If the deadline is not met then the change will be reverted and there shouldn't be a release before the fix. Otherwise, I don't think it'll ever happen.

your categorical denial of accepting any changes for 1.0.x

And no, I never said that. If your change doesn't break 0.6 and doesn't cause extra copy, go ahead and submit it and I have no problem merging it. That's exactly why I merged https://github.com/JuliaIO/MAT.jl/pull/100 (The changes I requested were not fully fixed so I even fixed them myself after waiting for a week.)

el-oso commented 5 years ago

Ok, after some digging I think that the culprit (mainly) is that reinterpret() was changed in v1.0 and returns a new type ReinterpretArray which is an AbstractArray, which is a View of the array. Solution spaces that I can think of are the following:

  1. Change the reference data such that I has the same type as Julia version (Note: I read from another thread that you are not in favor of such direction, but I have to mention it)
  2. Make our own reinterpret() function that return Array types instead of using the generic function. This might deviate further in future versions of Julia.
  3. Create our own array Type and redefine reinterpret()such that it returns the same Type always, e.g. MATArray or MXArray or whatever your prefer, regardless of Julia version, not sure if we can use the generic reinterpret() from Julia, but probably we can. This also needs an update of the reference data, but at least will be consistent.
  4. Another way is to force it to be an Array. It incurs a penalty, but is not as severe as with convert()
@btime Array(reinterpret(UInt8,x));
  497.876 ns (2 allocations: 208 bytes)

I am in favor of solution 3 but before I write anything I wanted to check up with you if you agree upon this and/or check what are your suggestions.

briochemc commented 5 years ago

I'm sure I am not getting the whole picture here, but I'd vote for @el-oso's solution 2. I don't really understand what solution 1 is, and solution 3 introducing a new type might add an extra layer of work for the package users. (When I load a MATLAB array into Julia using MAT, I would personally prefer it to be an Array right away, rather than it being a MATArray that I may need to convert to an Array before I can use it in my Julia code. But maybe that's just me?)

yuyichao commented 5 years ago

2 is imposible.