conda-forge / julia-feedstock

A conda-smithy repository for julia.
BSD 3-Clause "New" or "Revised" License

Adding osx-arm support v3 #252

Open MilesCranmer opened 1 year ago

MilesCranmer commented 1 year ago

This builds on #224, with a couple of patches added to prevent infinite precompilation and to output more debugging info.

conda-forge-webservices[bot] commented 1 year ago

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

MilesCranmer commented 1 year ago

@ngam @mkitti you both should be able to edit this at will.

mkitti commented 1 year ago

It's still on the 1.9 branch: https://github.com/JuliaLang/julia/blob/a7348b7aa9d99af5ce5a8314f58a690132f21fb9/stdlib/Profile/src/Profile.jl#L79

MilesCranmer commented 1 year ago

Yes, I added a local patch to this PR to fix it for 1.9

mkitti commented 1 year ago

Upstream removal of Precompile from the system image: https://github.com/JuliaLang/julia/pull/49132

mkitti commented 1 year ago

We might need to bring LD_LIBRARY_PATH back here to point the linker to the right binaries.

MilesCranmer commented 1 year ago

It's weird. It's still stuck on the Profile.@profile step for me, even though it is a fixed loop now...

Feel free to use this branch as your own, by the way. I only opened a new PR so that I could push to it too, since I don't have access otherwise.

MilesCranmer commented 1 year ago

Okay I backported https://github.com/JuliaLang/julia/pull/49132

MilesCranmer commented 1 year ago

Hm, I think the disable-testing-Baz and move-out-Profile patches are incompatible...

MilesCranmer commented 1 year ago

Manually backported; patch applies now.

MilesCranmer commented 1 year ago

@ngam if you run it now, the precompilation finishes.

But I seem to have hit some other infinite loop when compiling ryu/utils.jl. It went away on its own...

MilesCranmer commented 1 year ago

Additional backports needed (?)

mkitti commented 1 year ago

Just to be clear, CI is not going to work here unless we have an aarch64 emulator. Even if we can fight through all the library linkage issues, the ultimate problem is that we have no way to retarget Julia's code generation at the end of the day.

The best we can do is get Julia to build locally on a Mx Mac and upload the package.

The other option is to use juliaup.

MilesCranmer commented 1 year ago

> Just to be clear, CI is not going to work here unless we have an aarch64 emulator.

Oh, I see. Could we use QEMU+Docker?

> The best we can do is get Julia to build locally on a Mx Mac and upload the package.

I see. How hard is that to configure?

> The other option is to use juliaup.

I didn't even know this was an option. I thought building from source was a requirement of using conda-forge? If we can use juliaup, my vote would be for that. It would also give us Windows compatibility for free.

ngam commented 1 year ago

We can petition to upload the artifacts manually. That’s okay. If this does indeed finish building and we are happy with its state, we can declare victory and we will have osx-arm builds on anaconda.org (pending approval from core). I will test tonight.

ngam commented 1 year ago

@MilesCranmer what’s the state of this on your local M1/M2 machine? All good for your use cases?

mkitti commented 1 year ago

> Oh, I see. Could we use QEMU+Docker?

https://apple.stackexchange.com/questions/420103/emulating-arm-m1-like-macos-on-an-x86-intel-mac

MilesCranmer commented 1 year ago

I still haven't finished a build. It stops at various points: sometimes it freezes at ryu/utils.jl, other times it gives me this error:

ryu/utils.jl
julia(57725,0x20d817a80) malloc: Heap corruption detected, free list is damaged at 0x600000c9f8e0
*** Incorrect guard value: 13448921666235138048
julia(57725,0x20d817a80) malloc: *** set a breakpoint in malloc_error_break to debug

[57725] signal (6): Abort trap: 6
in expression starting at ryu/utils.jl:338

Another time I got through it, but it stopped somewhere else (I forget where).

mkitti commented 1 year ago

@MilesCranmer Could you try the canonical build just to see if that succeeds first?

  1. Download https://github.com/JuliaLang/julia/releases/download/v1.9.0/julia-1.9.0.tar.gz
  2. tar xvzf julia-1.9.0.tar.gz && cd julia-1.9.0
  3. make -j8
  4. ./julia

MilesCranmer commented 1 year ago

I haven't had issues before when building directly (including 1.9.0).

If it is a clue, when it freezes at ryu/utils.jl, and I quit, it says it is at the line ryu/utils.jl:338, which on 1.9.0 is:

const POW10_OFFSET_2, MIN_BLOCK_2, POW10_SPLIT_2 = generateinversetables()

It looks like generateinversetables is hitting an infinite while loop here:

function generateinversetables()
    POW10_OFFSET_2 = Vector{UInt16}(undef, 68 + 1)
    MIN_BLOCK_2 = fill(0xff, 68 + 1)
    POW10_SPLIT_2 = Tuple{UInt64, UInt64, UInt64}[]
    lowerCutoff = big(1) << (54 + 8)
    for idx = 0:67
        POW10_OFFSET_2[idx + 1] = length(POW10_SPLIT_2)
        i = 0
        while true
            v = ((big(10)^(9 * (i + 1)) >> (-(120 - 16 * idx))) % (big(10)^9) << (120 + 16))
            if MIN_BLOCK_2[idx + 1] == 0xff && ((v * lowerCutoff) >> 128) == 0
                i += 1
                continue
            end
            if MIN_BLOCK_2[idx + 1] == 0xff
                MIN_BLOCK_2[idx + 1] = i
            end
            v == 0 && break
            push!(POW10_SPLIT_2, ((v & BIG_MASK) % UInt64, ((v >> 64) & BIG_MASK) % UInt64, ((v >> 128) & BIG_MASK) % UInt64))
            i += 1
        end
    end
    POW10_OFFSET_2[end] = length(POW10_SPLIT_2)
    MIN_BLOCK_2[end] = 0x00

    return POW10_OFFSET_2, MIN_BLOCK_2, POW10_SPLIT_2
end

No clue why. Maybe it's some MPFR bug on macOS?? Are we using the latest MPFR?

mkitti commented 1 year ago

Julia 1.9.0 is expecting mpfr version 4.1.1: https://github.com/JuliaLang/julia/blob/v1.9.0/deps/mpfr.version

conda-forge is supplying MPFR 4.2.0.
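
One way to catch this kind of mismatch early is to compare the version recorded in Julia's deps/mpfr.version with the version conda resolved. The sketch below assumes the file contains a line of the form `MPFR_VER := 4.1.1`; both function names are hypothetical and are not part of the feedstock.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the MPFR version a Julia source tree expects.
# Assumption: the version file has a line like "MPFR_VER := 4.1.1".
expected_mpfr_version() {
    local version_file="$1"
    sed -n 's/^MPFR_VER[[:space:]]*:=[[:space:]]*\([0-9][0-9.]*\).*$/\1/p' "$version_file" | head -n 1
}

# Hypothetical helper: succeed only when the two version strings are identical.
mpfr_versions_match() {
    local expected="$1" installed="$2"
    [ "$expected" = "$installed" ]
}
```

For example, the output of `expected_mpfr_version julia-1.9.0/deps/mpfr.version` could be compared against the mpfr row that `conda list mpfr` prints for the build environment.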

MilesCranmer commented 1 year ago

Thanks.

Weirdly even after I set it to 4.1.1 in the .ci_support/<OS>_.yaml files, I see:

    metis:             5.1.0-h9f76cd9_1006        conda-forge
    mpfr:              4.2.0-he09a6ba_0           conda-forge
    ncurses:           6.3-h07bb92c_1             conda-forge

when running conda-build. I'm not sure why. Do I need to fix the version in meta.yaml too?

MilesCranmer commented 1 year ago

Getting a headache from these build errors, so I'm going to step away for a bit... feel free to push to my branch!

ngam commented 1 year ago

@MilesCranmer no rush! Some quick feedback: we mostly only edit meta.yaml and build.sh in these feedstocks. If you're editing something else, you're likely doing something way too advanced. Hence, your mpfr edit didn't go through. I will push an edit to fix it for you.

I will try to trigger a build locally once I edit the PR.

ngam commented 1 year ago

@conda-forge-admin, please rerender.

ngam commented 1 year ago

Update: This is the farthest I have ever gotten with this build locally.

Precompilation complete. Summary:
Generation ──  52.788396 seconds 75.4387%
Execution ───  17.186865 seconds 24.5613%
Total ───────  69.975261 seconds
Outputting sysimage file...

github-actions[bot] commented 1 year ago

Hi! This is the friendly automated conda-forge-webservice.

I tried to rerender for you, but it looks like there was nothing to do.

This message was generated by GitHub actions workflow run https://github.com/conda-forge/julia-feedstock/actions/runs/5151721471.

ngam commented 1 year ago

Congratulations everyone, Julia builds fine natively with conda-forge.

Sadly, the build is broken for now.

+ julia -e 'using InteractiveUtils; InteractiveUtils.versioninfo()'
.../conda-bld/julia_1685678236219/test_tmp/run_test.sh: line 7: 56566 Killed: 9               julia -e 'using InteractiveUtils; InteractiveUtils.versioninfo()'

ngam commented 1 year ago

The broken Julia osx-arm build is available at https://anaconda.org/ngam/julia if you'd like to test it (conda install Julia -c ngam -c conda-forge)

ngam commented 1 year ago

Okay, actually good news!

The only issue we have is a signature issue. See https://github.com/JuliaLang/julia/issues/44502. Locally, I fixed it with codesign --force -s - <path-to-libjulia>, which in my case is codesign --force -s - $CONDA_PREFIX/lib/libjulia.1.dylib
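
A minimal sketch of how that ad-hoc signing step might be wrapped so that it only runs for osx-arm64 targets. The function name, the DRY_RUN switch, and passing the platform as an argument are assumptions made for illustration; only the codesign invocation itself comes from the fix described here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: ad-hoc sign a library, but only for osx-arm64 targets.
# Set DRY_RUN=1 to print the command instead of running it.
adhoc_sign_for_target() {
    local target_platform="$1" library="$2"
    if [ "$target_platform" != "osx-arm64" ]; then
        return 0    # the problem in this thread was only reported on osx-arm64
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "/usr/bin/codesign --force -s - $library"
    else
        /usr/bin/codesign --force -s - "$library"
    fi
}
```

Called as `DRY_RUN=1 adhoc_sign_for_target osx-arm64 "$CONDA_PREFIX/lib/libjulia.1.dylib"`, it only prints the command, which makes it easy to check the path before signing anything.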

ngam commented 1 year ago

(test) ~$ julia -v
julia version 1.9.0
(test) ~$ julia   
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.9.0 (2023-05-07)
 _/ |\__'_|_|_|\__'_|  |  https://github.com/conda-forge/julia-feedstock
|__/                   |

julia> 
(test) ~$ julia -e 'using InteractiveUtils; InteractiveUtils.versioninfo()'
Julia Version 1.9.0
Commit 8e63055292* (2023-05-07 11:25 UTC)
Platform Info:
  OS: macOS (arm64-apple-darwin20.0.0)
  CPU: 10 × Apple M1 Max
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, apple-m1)
  Threads: 1 on 8 virtual cores
Environment:
  JULIA_DEPOT_PATH_BACKUP = 
  JULIA_PROJECT_BACKUP = 
  JULIA_LOAD_PATH_BACKUP = 
  JULIA_DEPOT_PATH = /Users/ngam/.micromamba/envs/test/share/julia:
  JULIA_PROJECT = @test
  JULIA_LOAD_PATH = @:@test:@stdlib
  JULIA_SSL_CA_ROOTS_PATH_BACKUP = 
  JULIA_CONDAPKG_BACKEND_BACKUP = 
  JULIA_CONDAPKG_BACKEND = System
  JULIA_CONDAPKG_EXE_BACKUP = 
  JULIA_CONDAPKG_EXE = 
(test) ~$

ngam commented 1 year ago

We likely need something like this to fix it:

if [[ "${target_platform}" == "osx-arm64" ]]; then
    /usr/bin/codesign -s - -f "${OUTPUT}"
fi

ngam commented 1 year ago

I will try to clean up this PR over the weekend and see what we really need to do to get osx-arm working. There are a lot of pieces floating around in this PR. We are very close. Good job, @mkitti and @MilesCranmer, for being patient and persistent!

ngam commented 1 year ago

@MilesCranmer if you could do us a favor and test the "broken" build I have with a real application, that would be great. Here are the steps on osx-arm:

  1. create a clean new environment (conda create -n test) and activate it (conda activate test)
  2. install Julia from ngam's channel, conda install Julia -c ngam -c conda-forge
  3. codesign --force -s - $CONDA_PREFIX/lib/libjulia.1.dylib
  4. test

This build passes the random tests I tried, so I think it's all good.

MilesCranmer commented 1 year ago

Hm, it seems to get killed for me: (See next comment)

> julia
[1]    82082 killed     julia

More info:

> otool -L /Users/mcranmer/mambaforge/envs/julia/bin/julia
/Users/mcranmer/mambaforge/envs/julia/bin/julia:
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1319.100.3)
        @rpath/libjulia.1.dylib (compatibility version 1.0.0, current version 1.9.0)

> nm /Users/mcranmer/mambaforge/envs/julia/bin/julia
0000000100008010 d __dyld_private
0000000100000000 T __mh_execute_header
                 U _exit
                 U _jl_load_repl
0000000100003f60 T _main
                 U dyld_stub_binder

MilesCranmer commented 1 year ago

Wait, I was being dumb, sorry. I had uninstalled and reinstalled and forgot to re-run the codesign. It works for me now!

> julia       
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.9.0 (2023-05-07)
 _/ |\__'_|_|_|\__'_|  |  https://github.com/conda-forge/julia-feedstock
|__/                   |

julia> versioninfo()
Julia Version 1.9.0
Commit 8e63055292* (2023-05-07 11:25 UTC)
Platform Info:
  OS: macOS (arm64-apple-darwin20.0.0)
  CPU: 8 × Apple M1 Pro
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, apple-m1)
  Threads: 6 on 6 virtual cores
Environment:
  JULIA_NUM_THREADS = auto
  JULIA_PKG_PRECOMPILE_AUTO = 0
  JULIA_DEPOT_PATH_BACKUP = 
  JULIA_PROJECT_BACKUP = 
  JULIA_LOAD_PATH_BACKUP = 
  JULIA_DEPOT_PATH = /Users/mcranmer/mambaforge/envs/julia/share/julia:
  JULIA_PROJECT = @julia
  JULIA_LOAD_PATH = @:@julia:@stdlib
  JULIA_SSL_CA_ROOTS_PATH_BACKUP = 
  JULIA_CONDAPKG_BACKEND_BACKUP = 
  JULIA_CONDAPKG_BACKEND = System
  JULIA_CONDAPKG_EXE_BACKUP = 
  JULIA_CONDAPKG_EXE = /Users/mcranmer/mambaforge/bin/conda

MilesCranmer commented 1 year ago

I'm running the integration tests of SymbolicRegression.jl which do a bunch of heavy multiprocessing. Should be a good test if there are any segfaults lurking somewhere.

mkitti commented 1 year ago

Do Base.runtests() if you have time.

MilesCranmer commented 1 year ago

Tests are still running for SymbolicRegression.jl; I wonder if some library got installed as the Rosetta version (I noticed openspecfun doesn't have an aarch64 version). Currently I see this:

[83194] signal (10.1): Bus error: 10
in expression starting at /Users/mcranmer/.julia/packages/SymbolicRegression/Y57Eu/test/test_tree_construction.jl:11

I'm not sure if it's fatal yet.

MilesCranmer commented 1 year ago

I'm seeing some Bus errors on the LinearAlgebra.jl tests as well:

LinearAlgebra/dense                              (4) |        started at 2023-06-02T14:40:27.081
      From worker 4:
      From worker 4:    [91499] signal (10.1): Bus error: 10
      From worker 4:    in expression starting at /Users/mcranmer/mambaforge/envs/julia/share/julia/stdlib/v1.9/LinearAlgebra/test/dense.jl:273

(This is the "Tests norms" test set)

mkitti commented 1 year ago

I wonder whether this is reproducible if we run those tests in isolation. My guess is that it might only occur when they are run as part of the larger test suite.