JuliaGeodynamics / LaMEM.jl

Julia interface to LaMEM (Lithosphere and Mantle Evolution Model)
GNU General Public License v3.0

fail to test LaMEM.jl on mac #56

Closed (wenrongcao closed 2 months ago)

wenrongcao commented 2 months ago

Hi Boris, after running LaMEM.jl 0.3.4 without any problems in teaching and research last month (the last run was on AWS on April 12), I ran into an issue today: the LaMEM.jl tests fail on my MacBook. My student, who uses Linux (installed on top of Windows, also with LaMEM.jl 0.3.4), told me that everything works normally for him and that he can run on 8 cores without issues.

Could you take a look when you have time? Thanks very much!

I am using Julia 1.10, LaMEM 0.3.4, GeophysicalModelGenerator 0.7.1, and a MacBook Pro with an Intel chip (I also tested a MacBook with an M2 chip and got the same error). The FallingBlock_DirectSolver test gives the following error:

================================= STEP 1 =================================
--------------------------------------------------------------------------
Current time        : 0.00000000 [ ] 
Tentative time step : 10.00000000 [ ] 
--------------------------------------------------------------------------
  0 SNES Function norm 7.310266752999e+01 
  0 PICARD ||F||/||F0||=1.000000e+00 
  ** Instance Error 1 in DMUMPS_F77           0
application called MPI_Abort(MPI_COMM_WORLD, -99) - process 0
run LaMEM: Error During Test at /Users/wenrongcao/.julia/packages/LaMEM/bw6yg/test/runLaMEM.jl:4
  Got exception outside of a @test
  failed process: Process(setenv(`/Users/wenrongcao/.julia/artifacts/93cc0370456d787d4312cc7b29098924b8ffbef9/bin/mpiexec -n 2 /Users/wenrongcao/.julia/artifacts/76a58b702968ecc4d175e67ffb81234506ee61a9/bin/LaMEM -ParamFile /Users/wenrongcao/.julia/packages/LaMEM/bw6yg/test/input_files/FallingBlock_DirectSolver.dat '-nstep_max 5'`,["VECLIB_MAXIMUM_THREADS=1", "OMP_NUM_THREADS=1", "DYLD_FALLBACK_LIBRARY_PATH=/Applications/Julia-1.10.app/Contents/Resources/julia/lib/julia:/Users/wenrongcao/.julia/artifacts/023157501199a753608d8f4adf38a1147a2ad00e/lib:/Users/wenrongcao/.julia/artifacts/085281efe66fd732117b891040ff6132937ea29b/lib:/Users/wenrongcao/.julia/artifacts/085281efe66fd732117b891040ff6132937ea29b/lib/metis/metis_Int32_Real64/lib:/Users/wenrongcao/.julia/artifacts/085281efe66fd732117b891040ff6132937ea29b/lib/metis/metis_Int64_Real32/lib:/Users/wenrongcao/.julia/artifacts/085281efe66fd732117b891040ff6132937ea29b/lib/metis/metis_Int64_Real64/lib:/Users/wenrongcao/.julia/artifacts/93cc0370456d787d4312cc7b29098924b8ffbef9/lib:/Users/wenrongcao/.julia/artifacts/0a1cd9a580e8512726310b688d28da27ddbbfb14/lib:/Users/wenrongcao/.julia/artifacts/ffdee4f2c5c1a970450976825cd6df5b97916b5d/lib:/Users/wenrongcao/.julia/artifacts/5ee266c77972e985adc3ed40e62e00a3e058ab5a/lib:/Users/wenrongcao/.julia/artifacts/0233bb40b298b03aa3743cc339b4a5c6816ce583/lib:/Users/wenrongcao/.julia/artifacts/420fc8fcf6f318e7c8ea117a4e462931d7192a97/lib:/Users/wenrongcao/.julia/artifacts/6c1504d3361ef1e0869478537aea89031a2565fb/lib:/Users/wenrongcao/.julia/artifacts/c51b54c5cf307066eb61f3bca3a2c9158488955a/lib:/Users/wenrongcao/.julia/artifacts/e078c73a851da78baff82236c028f23f7a364cc4/lib:/Users/wenrongcao/.julia/artifacts/b9cd98702104db1408772a74c514fc9466ab86e4/lib:/Users/wenrongcao/.julia/artifacts/8e71fdc5590196691aedbfee700bc6f49297f160/lib/petsc/double_real_Int64/lib:/Users/wenrongcao/.julia/artifacts/8e71fdc5590196691aedbfee700bc6f49297f160/lib/petsc/single_complex_Int32/lib:/Users/wenrongcao/.
julia/artifacts/8e71fdc5590196691aedbfee700bc6f49297f160/lib/petsc/single_complex_Int64/lib:/Users/wenrongcao/.julia/artifacts/8e71fdc5590196691aedbfee700bc6f49297f160/lib/petsc/single_real_Int32/lib:/Users/wenrongcao/.julia/artifacts/8e71fdc5590196691aedbfee700bc6f49297f160/lib/petsc/single_real_Int64/lib:/Users/wenrongcao/.julia/artifacts/8e71fdc5590196691aedbfee700bc6f49297f160/lib/petsc/double_complex_Int32/lib:/Users/wenrongcao/.julia/artifacts/8e71fdc5590196691aedbfee700bc6f49297f160/lib/petsc/double_complex_Int64/lib:/Users/wenrongcao/.julia/artifacts/8e71fdc5590196691aedbfee700bc6f49297f160/lib/petsc/double_real_Int32/lib:/Users/wenrongcao/.julia/artifacts/8e71fdc5590196691aedbfee700bc6f49297f160/lib/petsc/double_real_Int64_deb/lib:/Users/wenrongcao/.julia/artifacts/76a58b702968ecc4d175e67ffb81234506ee61a9/lib:/Applications/Julia-1.10.app/Contents/Resources/julia/bin/../lib/julia:/Applications/Julia-1.10.app/Contents/Resources/julia/bin/../lib"]), ProcessExited(157)) [157]

I also tested TM_Subduction_example.jl. It fails with both 1 core and multiple cores, but the errors appear to differ.

Error when using 1 core:

--------------------------------------------------------------------------
============================== INITIAL GUESS =============================
--------------------------------------------------------------------------
  0 SNES Function norm 7.682568809981e+00 
  0 PICARD ||F||/||F0||=1.000000e+00 
At line 5652 of file dana_driver.F (unit = 10)
Fortran runtime error: Cannot open file '': No such file or directory
ERROR: failed process: Process(setenv(`/Users/wenrongcao/.julia/artifacts/76a58b702968ecc4d175e67ffb81234506ee61a9/bin/LaMEM -ParamFile output.dat ''`,["XPC_FLAGS=0x0", "COMMAND_MODE=unix2003", "PATH=/Users/wenrongcao/.julia/artifacts/93cc0370456d787d4312cc7b29098924b8ffbef9/bin:/Users/wenrongcao/.julia/artifacts/5ee266c77972e985adc3ed40e62e00a3e058ab5a/bin:/Users/wenrongcao/.julia/artifacts/420fc8fcf6f318e7c8ea117a4e462931d7192a97/bin:/Users/wenrongcao/.julia/artifacts/8e71fdc5590196691aedbfee700bc6f49297f160/bin:/Users/wenrongcao/.julia/artifacts/76a58b702968ecc4d175e67ffb81234506ee61a9/bin:/usr/bin:/bin:/usr/sbin:/sbin", "PWD=/Users/wenrongcao/Documents/julia_codes", "ZES_ENABLE_SYSMAN=1", "XPC_SERVICE_NAME=application.com.microsoft.VSCode.91155536.91155542", "TERM_PROGRAM=vscode", "VSCODE_GIT_ASKPASS_NODE=/Applications/Visual Studio Code.app/Contents/Frameworks/Code Helper (Plugin).app/Contents/MacOS/Code Helper (Plugin)", "SHELL=/bin/bash", "VSCODE_GIT_ASKPASS_MAIN=/Applications/Visual Studio Code.app/Contents/Resources/app/extensions/git/dist/askpass-main.js"  …  "FONTCONFIG_FILE=/Users/wenrongcao/.julia/artifacts/2b17f8eb5c0167b92ee1ef185a98606e9d27b75e/etc/fonts/fonts.conf", "OPENBLAS_DEFAULT_NUM_THREADS=1", "USER=wenrongcao", "JULIA_EDITOR=code", "HOME=/Users/wenrongcao", "TERM=xterm-256color", "TERM_PROGRAM_VERSION=1.89.1", "JULIA_NUM_THREADS=", "COLORTERM=truecolor", "OPENBLAS_MAIN_FREE=1"]), ProcessExited(2)) [2]

Error when using 2 cores:

ERROR: failed process: Process(setenv(`/Users/wenrongcao/.julia/artifacts/93cc0370456d787d4312cc7b29098924b8ffbef9/bin/mpiexec -n 2 /Users/wenrongcao/.julia/artifacts/76a58b702968ecc4d175e67ffb81234506ee61a9/bin/LaMEM -ParamFile output.dat ''`,["VECLIB_MAXIMUM_THREADS=1", "OMP_NUM_THREADS=1", "DYLD_FALLBACK_LIBRARY_PATH=/Applications/Julia-1.10.app/Contents/Resources/julia/lib/julia:/Users/wenrongcao/.julia/artifacts/023157501199a753608d8f4adf38a1147a2ad00e/lib:/Users/wenrongcao/.julia/artifacts/085281efe66fd732117b891040ff6132937ea29b/lib:/Users/wenrongcao/.julia/artifacts/085281efe66fd732117b891040ff6132937ea29b/lib/metis/metis_Int32_Real64/lib:/Users/wenrongcao/.julia/artifacts/085281efe66fd732117b891040ff6132937ea29b/lib/metis/metis_Int64_Real32/lib:/Users/wenrongcao/.julia/artifacts/085281efe66fd732117b891040ff6132937ea29b/lib/metis/metis_Int64_Real64/lib:/Users/wenrongcao/.julia/artifacts/93cc0370456d787d4312cc7b29098924b8ffbef9/lib:/Users/wenrongcao/.julia/artifacts/0a1cd9a580e8512726310b688d28da27ddbbfb14/lib:/Users/wenrongcao/.julia/artifacts/ffdee4f2c5c1a970450976825cd6df5b97916b5d/lib:/Users/wenrongcao/.julia/artifacts/5ee266c77972e985adc3ed40e62e00a3e058ab5a/lib:/Users/wenrongcao/.julia/artifacts/0233bb40b298b03aa3743cc339b4a5c6816ce583/lib:/Users/wenrongcao/.julia/artifacts/420fc8fcf6f318e7c8ea117a4e462931d7192a97/lib:/Users/wenrongcao/.julia/artifacts/6c1504d3361ef1e0869478537aea89031a2565fb/lib:/Users/wenrongcao/.julia/artifacts/c51b54c5cf307066eb61f3bca3a2c9158488955a/lib:/Users/wenrongcao/.julia/artifacts/e078c73a851da78baff82236c028f23f7a364cc4/lib:/Users/wenrongcao/.julia/artifacts/b9cd98702104db1408772a74c514fc9466ab86e4/lib:/Users/wenrongcao/.julia/artifacts/8e71fdc5590196691aedbfee700bc6f49297f160/lib/petsc/double_real_Int64/lib:/Users/wenrongcao/.julia/artifacts/8e71fdc5590196691aedbfee700bc6f49297f160/lib/petsc/single_complex_Int32/lib:/Users/wenrongcao/.julia/artifacts/8e71fdc5590196691aedbfee700bc6f49297f160/lib/petsc/single_complex_Int64/l
ib:/Users/wenrongcao/.julia/artifacts/8e71fdc5590196691aedbfee700bc6f49297f160/lib/petsc/single_real_Int32/lib:/Users/wenrongcao/.julia/artifacts/8e71fdc5590196691aedbfee700bc6f49297f160/lib/petsc/single_real_Int64/lib:/Users/wenrongcao/.julia/artifacts/8e71fdc5590196691aedbfee700bc6f49297f160/lib/petsc/double_complex_Int32/lib:/Users/wenrongcao/.julia/artifacts/8e71fdc5590196691aedbfee700bc6f49297f160/lib/petsc/double_complex_Int64/lib:/Users/wenrongcao/.julia/artifacts/8e71fdc5590196691aedbfee700bc6f49297f160/lib/petsc/double_real_Int32/lib:/Users/wenrongcao/.julia/artifacts/8e71fdc5590196691aedbfee700bc6f49297f160/lib/petsc/double_real_Int64_deb/lib:/Users/wenrongcao/.julia/artifacts/76a58b702968ecc4d175e67ffb81234506ee61a9/lib:/Applications/Julia-1.10.app/Contents/Resources/julia/bin/../lib/julia:/Applications/Julia-1.10.app/Contents/Resources/julia/bin/../lib"]), ProcessExited(63)) [63]

Finally, I tested subduction_example.ipynb. It runs fine on 1 core, but on 2 cores it gives an error (which seems to be MPI related):

failed process: Process(setenv(`/Users/wenrongcao/.julia/artifacts/93cc0370456d787d4312cc7b29098924b8ffbef9/bin/mpiexec -n 2 /Users/wenrongcao/.julia/artifacts/76a58b702968ecc4d175e67ffb81234506ee61a9/bin/LaMEM -ParamFile output.dat ''`,["VECLIB_MAXIMUM_THREADS=1", "OMP_NUM_THREADS=1", "DYLD_FALLBACK_LIBRARY_PATH=/Applications/Julia-1.10.app/Contents/Resources/julia/lib/julia:/Users/wenrongcao/.julia/artifacts/023157501199a753608d8f4adf38a1147a2ad00e/lib:/Users/wenrongcao/.julia/artifacts/085281efe66fd732117b891040ff6132937ea29b/lib:/Users/wenrongcao/.julia/artifacts/085281efe66fd732117b891040ff6132937ea29b/lib/metis/metis_Int32_Real64/lib:/Users/wenrongcao/.julia/artifacts/085281efe66fd732117b891040ff6132937ea29b/lib/metis/metis_Int64_Real32/lib:/Users/wenrongcao/.julia/artifacts/085281efe66fd732117b891040ff6132937ea29b/lib/metis/metis_Int64_Real64/lib:/Users/wenrongcao/.julia/artifacts/93cc0370456d787d4312cc7b29098924b8ffbef9/lib:/Users/wenrongcao/.julia/artifacts/0a1cd9a580e8512726310b688d28da27ddbbfb14/lib:/Users/wenrongcao/.julia/artifacts/ffdee4f2c5c1a970450976825cd6df5b97916b5d/lib:/Users/wenrongcao/.julia/artifacts/5ee266c77972e985adc3ed40e62e00a3e058ab5a/lib:/Users/wenrongcao/.julia/artifacts/0233bb40b298b03aa3743cc339b4a5c6816ce583/lib:/Users/wenrongcao/.julia/artifacts/420fc8fcf6f318e7c8ea117a4e462931d7192a97/lib:/Users/wenrongcao/.julia/artifacts/6c1504d3361ef1e0869478537aea89031a2565fb/lib:/Users/wenrongcao/.julia/artifacts/c51b54c5cf307066eb61f3bca3a2c9158488955a/lib:/Users/wenrongcao/.julia/artifacts/e078c73a851da78baff82236c028f23f7a364cc4/lib:/Users/wenrongcao/.julia/artifacts/b9cd98702104db1408772a74c514fc9466ab86e4/lib:/Users/wenrongcao/.julia/artifacts/8e71fdc5590196691aedbfee700bc6f49297f160/lib/petsc/double_real_Int64/lib:/Users/wenrongcao/.julia/artifacts/8e71fdc5590196691aedbfee700bc6f49297f160/lib/petsc/single_complex_Int32/lib:/Users/wenrongcao/.julia/artifacts/8e71fdc5590196691aedbfee700bc6f49297f160/lib/petsc/single_complex_Int64/lib:/Use
rs/wenrongcao/.julia/artifacts/8e71fdc5590196691aedbfee700bc6f49297f160/lib/petsc/single_real_Int32/lib:/Users/wenrongcao/.julia/artifacts/8e71fdc5590196691aedbfee700bc6f49297f160/lib/petsc/single_real_Int64/lib:/Users/wenrongcao/.julia/artifacts/8e71fdc5590196691aedbfee700bc6f49297f160/lib/petsc/double_complex_Int32/lib:/Users/wenrongcao/.julia/artifacts/8e71fdc5590196691aedbfee700bc6f49297f160/lib/petsc/double_complex_Int64/lib:/Users/wenrongcao/.julia/artifacts/8e71fdc5590196691aedbfee700bc6f49297f160/lib/petsc/double_real_Int32/lib:/Users/wenrongcao/.julia/artifacts/8e71fdc5590196691aedbfee700bc6f49297f160/lib/petsc/double_real_Int64_deb/lib:/Users/wenrongcao/.julia/artifacts/76a58b702968ecc4d175e67ffb81234506ee61a9/lib:/Applications/Julia-1.10.app/Contents/Resources/julia/bin/../lib/julia:/Applications/Julia-1.10.app/Contents/Resources/julia/bin/../lib"]), ProcessExited(157)) [157]
boriskaus commented 2 months ago

Thanks for reporting. The cause appears to be that MUMPS_jll was upgraded from version 5.6.2 to 5.7.0. PETSc_jll was built against 5.6.2, so the upgrade breaks all tests that use MUMPS (partly my mistake: I did not FIX the package versions while building PETSc, but foolishly only specified lower bounds for the versions).
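To illustrate the difference between a lower bound and a fixed version, here is a minimal sketch. The two helper functions are hypothetical and only mimic the two kinds of constraint; they are not part of Pkg or of the PETSc build recipe. The version numbers are the ones mentioned in this thread.

```julia
# Hypothetical illustration: why a lower-bound-only requirement admits MUMPS 5.7.0.
built_against = v"5.6.2"   # MUMPS version PETSc_jll was built against (from this thread)
upgraded      = v"5.7.0"   # MUMPS version that was installed later (from this thread)

# Lower bound only: any version at or above the build version is accepted.
satisfies_lower_bound(v) = v >= built_against

# Fixed version: only the version used at build time is accepted.
satisfies_exact_pin(v) = v == built_against

println(satisfies_lower_bound(upgraded))  # true  -> 5.7.0 gets installed
println(satisfies_exact_pin(upgraded))    # false -> 5.7.0 is rejected
```

In a `Project.toml`, a fixed version would typically be written with an equality specifier under `[compat]`, for example `MUMPS_jll = "=5.6.2"`. Whether the fix was applied there or in the build recipe is not stated in this thread.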

wenrongcao commented 2 months ago

Thanks! I updated LaMEM.jl to 0.3.5 and the code is working. The only remaining issue is that TM_Subduction_example fails with a file permission error when it tries to open a marker file. The code itself seems to run OK when it is not run as part of the tests.

Writing LaMEM marker file -> ./markers/mdb.00000000.dat
TM_Subduction_example: Error During Test at /Users/wenrongcao/.julia/packages/LaMEM/TmPTE/test/test_examples.jl:9
  Got exception outside of a @test
  LoadError: SystemError: opening file "./markers/mdb.00000000.dat": Permission denied

At the end of all tests:

Test Summary:           | Pass  Error  Total     Time
examples in /scripts    |    4      1      5  5m13.9s
  TM_Subduction_example |           1      1  1m08.8s
  Subduction3D          |    2             2  2m46.3s
  StrengthEnvelop       |    2             2  1m18.8s
ERROR: LoadError: Some tests did not pass: 4 passed, 0 failed, 1 errored, 0 broken.
in expression starting at /Users/wenrongcao/.julia/packages/LaMEM/TmPTE/test/test_examples.jl:5
in expression starting at /Users/wenrongcao/.julia/packages/LaMEM/TmPTE/test/runtests.jl:12
ERROR: Package LaMEM errored during testing
boriskaus commented 2 months ago

Tests on Windows also still fail because the code runs too long, so it's not fully fixed yet. I did encounter the file permission issue before (but how was that solved again...). But yes, the only thing I did was fix MUMPS_jll to 5.6.2.
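One possible workaround for the file permission error, sketched here under the assumption that it arises because the directory the test writes to is not writable: run the code that creates `./markers` inside a fresh temporary directory. The helper name `in_writable_tmpdir` is hypothetical, and this is not necessarily how the issue was resolved in LaMEM.jl.

```julia
# Hypothetical helper: run `f` inside a fresh, writable temporary directory.
# The directory is removed afterwards and the previous working directory is restored.
function in_writable_tmpdir(f)
    mktempdir() do dir
        cd(f, dir)
    end
end

# Example: write a placeholder marker file with a relative path,
# mirroring the ./markers/mdb.00000000.dat path from the error message.
in_writable_tmpdir() do
    mkpath("markers")
    path = joinpath("markers", "mdb.00000000.dat")
    write(path, "placeholder")
    println(isfile(path))
end
```

The design choice is to keep relative output paths away from the installed package directory, so that the run does not depend on the permissions of that directory.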

boriskaus commented 2 months ago

I solved some of the issues with testing on windows. Can you try again to see if it now works for you?

wenrongcao commented 2 months ago

I updated to LaMEM.jl 0.3.6, but testing LaMEM still gives a file permission error on my MacBook. I don't have a Windows PC at hand right now, so I cannot test on Windows.

Writing LaMEM marker file -> ./markers/mdb.00000000.dat
TM_Subduction_example: Error During Test at /Users/wenrongcao/.julia/packages/LaMEM/kjlav/test/test_examples.jl:10
  Got exception outside of a @test
  LoadError: SystemError: opening file "./markers/mdb.00000000.dat": Permission denied
  Stacktrace:
Test Summary:           | Pass  Error  Total     Time
examples in /scripts    |    4      1      5  1m17.4s
  TM_Subduction_example |           1      1    23.4s
  Subduction3D          |    2             2    14.8s
  StrengthEnvelop       |    2             2    39.2s
ERROR: LoadError: Some tests did not pass: 4 passed, 0 failed, 1 errored, 0 broken.
in expression starting at /Users/wenrongcao/.julia/packages/LaMEM/kjlav/test/test_examples.jl:5
in expression starting at /Users/wenrongcao/.julia/packages/LaMEM/kjlav/test/runtests.jl:12
ERROR: Package LaMEM errored during testing
boriskaus commented 2 months ago

I think I have now created a fix for the failing tests. Can you please test the main branch of LaMEM on your machine?

pkg> rm LaMEM
pkg> add LaMEM#main
pkg> test LaMEM

If it works for you, I'll create a new release.
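For use in a script rather than the package REPL, the same three steps can be sketched with the functional `Pkg` API. The wrapper name `test_lamem_main` is hypothetical; calling it needs network access and modifies the active environment.

```julia
using Pkg

# Hypothetical wrapper around the pkg> commands above.
function test_lamem_main()
    Pkg.rm("LaMEM")                    # pkg> rm LaMEM (errors if LaMEM is not installed)
    Pkg.add(name="LaMEM", rev="main")  # pkg> add LaMEM#main
    Pkg.test("LaMEM")                  # pkg> test LaMEM
end

# Call test_lamem_main() to run the three steps in order.
```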

wenrongcao commented 2 months ago

Using LaMEM#main, all tests pass without issues:

--------------------------------------------------------------------------
Test Summary:        | Pass  Total     Time
examples in /scripts |    6      6  1m20.4s
     Testing LaMEM tests passed 
boriskaus commented 2 months ago

Thanks for reporting!

boriskaus commented 2 months ago

OK, this should now be fixed in version 0.3.7, so I’ll close this issue.