illinois-ceesd / mirgecom

MIRGE-Com is the workhorse simulation application for the Center for Exascale-Enabled Scramjet Design at the University of Illinois.

"Killed: 9" error with clang-12/ld64-609 on Mac-arm64 #451

Open matthiasdiener opened 3 years ago

matthiasdiener commented 3 years ago

On a fresh emirge installation on Mac M1, running mirgecom might result in the following type of error:

$ python wave.py --lazy
wave.py:104: DeprecationWarning: EagerDGDiscretization is deprecated and will go away in 2022. Use the base DiscretizationCollection with grudge.op instead.
  discr = EagerDGDiscretization(actx, mesh, order=order)
Killed: 9

A workaround is to install an older ld version with conda:

$ conda install 'ld64<609'
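
To check whether an environment still has a linker version that the workaround excludes, a small helper along these lines might be useful. This is only a sketch: the function names are hypothetical, and it assumes that any version failing the `ld64<609` constraint is potentially affected, which the thread does not confirm for releases after 609. It also assumes `conda list --json` returns a list of package records with `name` and `version` fields.

```python
import json
import shutil
import subprocess


def ld64_is_affected(version, first_bad_major=609):
    """Return True if the version fails the 'ld64<609' constraint.

    Assumption: anything at or above 609 is treated as potentially affected.
    """
    major = int(version.split(".")[0])
    return major >= first_bad_major


def installed_ld64_version():
    """Ask conda for the ld64 version in the active environment.

    Returns None if conda is not on PATH or ld64 is not installed.
    """
    if shutil.which("conda") is None:
        return None
    out = subprocess.run(
        ["conda", "list", "--json", "^ld64$"],
        capture_output=True, text=True, check=False,
    ).stdout
    try:
        packages = json.loads(out)
    except ValueError:
        return None
    if not isinstance(packages, list):
        return None
    for pkg in packages:
        if isinstance(pkg, dict) and pkg.get("name") == "ld64":
            return pkg.get("version")
    return None


if __name__ == "__main__":
    version = installed_ld64_version()
    if version is None:
        print("ld64 not found in this environment")
    elif ld64_is_affected(version):
        print(f"ld64 {version} is excluded by the workaround; consider 'ld64<609'")
    else:
        print(f"ld64 {version} satisfies 'ld64<609'")
```

The version comparison is kept separate from the conda query so it can be checked without a conda installation.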

inducer commented 3 years ago

Nice job finding the issue and a workaround! Do you happen to have a link to the underlying LLVM (I assume?) issue?

matthiasdiener commented 3 years ago

> Nice job finding the issue and a workaround! Do you happen to have a link to the underlying LLVM (I assume?) issue?

I really don't know what is going on. It seems to crash depending on memory usage (i.e., higher memory usage means a higher likelihood of a crash). It even kills the debugger; I haven't seen anything like it:

$ lldb -- python examples/wave.py
(lldb) target create "python"
Current executable set to 'python' (arm64).
(lldb) settings set -- target.run-args  "examples/wave.py"
(lldb) r
Process 31085 launched: '/Users/mdiener/Work/emirge_lazy/miniforge3/envs/ceesd_lazy/bin/python' (arm64)
examples/wave.py:90: DeprecationWarning: Configuring the PyOpenCLArrayContext to return host scalars from reductions is deprecated. To configure the PyOpenCLArrayContext to return device scalars, pass 'force_device_scalars=True' to the constructor. Support for returning host scalars will be removed in 2022.
  actx = PyOpenCLArrayContext(queue,
examples/wave.py:104: DeprecationWarning: EagerDGDiscretization is deprecated and will go away in 2022. Use the base DiscretizationCollection with grudge.op instead.
  discr = EagerDGDiscretization(actx, mesh, order=order)
Killed: 9
$ 

Edit: https://github.com/tpoechtrager/cctools-port/issues/104 shows a similar error.
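
For context, "Killed: 9" is how the shell reports that a process was terminated by signal 9 (SIGKILL), which the process itself cannot catch or handle. A wrapper script can at least detect this case, since on POSIX systems Python's `subprocess` reports death by signal as a negative return code. The following is a minimal sketch with hypothetical function names; it says nothing about why the kernel sent the signal.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable message."""
    if returncode < 0:
        # On POSIX, a negative return code is the negated signal number.
        sig = signal.Signals(-returncode)
        return f"killed by {sig.name} (signal {sig.value})"
    return f"exited with status {returncode}"


def run_and_report(argv):
    """Run a command and describe how it ended."""
    result = subprocess.run(argv)
    return describe_exit(result.returncode)


if __name__ == "__main__":
    # Hypothetical usage: python report_exit.py python examples/wave.py
    if len(sys.argv) > 1:
        print(run_and_report(sys.argv[1:]))
```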

matthiasdiener commented 2 years ago

The original issue has been fixed by removing ld64-609: https://github.com/conda-forge/admin-requests/pull/324. See also https://github.com/conda-forge/cctools-and-ld64-feedstock/issues/44.

Currently, there is a similar issue with pocl-1.8 on Mac M1, which can be worked around by installing pocl-1.7:

$ conda install pocl==1.7