illinois-ceesd / emirge

Environment for MirgeCom
MIT License

Seg Fault at Startup #103

Open dshtey2 opened 3 years ago

dshtey2 commented 3 years ago

I installed a fresh copy of emirge in my WSL environment, and when I tried to run the example files I kept getting segmentation faults like the one below. I am running Linux under WSL on my Acer laptop.

Choose platform:
[0] <pyopencl.Platform 'Portable Computing Language' at 0x7fcc00dd2008>
Choice [0]:
[ACER:08543] *** Process received signal ***
[ACER:08543] Signal: Segmentation fault (11)
[ACER:08543] Signal code: Address not mapped (1)
[ACER:08543] Failing at address: (nil)
[ACER:08543] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x128a0)[0x7fcc046ee8a0]
[ACER:08543] [ 1] /lib/x86_64-linux-gnu/libc.so.6(__isoc99_fscanf+0x6d)[0x7fcc039ba2bd]
[ACER:08543] [ 2] /home/dshtey2/CEESD/heatMirge/emirge/miniforge3/envs/dgfem1/lib/libpocl.so.2.6.0(+0x6b117)[0x7fcc00d5d117]
[ACER:08543] [ 3] /home/dshtey2/CEESD/heatMirge/emirge/miniforge3/envs/dgfem1/lib/pocl/libpocl-devices-pthread.so(pocl_pthread_init+0xc3)[0x7fcc001812a3]
[ACER:08543] [ 4] /home/dshtey2/CEESD/heatMirge/emirge/miniforge3/envs/dgfem1/lib/libpocl.so.2.6.0(pocl_init_devices+0x699)[0x7fcc00d58479]
[ACER:08543] [ 5] /home/dshtey2/CEESD/heatMirge/emirge/miniforge3/envs/dgfem1/lib/libpocl.so.2.6.0(POclGetDeviceIDs+0x79)[0x7fcc00d352c9]
[ACER:08543] [ 6] /home/dshtey2/CEESD/heatMirge/emirge/miniforge3/envs/dgfem1/lib/python3.8/site-packages/pyopencl/../../../libOpenCL.so.1(clGetDeviceIDs+0x54)[0x7fcba2db9764]
[ACER:08543] [ 7] /home/dshtey2/CEESD/heatMirge/emirge/miniforge3/envs/dgfem1/lib/python3.8/site-packages/pyopencl/_cl.cpython-38-x86_64-linux-gnu.so(+0x5bfd9)[0x7fcba2b0afd9]
[ACER:08543] [ 8] /home/dshtey2/CEESD/heatMirge/emirge/miniforge3/envs/dgfem1/lib/python3.8/site-packages/pyopencl/_cl.cpython-38-x86_64-linux-gnu.so(+0x6e1be)[0x7fcba2b1d1be]
[ACER:08543] [ 9] /home/dshtey2/CEESD/heatMirge/emirge/miniforge3/envs/dgfem1/lib/python3.8/site-packages/pyopencl/_cl.cpython-38-x86_64-linux-gnu.so(+0x2ba91)[0x7fcba2adaa91]
[ACER:08543] [10] python3(PyCFunction_Call+0x54)[0x557f9ecb3a34]
[ACER:08543] [11] python3(_PyObject_MakeTpCall+0x31e)[0x557f9ecacb8e]
[ACER:08543] [12] python3(+0x1b028e)[0x557f9ed3928e]
[ACER:08543] [13] python3(_PyEval_EvalFrameDefault+0x4c93)[0x557f9ed5b053]
[ACER:08543] [14] python3(_PyEval_EvalCodeWithName+0x9be)[0x557f9ed377be]
[ACER:08543] [15] python3(_PyFunction_Vectorcall+0x378)[0x557f9ed384c8]
[ACER:08543] [16] python3(_PyEval_EvalFrameDefault+0x4c93)[0x557f9ed5b053]
[ACER:08543] [17] python3(_PyEval_EvalCodeWithName+0x9be)[0x557f9ed377be]
[ACER:08543] [18] python3(_PyFunction_Vectorcall+0x378)[0x557f9ed384c8]
[ACER:08543] [19] python3(PyObject_Call+0x5e)[0x557f9ecadaae]
[ACER:08543] [20] python3(_PyEval_EvalFrameDefault+0x21b1)[0x557f9ed58571]
[ACER:08543] [21] python3(_PyEval_EvalCodeWithName+0x2c3)[0x557f9ed370c3]
[ACER:08543] [22] python3(_PyFunction_Vectorcall+0x378)[0x557f9ed384c8]
[ACER:08543] [23] python3(_PyEval_EvalFrameDefault+0x92f)[0x557f9ed56cef]
[ACER:08543] [24] python3(_PyEval_EvalCodeWithName+0x2c3)[0x557f9ed370c3]
[ACER:08543] [25] python3(PyEval_EvalCodeEx+0x39)[0x557f9ed38149]
[ACER:08543] [26] python3(PyEval_EvalCode+0x1b)[0x557f9eddea3b]
[ACER:08543] [27] python3(+0x2722de)[0x557f9edfb2de]
[ACER:08543] [28] python3(+0x1277db)[0x557f9ecb07db]
[ACER:08543] [29] python3(_PyEval_EvalFrameDefault+0x92f)[0x557f9ed56cef]
[ACER:08543] *** End of error message ***
Segmentation fault
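
The trace above is easier to scan once each frame is reduced to (index, module, symbol). Below is a minimal Python sketch, assuming the frame format shown in this log; `FRAME_RE` and `parse_frame` are hypothetical names made up for illustration and are not part of emirge, pyopencl, or pocl.

```python
import re

# Matches frame lines such as:
#   [ACER:08543] [ 1] /lib/.../libc.so.6(__isoc99_fscanf+0x6d)[0x7fcc039ba2bd]
FRAME_RE = re.compile(r"\[\s*(\d+)\]\s+(\S+?)\(([^)]*)\)\[0x[0-9a-f]+\]")


def parse_frame(line):
    """Return (index, module_path, symbol_or_None), or None for non-frame lines."""
    match = FRAME_RE.search(line)
    if match is None:
        return None
    index, module, location = match.groups()
    # A location like "+0x6b117" has an offset but no symbol name.
    symbol = location.split("+")[0] or None
    return int(index), module, symbol


if __name__ == "__main__":
    sample = [
        "[ACER:08543] [ 1] /lib/x86_64-linux-gnu/libc.so.6(__isoc99_fscanf+0x6d)[0x7fcc039ba2bd]",
        "[ACER:08543] [10] python3(PyCFunction_Call+0x54)[0x557f9ecb3a34]",
    ]
    for line in sample:
        print(parse_frame(line))
```

Read this way, frames 1 to 4 show the crash inside `__isoc99_fscanf`, reached from `pocl_pthread_init` and `pocl_init_devices`, that is, while pocl is initializing its devices rather than in the example script itself.
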

dshtey2 commented 3 years ago

As an additional note, I have installed an earlier iteration of emirge and had no issues running the examples or making my own; this issue only started after downloading the most recent version.

inducer commented 3 years ago

WSL1 or 2? (WSL1 is not supported.)

dshtey2 commented 3 years ago

According to PowerShell, I am running WSL 2.
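
The WSL version can also be cross-checked from inside the Linux shell. The sketch below is a heuristic only: it assumes WSL 2 kernels report a release string containing "microsoft-standard" (or "WSL2") and WSL 1 kernels report one ending in "Microsoft". `guess_wsl_version` is a hypothetical helper name, and `wsl -l -v` in PowerShell remains the authoritative check.

```python
import platform


def guess_wsl_version(kernel_release):
    """Guess the WSL version from a kernel release string.

    Returns 2, 1, or None (not WSL). Heuristic, based on assumed
    kernel naming conventions; not an official detection method.
    """
    release = kernel_release.lower()
    if "microsoft" not in release:
        return None
    if "microsoft-standard" in release or "wsl2" in release:
        return 2
    return 1


if __name__ == "__main__":
    # platform.uname().release is the same string that `uname -r` prints.
    print(guess_wsl_version(platform.uname().release))
```
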

MTCam commented 3 years ago

So, I find it a little suspicious that you used a new version of emirge, yet these errors appear to indicate that your code is still using an environment called dgfem - which I think has gone away some time back.

Every install of MIRGE-Com using emirge will install a whole new environment (called "ceesd" by default, I think). One needs to make sure to activate it by running source emirge/config/activate.sh before running any examples.

Could you send along the output of conda env list? Maybe it will provide some clues.
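
Alongside conda env list, it can help to confirm which environment the running interpreter actually belongs to. A minimal sketch, assuming conda's activation has set the `CONDA_DEFAULT_ENV` and `CONDA_PREFIX` variables; `describe_active_env` is a hypothetical helper name, not part of emirge.

```python
import os
import sys


def describe_active_env(environ=None, prefix=None):
    """Summarize the active conda env and the interpreter's install prefix."""
    environ = os.environ if environ is None else environ
    prefix = sys.prefix if prefix is None else prefix
    conda_prefix = environ.get("CONDA_PREFIX")
    return {
        "conda_default_env": environ.get("CONDA_DEFAULT_ENV"),
        "conda_prefix": conda_prefix,
        "python_prefix": prefix,
        # False means python is not running from the activated env.
        "prefix_matches": conda_prefix == prefix,
    }


if __name__ == "__main__":
    for key, value in describe_active_env().items():
        print(f"{key}: {value}")
```

If `prefix_matches` is false, the shell has one environment activated while `python` resolves to a different installation.
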

majosm commented 3 years ago

> So, I find it a little suspicious that you used a new version of emirge, yet these errors appear to indicate that your code is still using an environment called dgfem - which I think has gone away some time back.

dgfem1 🙂

It was a fresh clone of emirge, installed with ./install.sh --conda-prefix=<existing conda install> --env-name=dgfem1. The env was activated properly.

dshtey2 commented 3 years ago

Here is my output:

# conda environments:
#
                         ~/Docs/emirge/miniforge3
                         ~/Docs/emirge/miniforge3/envs/dgfem
base                     ~/Docs/heatMirge/emirge/miniforge3
ceesd                    ~/Docs/heatMirge/emirge/miniforge3/envs/ceesd
dgfem                    ~/Docs/heatMirge/emirge/miniforge3/envs/dgfem
dgfem1                *  ~/Docs/heatMirge/emirge/miniforge3/envs/dgfem1
                         ~/miniforge3

matthiasdiener commented 3 years ago

I would try installing a fresh env, i.e., without --conda-prefix=<existing conda install>

MTCam commented 3 years ago

I suspect it might start working if he activates the new ceesd environment.

majosm commented 3 years ago

> I suspect it might start working if he activates the new ceesd environment.

dgfem1 is the correct env. See above.

majosm commented 3 years ago

> I would try installing a fresh env, i.e., without --conda-prefix=<existing conda install>

@dshtey2 This might be worth a try. Just make sure to set your PATH so it picks up the new conda instead of the old one.
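
One way to check which conda the PATH picks up is sketched below in Python. `first_conda_on_path` is a hypothetical helper name; it only reports the first `conda` executable found on a search path and says nothing about shell functions or aliases that may shadow it.

```python
import shutil


def first_conda_on_path(path=None):
    """Return the first `conda` executable on `path` (default: $PATH), or None."""
    return shutil.which("conda", path=path)


if __name__ == "__main__":
    found = first_conda_on_path()
    print(found if found else "no conda executable found on PATH")
```

If the printed location is inside the old install rather than the new emirge/miniforge3 directory, the PATH still points at the old conda.
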

matthiasdiener commented 3 years ago

I suspect this PR might be part of the problem: https://github.com/illinois-ceesd/mirgecom/pull/200

dshtey2 commented 3 years ago

> I would try installing a fresh env, i.e., without --conda-prefix=<existing conda install>
>
> @dshtey2 This might be worth a try. Just make sure to set your PATH so it picks up the new conda instead of the old one.

I just ran another fresh install, this time without --conda-prefix and with PATH set to the new conda directory, but still to no avail.