Closed cavenel closed 1 year ago
Not sure what is causing this, but @smistad can likely aid you further.
Assuming that OpenCL is properly set up, it could be that you have the same issue as I did on my MacBook. It is an older MacBook (Intel, not Apple silicon) with two GPUs: one AMD GPU and one integrated Intel GPU.
For whatever reason, when using the AMD GPU I got an error similar to the one you got above. After switching to the Intel GPU, the problem was resolved.
By default, the best GPU will be selected by the OS. In order to force the OS to use the Intel GPU, which should be compatible with FP, I used a program called gfxCardStatus. After installing it, you can choose to toggle between GPUs from the top bar. "Dynamic switching" should be enabled by default, but if you set "Integrated only" it should use the integrated GPU which should be compatible with FP. See example below. Note that you cannot have a monitor connected to the macbook when using the "Integrated only" option.
BTW: If I remember correctly, I was able to render the WSI using the AMD GPU, but running any analysis, even tissue segmentation, resulted in the same error as you had. Are you able to render the WSI?
I was now able to reproduce this on a MacBook Pro (Quad-Core Intel Core i7 / Intel Iris Pro 1536 MB). I first had the exact same error message as above, with macOS 12.6.5. The error occurs when trying to run a pipeline.
I then updated the macOS version to 12.6.8 and now I get a different error, when trying to load any image:
ERROR [0x10a68a600] OpenCL exception caught in Qt event handler clCreateProgramWithBinary(Invalid value)
I tried the gfxCardStatus idea, but I only have one GPU apparently.
Do you know how to make sure OpenCL is correctly installed? I haven't done any specific installation for it.
Thanks!
I did not have the same issue on my macbook, also running Monterey (v12 macOS).
As this seems OpenCL-related, @smistad can likely assist you further.
Also, I assume you get the same for all pipelines, including the simple tissue segmenter?
Yes I can confirm that (when I could add images) I had the issue with all pipelines.
Not sure I can be of much help with OpenCL issues, but what do you get when you run clinfo in the terminal? You might need to install it first with brew install clinfo
Here is the output of clinfo:
Number of platforms 1
Platform Name Apple
Platform Vendor Apple
Platform Version OpenCL 1.2 (Aug 8 2022 21:29:33)
Platform Profile FULL_PROFILE
Platform Extensions cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event
Platform Name Apple
Number of devices 2
Device Name Intel(R) Core(TM) i7-4770HQ CPU @ 2.20GHz
Device Vendor Intel
Device Vendor ID 0xffffffff
Device Version OpenCL 1.2
Driver Version 1.1
Device OpenCL C Version OpenCL C 1.2
Device Type CPU
Device Profile FULL_PROFILE
Device Available Yes
Compiler Available Yes
Linker Available Yes
Max compute units 8
Max clock frequency 2200MHz
Device Partition (core)
Max number of sub-devices 0
Supported partition types None
Supported affinity domains (n/a)
Max work item dimensions 3
Max work item sizes 1024x1x1
Max work group size 1024
Preferred work group size multiple (kernel) 1
Preferred / native vector sizes
char 16 / 16
short 8 / 8
int 4 / 4
long 2 / 2
half 0 / 0 (n/a)
float 4 / 4
double 2 / 2 (cl_khr_fp64)
Half-precision Floating-point support (n/a)
Single-precision Floating-point support (core)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Correctly-rounded divide and sqrt operations Yes
Double-precision Floating-point support (cl_khr_fp64)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Address bits 64, Little-Endian
Global memory size 17179869184 (16GiB)
Error Correction support No
Max memory allocation 4294967296 (4GiB)
Unified memory for Host and Device Yes
Minimum alignment for any data type 128 bytes
Alignment of base address 1024 bits (128 bytes)
Global Memory cache type Read/Write
Global Memory cache size 64
Global Memory cache line size 6291456 bytes
Image support Yes
Max number of samplers per kernel 16
Max size for 1D images from buffer 65536 pixels
Max 1D or 2D image array size 2048 images
Base address alignment for 2D image buffers 1 bytes
Pitch alignment for 2D image buffers 1 pixels
Max 2D image size 8192x8192 pixels
Max 3D image size 2048x2048x2048 pixels
Max number of read image args 128
Max number of write image args 8
Local memory type Global
Local memory size 32768 (32KiB)
Max number of constant args 8
Max constant buffer size 65536 (64KiB)
Max size of kernel argument 4096 (4KiB)
Queue properties
Out-of-order execution No
Profiling Yes
Prefer user sync for interop Yes
Profiling timer resolution 1ns
Execution capabilities
Run OpenCL kernels Yes
Run native kernels Yes
printf() buffer size 1048576 (1024KiB)
Built-in kernels (n/a)
Device Extensions cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_image2d_from_buffer cl_APPLE_fp64_basic_ops cl_APPLE_fixed_alpha_channel_orders cl_APPLE_biased_fixed_point_image_formats cl_APPLE_command_queue_priority
Device Name Iris Pro
Device Vendor Intel
Device Vendor ID 0x1024500
Device Version OpenCL 1.2
Driver Version 1.2(Jul 6 2023 23:52:47)
Device OpenCL C Version OpenCL C 1.2
Device Type GPU
Device Profile FULL_PROFILE
Device Available Yes
Compiler Available Yes
Linker Available Yes
Max compute units 40
Max clock frequency 1200MHz
Device Partition (core)
Max number of sub-devices 0
Supported partition types None
Supported affinity domains (n/a)
Max work item dimensions 3
Max work item sizes 512x512x512
Max work group size 512
Preferred work group size multiple (kernel) 32
Preferred / native vector sizes
char 1 / 1
short 1 / 1
int 1 / 1
long 1 / 1
half 0 / 0 (n/a)
float 1 / 1
double 0 / 0 (n/a)
Half-precision Floating-point support (n/a)
Single-precision Floating-point support (core)
Denormals No
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Correctly-rounded divide and sqrt operations Yes
Double-precision Floating-point support (n/a)
Address bits 64, Little-Endian
Global memory size 1610612736 (1.5GiB)
Error Correction support No
Max memory allocation 402653184 (384MiB)
Unified memory for Host and Device Yes
Minimum alignment for any data type 128 bytes
Alignment of base address 1024 bits (128 bytes)
Global Memory cache type None
Image support Yes
Max number of samplers per kernel 16
Max size for 1D images from buffer 25165824 pixels
Max 1D or 2D image array size 2048 images
Base address alignment for 2D image buffers 4 bytes
Pitch alignment for 2D image buffers 32 pixels
Max 2D image size 16384x16384 pixels
Max 3D image size 2048x2048x2048 pixels
Max number of read image args 128
Max number of write image args 8
Local memory type Local
Local memory size 65536 (64KiB)
Max number of constant args 8
Max constant buffer size 65536 (64KiB)
Max size of kernel argument 1024
Queue properties
Out-of-order execution No
Profiling Yes
Prefer user sync for interop Yes
Profiling timer resolution 80ns
Execution capabilities
Run OpenCL kernels Yes
Run native kernels No
printf() buffer size 1048576 (1024KiB)
Built-in kernels (n/a)
Device Extensions cl_APPLE_SetMemObjectDestructor cl_APPLE_ContextLoggingFunctions cl_APPLE_clut cl_APPLE_query_kernel_names cl_APPLE_gl_sharing cl_khr_gl_event cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_image2d_from_buffer cl_khr_gl_depth_images cl_khr_depth_images cl_khr_3d_image_writes
NULL platform behavior
clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...) Apple
clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...) Success [P0]
clCreateContext(NULL, ...) [default] Success [P0]
clCreateContextFromType(NULL, CL_DEVICE_TYPE_DEFAULT) Success (1)
Platform Name Apple
Device Name Iris Pro
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU) Success (1)
Platform Name Apple
Device Name Intel(R) Core(TM) i7-4770HQ CPU @ 2.20GHz
clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU) Success (1)
Platform Name Apple
Device Name Iris Pro
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR) No devices found in platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM) Invalid device type for platform
clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL) Success (2)
Platform Name Apple
Device Name Iris Pro
Device Name Intel(R) Core(TM) i7-4770HQ CPU @ 2.20GHz
I removed all FAST and fastpathology files from my home folder and restarted a new project. Now I am back to the original error.
It seems as if the error always comes after dilation:
INFO [0x7000050af000] Added renderer SegmentationRenderer with id segRenderer
OK
Done
INFO [0x7000053c1000] EXECUTING EmptyProcessObject because PO is modified.
INFO [0x7000053c1000] EXECUTING EmptyProcessObject because PO is modified.
INFO [0x7000053c1000] EXECUTING TissueSegmentation because PO is modified.
INFO [0x7000053c1000] EXECUTING EmptyProcessObject because PO is modified.
INFO [0x7000053c1000] EXECUTING ImageChannelConverter because PO is modified.
INFO [0x7000053c1000] EXECUTING EmptyProcessObject because PO is modified.
INFO [0x7000053c1000] EXECUTING Dilation because PO is modified.
ERROR [0x7000053c1000] Build log, device 0
<program source>:13:3: error: expected expression
} else {
^
ERROR [0x7000053c1000] Program build failure
stopping pipeline
done
removing renderers..
done
stopping pipeline
done
removing renderers..
done
But I don't see any } else { in the dilation.cl file of FAST.
Just ran clinfo on my own macbook, where everything works, and I could not see any relevant differences, other than the fact that you have an Intel Iris integrated GPU, whereas I have an Intel HD Graphics 530.
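For anyone wanting to compare two machines quickly, here is a minimal Python sketch that pulls the device name and type pairs out of saved clinfo text. The function name list_devices is made up for illustration, and it assumes the plain human-readable clinfo layout shown above (a "Device Name" line followed later by a "Device Type" line); it is not part of FAST or clinfo.

```python
def list_devices(clinfo_text):
    """Return (device name, device type) pairs from plain `clinfo` output.

    Assumes each device block has a "Device Name" line followed by a
    "Device Type" line, as in the output posted in this thread.
    """
    devices = []
    name = None
    for raw in clinfo_text.split("\n"):
        line = raw.strip()
        if line.startswith("NULL platform behavior"):
            break  # this trailing summary repeats device names without types
        if line.startswith("Device Name"):
            name = line[len("Device Name"):].strip()
        elif line.startswith("Device Type") and name is not None:
            devices.append((name, line[len("Device Type"):].strip()))
            name = None
    return devices
```

Running it on the output from both machines and comparing the two lists shows at a glance which OpenCL devices each one exposes.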
I also tried to reinstall FP using the very latest release, but I could not reproduce this issue.
Do you observe the same if you use the CMU-1.svs from the OpenSlide test data suite?
Also could you try running fastpathology with verbose enabled, unless that was already the case, like so:
/Applications/FastPathology.app/Contents/MacOS/bin/fastpathology --verbose
Thanks for spending some time on this! It also fails on CMU-1.svs directly loaded from fastpathology. The last output was indeed with verbose on.
Very strange. I have no idea what is causing this. Debugging is also challenging as I cannot reproduce the issue myself, but it is apparent that something is not working, as you have observed the same bug on two separate machines.
I will assign @smistad to this issue. He is likely off for the weekend, but may reply early next week.
Lastly, could you check that glxgears and xeyes render properly? You might need to brew install glxgears.
I just ran a small test where I deleted both the FAST/ and FastPathology/ directories created by FastPathology and reran the test. By accident I had forgotten to set "Integrated only", which resulted in me managing to reproduce the bug you reported (<program source>:13:3: error: expected expression (...)).
After forcing the OS to use the integrated GPU instead, it worked fine. So I do not think there is necessarily a FAST-bug, but rather that for whatever reason your CPU-GPU setup is not working as it should. Perhaps there could be an OpenCL-OpenGL interop issue.
Could you show me the verbose output you get from launching FastPathology from the terminal (with --verbose, of course)? There might be some info there that could help me in understanding the issue.
Unfortunately I am also out for the weekend with no access to the Mac computer. Will test on Monday! Thanks again
Hi
OpenCL is installed correctly; Apple has their own implementation, which is always installed. The error comes from Apple's OpenCL failing to compile some OpenCL code. Since other OpenCL implementations are able to compile this, it points to a bug in Apple's code (not the first time...).
I think it fails to compile erosion.cl https://github.com/smistad/FAST/blob/master/source/FAST/Algorithms/Morphology/Erosion.cl
Also I think this is the same issue we have seen on AMD Macs.
This can probably be resolved by rewriting the code it fails on. You can try editing the erosion.cl file yourself. It will try to recompile every time you run it. To fix it myself I need access to a Mac with this issue.
Hi @smistad, Thanks, after some testing, I think it actually fails to compile ImageFill.cl https://github.com/smistad/FAST/blob/master/source/FAST/ImageFill.cl
Apparently, macOS doesn't like having } else { on one line.
Splitting it over two lines fixed this issue:
}
else {
Don't ask me why 🤷
I will do some more testing to see if something else needs fixing.
The Apple OpenCL compiler is not completely stable..
There are some } else { statements in Erosion.cl as well, which is used in TissueSegmentation.
Does it not have issues with this?
Let us know if you find more of these compile issues and we can add the fixes to FAST
After fixing the missing newline in ImageFill.cl, I didn't get any other compilation errors. I tested a couple of pipelines, and they worked well. But I am not sure they actually ran the erosion.cl kernel. Would you have a minimal pipeline example running erosion?
(Edit: I see you wrote that TissueSegmentation uses Erosion.cl, but I had no issue with it. So probably means it's only in ImageFill.cl for some reason.)
Actually, the space between } and else in ImageFill.cl is not a simple space, but the character U+00A0 (no-break space).
That explains why macOS is failing.
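A stray character like this is invisible in most editors, so a small scan of the kernel sources can help find others. Below is a minimal Python sketch, not something from FAST itself: the helper name find_non_ascii is made up for illustration, and it simply reports every character outside the ASCII range with its position and Unicode name.

```python
import unicodedata


def find_non_ascii(source):
    """Return (line, column, codepoint, name) for each non-ASCII character.

    Line and column numbers are 1-based, matching compiler diagnostics.
    """
    hits = []
    for line_no, line in enumerate(source.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                hits.append(
                    (line_no, col, f"U+{ord(ch):04X}",
                     unicodedata.name(ch, "UNKNOWN"))
                )
    return hits
```

To check a file, read it as UTF-8 and pass the text in, for example find_non_ascii(open(path, encoding="utf-8").read()), where path points at the .cl file of interest. An empty list means the file is pure ASCII.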
Doh! :facepalm:
I added a fix for it to FAST just now: https://github.com/smistad/FAST/commit/167f5cb6b3118fd73169817f0a0650bf654837b0 Will rebuild FastPathology with the fix soon.
Thanks for figuring this out @cavenel
@andreped Could you test if this was the cause of the crashes on AMD Macs as well?
This build should contain the fix: https://github.com/AICAN-Research/FAST-Pathology/actions/runs/6072324757
Works wonders on both AMD GPU and dedicated Intel GPU now, @smistad :] Great job, and good catch, @cavenel!
Scale bar is also working fine!
@cavenel Could you verify that the latest release of FP (v1.1.2) works out-of-the-box on your macbook? https://github.com/AICAN-Research/FAST-Pathology/releases/tag/v1.1.2
Yes, I can confirm it now works out-of-the-box with version 1.1.2! Thanks for the fast fix! Will you also add the arm64 asset?
Yes, I will compile it for arm64 as well, but I have to do it manually still since github doesn't offer arm64 macOS runners yet
@cavenel I just uploaded a macOS arm64 package to the release. I haven't been able to test it myself, so let me know if it doesn't work for some reason.
Not sure yet. I never tested on M1 before, but now I get stuck at launch: it cannot find libomp.dylib:
dyld[26049]: Library not loaded: /opt/homebrew/opt/libomp@14.0.6/lib/libomp.dylib
Referenced from: <2780B264-9867-3136-9A51-CEE6AE3D1F61> /Applications/FastPathology.app/Contents/MacOS/lib/libFAST.4.7.1.dylib
Reason: tried: '/opt/homebrew/opt/libomp@14.0.6/lib/libomp.dylib' (no such file), '/System/Volumes/Preboot/Cryptexes/OS/opt/homebrew/opt/libomp@14.0.6/lib/libomp.dylib' (no such file), '/opt/homebrew/opt/libomp@14.0.6/lib/libomp.dylib' (no such file), '/usr/local/lib/libomp.dylib' (no such file), '/usr/lib/libomp.dylib' (no such file, not in dyld cache)
Abort trap: 6
I did install libomp from homebrew:
libomp 16.0.6 is already installed and up-to-date.
and libomp.dylib is in /opt/homebrew/opt/libomp/lib/:
bash-3.2$ ls -al /opt/homebrew/opt/libomp/lib/
total 3144
drwxr-xr-x 4 cavenel admin 128 Sep 7 11:54 .
drwxr-xr-x 6 cavenel admin 192 Sep 7 11:54 ..
-r--r--r-- 1 cavenel admin 880472 Jun 11 00:58 libomp.a
-r--r--r-- 1 cavenel admin 726480 Sep 7 11:54 libomp.dylib
I wonder if that's because fastpathology is looking for v14.0.6 specifically and homebrew installed 16.0.6 instead.
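One way to check that guess is to pull the expected path out of the dyld message and see whether it carries a version pin. This is a minimal Python sketch under that assumption; the helper names missing_library and pinned_version are made up for illustration, and the "@<version>" pattern is the Homebrew-style naming seen in the error above.

```python
import re


def missing_library(dyld_message):
    """Return the path after 'Library not loaded:', or None if absent."""
    match = re.search(r"Library not loaded: (\S+)", dyld_message)
    return match.group(1) if match else None


def pinned_version(path):
    """Return the version in a Homebrew-style 'name@1.2.3' path component,
    or None when the path is not version-pinned."""
    match = re.search(r"@(\d+(?:\.\d+)*)", path)
    return match.group(1) if match else None
```

If pinned_version returns a version that differs from the one Homebrew reports as installed, the binary was linked against a versioned path that no longer exists on the machine.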
@cavenel This is related to issue https://github.com/AICAN-Research/FAST-Pathology/issues/73
Basically, we do not yet bundle these dependencies as part of FastPathology. They are installed separately, so if they get updated, FP might no longer work.
@smistad should know which versions of these dependencies you should install, and then reinstalling (downgrading/upgrading) should resolve the issue. There are likely other deps that could have issues, but for this specific dependency, you could try running:
brew install libomp@14.0.6
If this exact formula is not available, try another version 14.x.y.
Unfortunately, libomp 16.0.6 seems to be the only available version in brew for arm64. But it is unrelated to this specific issue here, so we can probably move to #73!
Yes, brew is very annoying: it suddenly updates packages and then does not support installing older versions. But the plan is to package all the dependencies on macOS to avoid having to use brew at all. This is what we do on Windows and Ubuntu, and it is on my todo list for macOS.
The arm64 build I uploaded was built with libomp 16.0.6. The x86_64 version build is with v14 of libomp.
But then the error message @cavenel gets for M1 doesn't make sense, as it looks like FP expects libomp v14.
dyld[26049]: Library not loaded: /opt/homebrew/opt/libomp@14.0.6/lib/libomp.dylib
Referenced from: <2780B264-9867-3136-9A51-CEE6AE3D1F61> /Applications/FastPathology.app/Contents/MacOS/lib/libFAST.4.7.1.dylib
Reason: tried: '/opt/homebrew/opt/libomp@14.0.6/lib/libomp.dylib' (no such file), '/System/Volumes/Preboot/Cryptexes/OS/opt/homebrew/opt/libomp@14.0.6/lib/libomp.dylib' (no such file), '/opt/homebrew/opt/libomp@14.0.6/lib/libomp.dylib' (no such file), '/usr/local/lib/libomp.dylib' (no such file), '/usr/lib/libomp.dylib' (no such file, not in dyld cache)
Abort trap: 6
You are right, my bad, it's version 14... We need to start bundling these dependencies to get out of this dependency hell on Mac.
I fixed it for now with:
ln -s /opt/homebrew/opt/libomp /opt/homebrew/opt/libomp@14.0.6
Which is kind of ugly and might crash if the two versions differ too much. But at least fastpathology is starting, and I was able to load models and images from tests, and run pipelines (segment nuclei and tissue). So I would say that the fix for OpenCL is ok!
PS: I have another project using openslide and qt6, and I would really like to have an installer for Mac, as we already have one for Windows. So if you manage to bundle all dependencies and make a dmg out of it, that would be really interesting for me too :-D
Hi,
Not sure if it is an issue with FastPathology or a problem with the installation of OpenCL, but I have a user who got this error message after following all the installation steps on macOS:
And the terminal shows:
Is there any way to check if OpenCL is installed properly on MacOS?