jamesdolezal / slideflow

Deep learning library for digital pathology, with both Tensorflow and PyTorch support.
https://slideflow.dev
GNU General Public License v3.0

[BUG] Slideflow Studio crashes on M1 Mac, Mac installation bugs #283

Open skochanny opened 1 year ago

skochanny commented 1 year ago

Description

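On my M1 Mac, after installing slideflow via conda and pip, Slideflow Studio crashes: first with a segmentation fault on startup, and then, once that is worked around, with a bus error while the window is open.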

To Reproduce

Steps to reproduce the behavior:

  1. Installing packages

Since I thought this was initially about package conflicts, I essentially started from scratch on my Mac with everything. I reinstalled conda & cleaned its cache, cleaned pip's cache, reinstalled homebrew, reinstalled xcode, and updated my Mac to the latest version. Then, I installed slideflow. As prereqs I needed the xcode dev kit installed, as well as libvips, which I installed with Homebrew.

    xcode-select --install
    brew update
    brew upgrade
    brew install vips

Then installed the conda environment in python3.8 with additional dependencies pyqt imagecodecs and pyqtgraph which are required for Studio. Then installed the tensorflow and pytorch requirements with conda as well. Finally, cloned slideflow's git repo, switched to the dev-2.1 branch. In the requirements file I commented out cplex, smac, and cucim as they aren't available for the Mac M1.

    conda create -y -n sf python=3.8 pyqt imagecodecs pyqtgraph numcodecs

(installed these upon env creation else if trying during pip install slideflow PyQt5.5.15.9 package is downloaded and that wheel is not compiled for M1.)

    conda activate sf
    conda install pytorch::pytorch torchvision torchaudio -c pytorch
    conda install -c apple -y tensorflow-deps
    python3 -m pip install tensorflow-macos tensorflow-metal
    python3 -m pip install cffi

(otherwise error with pyvips)

    git clone https://github.com/jamesdolezal/slideflow.git
    cd slideflow
    git checkout dev-2.1
    python3 -m pip install -r requirements.txt

Full environment list: sf_env_m1mac.txt

  2. Segfault error with glfwGetMonitorWorkarea

I started Slideflow Studio with python3 slideflow-studio.py but got a Segmentation Fault 11. The full crash report from Apple is attached as segfault_glfw.txt. The thread crashed at glfwGetMonitorWorkarea:

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
    0 libglfw.3.dylib 0x114b6b878 glfwGetMonitorWorkarea + 20
    1 libffi.8.dylib 0x10569004c ffi_call_SYSV + 76
    2 libffi.8.dylib 0x10568d74c ffi_call_int + 1208
    3 _ctypes.cpython-38-darwin.so 0x1056c0574 _ctypes_callproc + 1176
    4 _ctypes.cpython-38-darwin.so 0x1056ba8c8 PyCFuncPtr_call + 1164
    5 python3.8 0x1045c3af0 _PyObject_MakeTpCall + 744
    6 python3.8 0x1046b0158 call_function + 612
    7 python3.8 0x1046ac83c _PyEval_EvalFrameDefault + 27176

I used a Python3 interactive shell to verify that OpenGL, GLFW, and Slideflow could be successfully imported. I also used the Xquartz OpenGL test programs to verify that OpenGL/GLFW libraries on my Mac's system were installed and working properly. I also downloaded cellpose and napari and tried out their GUIs, and they worked without issue. I then began troubleshooting slideflow-studio.py to try and narrow down the issue.
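The import check described above can be scripted so that it is easy to repeat on other machines. This is only a sketch: the helper name `check_imports` is hypothetical, and the module names in the usage note (`OpenGL` for PyOpenGL, `glfw`, `slideflow`) are assumed from the packages mentioned in this report.

```python
import importlib


def check_imports(names):
    """Try to import each module; map its name to None on success or to the error text."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = None
        except ImportError as exc:
            # Covers ModuleNotFoundError and failures to load a compiled extension.
            results[name] = str(exc)
    return results
```

Calling `check_imports(["OpenGL", "glfw", "slideflow"])` inside the `sf` environment should return `None` for each module that imports cleanly. A clean import only shows that the libraries load, not that every native call is safe, which matches what was observed here.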

I found that the segfault was being caused by some issue with a single function in the slideflow code: glfw.get_monitor_workarea() which is used to get the width/height of the monitor. It’s used in both the slideflow-studio.py file (for the splash screen) as well as in studio._glfw.py to initialize the main window. I noticed that this was the only thing that it was used for, so I went ahead and commented out those lines and replaced them with the monitor’s width & height (1400, 900), which I got from glfw.get_video_mode() used right below the glfw.get_monitor_workarea() part of the code in slideflow-studio.py. Then it worked, and the splash screen came up, but it still crashed when trying to initialize the main window. So I also did the same thing in studio._glfw.py and then the main slideflow window popped up. I can click around, and even load in a slide. It’s fast, and it works. So it appears that, aside from this single function, OpenGL and GLFW are working properly, but there is some bug with either glfw.get_monitor_workarea() from the glfw python binding or possibly from the base GLFW glfwGetMonitorWorkarea.
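A minimal sketch of that workaround, written as a helper instead of hardcoded numbers. It assumes the pyGLFW binding, where `glfw.get_monitor_workarea(monitor)` returns `(x, y, width, height)` and `glfw.get_video_mode(monitor)` returns an object with a `size` attribute. The names `is_apple_silicon` and `monitor_size` are hypothetical and not part of slideflow. A segfault in native code cannot be caught with `try`/`except`, so the sketch avoids the call on Apple Silicon rather than trying to recover from it.

```python
import platform


def is_apple_silicon():
    """Return True when running natively on an arm64 Mac (assumed to be the affected setup)."""
    return platform.system() == "Darwin" and platform.machine() == "arm64"


def monitor_size(glfw_module, monitor, avoid_workarea=None):
    """Return (width, height) for a monitor.

    glfw_module is the imported glfw package (passed in so this can be tested
    without a display). When avoid_workarea is True, the video mode size is used
    instead of glfw.get_monitor_workarea().
    """
    if avoid_workarea is None:
        avoid_workarea = is_apple_silicon()
    if not avoid_workarea:
        _x, _y, width, height = glfw_module.get_monitor_workarea(monitor)
        return width, height
    # Fallback: full resolution of the current video mode.
    mode = glfw_module.get_video_mode(monitor)
    width, height = mode.size
    return width, height
```

With the real library this would be called as `monitor_size(glfw, glfw.get_primary_monitor())` after `glfw.init()`. The video mode reports the full screen resolution, so unlike the work area it does not subtract the menu bar or Dock.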

  3. Bus error with ImageIO

However… then I came to another issue: a Bus error which crashes the program. Sometimes it appears to happen randomly, but it is definitely triggered when I try to use File > Open Slide or File > Open Project, and I noticed it also occurs when I stretch the window too far beyond the boundaries of my monitor. The Open actions require a new window to pop up, and stretching the window could also involve querying the monitor size. So once again, it feels like it comes down to this whole “get monitor workarea” bug. But in the most recent traceback, the crash doesn’t happen in that specific GLFW function. It happens in ImageIO, which is an Apple framework, or perhaps rather in a thread dispatch call. bus_error.txt

Crashed Thread: 0 Dispatch queue: com.apple.main-thread

Exception Type: EXC_BAD_ACCESS (SIGBUS)
Exception Codes: UNKNOWN_0x101 at 0x000000000bad4007
Exception Codes: 0x0000000000000101, 0x000000000bad4007

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
    0 ??? 0xbad4007 ???
    1 ImageIO 0x1a06ec208 IIOReadPlugin::callInitialize() + 136
    2 ImageIO 0x1a06ebf1c IIO_Reader::initImageAtOffset(CGImagePlugin, unsigned long, unsigned long, unsigned long) + 124
    3 ImageIO 0x1a06e97b0 IIOImageSource::makeImagePlus(unsigned long, IIODictionary) + 808
    4 ImageIO 0x1a06f5aa0 IIOImageSource::createImageAtIndex(unsigned long, IIODictionary*) + 80
    5 ImageIO 0x1a06f5970 CGImageSourceCreateImageAtIndex + 276
    6 HIServices 0x19c2bd12c setCursorFromBundle + 1488
    7 HIServices 0x19c2bc424 CoreCursorSetAndReturnSeed + 204
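The Apple crash report only shows native frames, so it does not say which Python call led into ImageIO. One way to narrow that down, sketched here as a suggestion rather than something slideflow already does, is Python's standard `faulthandler` module, which prints the Python traceback when the process receives SIGSEGV or SIGBUS. The helper name `enable_crash_tracebacks` is hypothetical.

```python
import faulthandler
import sys


def enable_crash_tracebacks(stream=None):
    """Dump Python tracebacks for all threads on fatal signals such as SIGSEGV and SIGBUS.

    stream must be an open file with a file descriptor and must stay open
    while the handler is enabled; it defaults to sys.stderr.
    """
    target = sys.stderr if stream is None else stream
    faulthandler.enable(file=target, all_threads=True)
    return faulthandler.is_enabled()
```

Calling this near the top of the launch script should make the next crash also print the Python-level call stack. The same effect is available without code changes by running `python3 -X faulthandler slideflow-studio.py`.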

Expected behavior

Slideflow Studio to open

Environment:

jamesdolezal commented 1 year ago

Thanks, Sara. The error cannot be reproduced in all environments, which makes me think this could be a software dependency issue. I will start by comparing software versions in detail against M1/M2 systems that work, to see if I can track down potential culprits. Can you upload your conda environment so I can reproduce it locally?

skochanny commented 1 year ago

environment.txt I had to change the file extension to .txt so GitHub would let me upload it, but if you change it back to .yml it should work.