MannLabs / alphatims

An open-source Python package for efficient accession and visualization of Bruker TimsTOF raw data from the Mann Labs at the Max Planck Institute of Biochemistry.
https://doi.org/10.1016/j.mcpro.2021.100149
Apache License 2.0

Unable to read data folder on some systems #250

liquidcarbon closed 1 year ago

liquidcarbon commented 1 year ago

I'm getting a strange error on a new system where I'm not completely in control of the OS (it's some flavor of containerized Debian). This is happening on alphatims 1.0.6 and 1.0.5.

Any ideas on troubleshooting this?

Alphatims installs without problems via pip, including the Bruker DLL:

from alphatims.bruker import BRUKER_DLL_FILE_NAME
!ls -l $BRUKER_DLL_FILE_NAME

-rw-r--r-- 1 root root 18668200 Jan  4 05:07 /usr/local/lib/python3.10/site-packages/alphatims/ext/timsdata.so

Then it starts reading the files and tqdm recognizes the correct number of frames, but then crashes:

from alphatims.bruker import TimsTOF
data = TimsTOF("/lipidomics.d")

100%|##########| 11106/11106 [00:13<00:00, 823.93it/s]
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
File /usr/local/lib/python3.10/site-packages/alphatims/bruker.py:131, in open_bruker_d_folder(bruker_d_folder_name, bruker_dll_file_name)
    130 if isinstance(bruker_dll_file_name, str):
--> 131     bruker_dll = init_bruker_dll(bruker_dll_file_name)
    132 logging.info(f"Opening handle for {bruker_d_folder_name}")
File /usr/local/lib/python3.10/site-packages/alphatims/bruker.py:67, in init_bruker_dll(bruker_dll_file_name)
     66 import ctypes
---> 67 bruker_dll = ctypes.cdll.LoadLibrary(
     68     os.path.realpath(bruker_dll_file_name)
     69 )
     70 bruker_dll.tims_open.argtypes = [ctypes.c_char_p, ctypes.c_uint32]
File /usr/local/lib/python3.10/ctypes/__init__.py:452, in LibraryLoader.LoadLibrary(self, name)
    451 def LoadLibrary(self, name):
--> 452     return self._dlltype(name)
File /usr/local/lib/python3.10/ctypes/__init__.py:374, in CDLL.__init__(self, name, mode, handle, use_errno, use_last_error, winmode)
    373 if handle is None:
--> 374     self._handle = _dlopen(self._name, mode)
    375 else:
OSError: libgomp.so.1: cannot open shared object file: No such file or directory
During handling of the above exception, another exception occurred:
UnboundLocalError                         Traceback (most recent call last)
Cell In[6], line 2
      1 from alphatims.bruker import TimsTOF
----> 2 data = TimsTOF("/lipidomics.d")
File /usr/local/lib/python3.10/site-packages/alphatims/bruker.py:1016, in TimsTOF.__init__(self, bruker_d_folder_name, mz_estimation_from_frame, mobility_estimation_from_frame, slice_as_dataframe, use_calibrated_mz_values_as_default, use_hdf_if_available, mmap_detector_events, drop_polarity, convert_polarity_to_int)
   1012     else:
   1013         self.bruker_d_folder_name = os.path.abspath(
   1014             bruker_d_folder_name
   1015         )
-> 1016         self._import_data_from_d_folder(
   1017             bruker_d_folder_name,
   1018             mz_estimation_from_frame,
   1019             mobility_estimation_from_frame,
   1020             drop_polarity,
   1021             convert_polarity_to_int,
   1022             mmap_detector_events,
   1023         )
   1024 elif bruker_d_folder_name.endswith(".hdf"):
   1025     self._import_data_from_hdf_file(
   1026         bruker_d_folder_name,
   1027         mmap_detector_events,
   1028     )
File /usr/local/lib/python3.10/site-packages/alphatims/bruker.py:1114, in TimsTOF._import_data_from_d_folder(self, bruker_d_folder_name, mz_estimation_from_frame, mobility_estimation_from_frame, drop_polarity, convert_polarity_to_int, mmap_detector_events)
   1112 if (mobility_estimation_from_frame != 0) and bruker_dll_available:
   1113     import ctypes
-> 1114     with alphatims.bruker.open_bruker_d_folder(
   1115         bruker_d_folder_name
   1116     ) as (bruker_dll, bruker_d_folder_handle):
   1117         logging.info(
   1118             f"Fetching mobility values from {bruker_d_folder_name}"
   1119         )
   1120         indices = np.arange(self.scan_max_index).astype(np.float64)
File /usr/local/lib/python3.10/contextlib.py:135, in _GeneratorContextManager.__enter__(self)
    133 del self.args, self.kwds, self.func
    134 try:
--> 135     return next(self.gen)
    136 except StopIteration:
    137     raise RuntimeError("generator didn't yield") from None
File /usr/local/lib/python3.10/site-packages/alphatims/bruker.py:140, in open_bruker_d_folder(bruker_d_folder_name, bruker_dll_file_name)
    138 finally:
    139     logging.info(f"Closing handle for {bruker_d_folder_name}")
--> 140     bruker_dll.tims_close(bruker_d_folder_handle)
UnboundLocalError: local variable 'bruker_dll' referenced before assignment
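
Reading the two tracebacks together: the real failure is the first one (the OSError about libgomp.so.1), and the UnboundLocalError only appears because the finally clause runs before bruker_dll was ever assigned. Below is a minimal sketch of that pattern, not the actual alphatims code. The names load_library, open_handle_unguarded, open_handle_guarded and error_name are made up for illustration, and load_library just raises to stand in for the failing ctypes call.

```python
import contextlib


def load_library(name):
    # Hypothetical stand-in for a ctypes load that fails on a missing dependency.
    raise OSError(
        "libgomp.so.1: cannot open shared object file: No such file or directory"
    )


@contextlib.contextmanager
def open_handle_unguarded(name):
    try:
        lib = load_library(name)  # raises before `lib` is bound
        yield lib
    finally:
        lib.close()  # UnboundLocalError here hides the OSError


@contextlib.contextmanager
def open_handle_guarded(name):
    lib = None
    try:
        lib = load_library(name)
        yield lib
    finally:
        # Only clean up if the library was actually loaded.
        if lib is not None:
            lib.close()


def error_name(manager):
    # Return the class name of whatever exception escapes the with statement.
    try:
        with manager("timsdata.so"):
            pass
    except Exception as error:
        return type(error).__name__


print(error_name(open_handle_unguarded))  # UnboundLocalError
print(error_name(open_handle_guarded))    # OSError
```

With the guarded variant, the error that reaches the user is the OSError naming the missing library, which is the line that actually points at the fix.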

The data files were retrieved like this:

import os
from ftplib import FTP

def start_ftp():
    # Log in anonymously and change into the sample folder.
    sample_path = "MSV000084402/raw/SRM1950_20min_88_01_6950.d"
    ftp = FTP("massive.ucsd.edu")
    ftp.login()
    ftp.cwd(sample_path)
    return ftp

ftp = start_ftp()
os.makedirs("lipidomics.d", exist_ok=True)
# Download the two files that make up the .d folder.
for file_name in ("analysis.tdf_bin", "analysis.tdf"):
    with open(os.path.join("lipidomics.d", file_name), "wb") as f:
        ftp.retrbinary("RETR " + file_name, f.write)
ftp.quit()

liquidcarbon commented 1 year ago

Solved with:

!apt-get update && apt-get install libgomp1

Seems like enforcing these C dependencies could be useful, but I have no idea how to do it. :)
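
For anyone hitting this on another system, a check like the following could surface the missing dependency before alphatims is imported. It is only a sketch: missing_shared_libraries is a hypothetical helper, not part of alphatims, and the library list holds just the one name from the error message above.

```python
import ctypes


def missing_shared_libraries(names):
    """Return the entries of `names` that the dynamic loader cannot open."""
    missing = []
    for name in names:
        try:
            ctypes.CDLL(name)  # asks the system loader to resolve the library
        except OSError:
            missing.append(name)
    return missing


# libgomp.so.1 is the library named in the OSError above.
missing = missing_shared_libraries(["libgomp.so.1"])
if missing:
    print("Missing shared libraries:", ", ".join(missing))
```

This does not install anything or touch the OS; it only reports what the loader cannot find, so the fix (for example the apt-get command above) stays a manual step.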

sander-willems-bruker commented 1 year ago

Thanks for finding this and already providing the fix. While it might be possible to enforce this, I am not a fan of doing so if this cannot be strictly contained in a virtual environment without being invasive on the OS. I have added a minor comment to the troubleshooting section of the readme that includes the apt-get install libgomp1 fix.