RuntimeError: Could not infer dtype of numpy.int64

rdrighetto commented 10 months ago

Hi, I'm trying to run topaz denoise3d for the first time and getting the following error:

diogori@worker08:tomo2_L1G1$ topaz denoise3d -o topaz_unet-3d -m unet-3d tomo2_L1G1-dose_filt.mrc -d 1
/scicore/home/engel0006/GROUP/pool-engel/soft/topaz/topaz_env/lib/python3.6/site-packages/torch/package/_directory_reader.py:17: UserWarning: Failed to initialize NumPy: No module named 'numpy.core._multiarray_umath' (Triggered internally at  /opt/conda/conda-bld/pytorch_1640811701593/work/torch/csrc/utils/tensor_numpy.cpp:68.)
  _dtype_to_storage = {data_type(0).dtype: data_type for data_type in _storages}
# loading pretrained model: unet-3d-10a-v0.2.4.sav
# denoising with patch size=96 and padding=48
Traceback (most recent call last):
  File "/scicore/home/engel0006/GROUP/pool-engel/soft/topaz/topaz_env/bin/topaz", line 33, in <module>
    sys.exit(load_entry_point('topaz-em==0.2.5', 'console_scripts', 'topaz')())
  File "/scicore/home/engel0006/GROUP/pool-engel/soft/topaz/topaz_env/lib/python3.6/site-packages/topaz/main.py", line 148, in main
    args.func(args)
  File "/scicore/home/engel0006/GROUP/pool-engel/soft/topaz/topaz_env/lib/python3.6/site-packages/topaz/commands/denoise3d.py", line 773, in main
    , total_volumes=total
  File "/scicore/home/engel0006/GROUP/pool-engel/soft/topaz/topaz_env/lib/python3.6/site-packages/topaz/commands/denoise3d.py", line 665, in denoise
    for index,x in batch_iterator:
  File "/scicore/home/engel0006/GROUP/pool-engel/soft/topaz/topaz_env/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/scicore/home/engel0006/GROUP/pool-engel/soft/topaz/topaz_env/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 561, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/scicore/home/engel0006/GROUP/pool-engel/soft/topaz/topaz_env/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 52, in fetch
    return self.collate_fn(data)
  File "/scicore/home/engel0006/GROUP/pool-engel/soft/topaz/topaz_env/lib/python3.6/site-packages/torch/utils/data/_utils/collate.py", line 84, in default_collate
    return [default_collate(samples) for samples in transposed]
  File "/scicore/home/engel0006/GROUP/pool-engel/soft/topaz/topaz_env/lib/python3.6/site-packages/torch/utils/data/_utils/collate.py", line 84, in <listcomp>
    return [default_collate(samples) for samples in transposed]
  File "/scicore/home/engel0006/GROUP/pool-engel/soft/topaz/topaz_env/lib/python3.6/site-packages/torch/utils/data/_utils/collate.py", line 64, in default_collate
    return default_collate([torch.as_tensor(b) for b in batch])
  File "/scicore/home/engel0006/GROUP/pool-engel/soft/topaz/topaz_env/lib/python3.6/site-packages/torch/utils/data/_utils/collate.py", line 64, in <listcomp>
    return default_collate([torch.as_tensor(b) for b in batch])
RuntimeError: Could not infer dtype of numpy.int64

And here is the MRC file header:

diogori@worker08:tomo2_L1G1$ header tomo2_L1G1-dose_filt.mrc

 RO image file on unit   1 : tomo2_L1G1-dose_filt.mrc     Size=    1560897 K

 Number of columns, rows, sections .....     928     928     464
 Map mode ..............................    2   (32-bit float)             
 Start cols, rows, sects, grid x,y,z ...    0     0     0     928    928    464
 Pixel spacing (Angstroms)..............   14.08      14.08      14.08    
 Cell angles ...........................   90.000   90.000   90.000
 Fast, medium, slow axes ...............    X    Y    Z
 Origin on x,y,z ..(inverted_in_file_)..    0.000       0.000       3267.    
 Minimum density .......................  -17.010    
 Maximum density .......................   13.213    
 Mean density ..........................  0.73609E-01
 tilt angles (original,current) ........  90.0   0.0   0.0   0.0   0.0   0.0
 Space group,# extra bytes,idtype,lens .        1        0        0        0

     8 Titles :
TOMOMAN: Frames aligned with MotionCor2.                                       
TOMOMAN: Exposure filtered on images with 0.8002 e/(A^2)/s                     
CCDERASER: Bad points replaced with interpolated values 28-Oct-20  10:07:05    
NEWSTACK: Images copied, transformed                    19-Aug-22  14:49:48    
ctfPhaseFlip: CTF correction with phase flipping only   19-Aug-22  14:50:13    
TILT: Tomographic reconstruction                        19-Aug-22  14:50:56    
NEWSTACK: Images copied                                 19-Aug-22  14:54:47    
clip: rotx - rotation by -90 around X                   19-Aug-22  14:54:56

I'm on CentOS 7.9.2 trying to run it on NVIDIA A40 GPUs. Please find below my conda environment:

(/scicore/home/engel0006/GROUP/pool-engel/soft/topaz/topaz_env) diogori@worker08:topaz$ conda list
# packages in environment at /scicore/home/engel0006/GROUP/pool-engel/soft/topaz/topaz_env:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main  
_openmp_mutex             5.1                       1_gnu  
blas                      1.0                         mkl  
bzip2                     1.0.8                h7b6447c_0  
ca-certificates           2023.08.22           h06a4308_0  
certifi                   2021.5.30        py36h06a4308_0  
cuda-cudart               11.8.89                       0    nvidia
cuda-cupti                11.8.87                       0    nvidia
cuda-libraries            11.8.0                        0    nvidia
cuda-nvrtc                11.8.89                       0    nvidia
cuda-nvtx                 11.8.86                       0    nvidia
cuda-runtime              11.8.0                        0    nvidia
cudatoolkit               11.8.0               h6a678d5_0  
dataclasses               0.8                pyh4f3eec9_6  
ffmpeg                    4.3                  hf484d3e_0    pytorch
freetype                  2.12.1               h4a9f257_0  
future                    0.18.2                   py36_1  
giflib                    5.2.1                h5eee18b_3  
gmp                       6.2.1                h295c915_3  
gnutls                    3.6.15               he1e5248_0  
intel-openmp              2022.1.0          h9e868ea_3769  
joblib                    1.0.1              pyhd3eb1b0_0  
jpeg                      9e                   h5eee18b_1  
lame                      3.100                h7b6447c_0  
lcms2                     2.12                 h3be6417_0  
ld_impl_linux-64          2.38                 h1181459_1  
lerc                      3.0                  h295c915_0  
libcublas                 11.11.3.6                     0    nvidia
libcufft                  10.9.0.58                     0    nvidia
libcufile                 1.7.2.10                      0    nvidia
libcurand                 10.3.3.141                    0    nvidia
libcusolver               11.4.1.48                     0    nvidia
libcusparse               11.7.5.86                     0    nvidia
libdeflate                1.17                 h5eee18b_0  
libffi                    3.3                  he6710b0_2  
libgcc-ng                 11.2.0               h1234567_1  
libgfortran-ng            7.5.0               ha8ba4b0_17  
libgfortran4              7.5.0               ha8ba4b0_17  
libgomp                   11.2.0               h1234567_1  
libiconv                  1.16                 h7f8727e_2  
libidn2                   2.3.4                h5eee18b_0  
libnpp                    11.8.0.86                     0    nvidia
libnvjpeg                 11.9.0.86                     0    nvidia
libpng                    1.6.39               h5eee18b_0  
libstdcxx-ng              11.2.0               h1234567_1  
libtasn1                  4.19.0               h5eee18b_0  
libtiff                   4.5.1                h6a678d5_0  
libunistring              0.9.10               h27cfd23_0  
libuv                     1.44.2               h5eee18b_0  
libwebp                   1.2.4                h11a3e52_1  
libwebp-base              1.2.4                h5eee18b_1  
lz4-c                     1.9.4                h6a678d5_0  
mkl                       2020.2                      256  
mkl-service               2.3.0            py36he8ac12f_0  
ncurses                   6.4                  h6a678d5_0  
nettle                    3.7.3                hbbd107a_1  
numpy                     1.14.2           py36hdbf6ddf_0  
olefile                   0.46               pyhd3eb1b0_0  
openh264                  2.1.1                h4ff587b_0  
openssl                   1.1.1w               h7f8727e_0  
pandas                    0.25.3           py36he6710b0_0  
pillow                    8.3.1            py36h5aabda8_0  
pip                       21.2.2           py36h06a4308_0  
python                    3.6.13               h12debd9_1  
python-dateutil           2.8.2              pyhd3eb1b0_0  
pytorch                   1.10.2              py3.6_cpu_0    pytorch
pytorch-cuda              11.8                 h7e8668a_5    pytorch
pytorch-mutex             1.0                         cpu    pytorch
pytz                      2021.3             pyhd3eb1b0_0  
readline                  8.2                  h5eee18b_0  
scikit-learn              0.22.1           py36hd81dba3_0  
scipy                     1.3.2            py36h7c811a0_0  
setuptools                58.0.4           py36h06a4308_0  
six                       1.16.0             pyhd3eb1b0_1  
sqlite                    3.41.2               h5eee18b_0  
tk                        8.6.12               h1ccaba5_0  
topaz                     0.2.5                      py_0    tbepler
torchaudio                0.10.2                 py36_cpu    pytorch
torchvision               0.11.3                 py36_cpu    pytorch
typing_extensions         4.1.1              pyh06a4308_0  
wheel                     0.37.1             pyhd3eb1b0_0  
xz                        5.4.2                h5eee18b_0  
zlib                      1.2.13               h5eee18b_0  
zstd                      1.5.5                hc292b87_0

Any ideas how to fix this?

Thank you!

DarnellGranberry commented 10 months ago

I believe I've found the source of the issue. We create a torch DataLoader from the PatchDataset, but that dataset returns numpy arrays and not torch tensors. Working on a quick fix now.
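For illustration, the pattern described above can be sketched as follows. This is not topaz's actual code: `TensorPatchDataset`, its slicing logic, and the toy volume are hypothetical stand-ins. The sketch assumes a dataset whose indices come from a NumPy array (so each index is a `numpy.int64`), and converts both the index and the patch explicitly before they reach the DataLoader's default collate function. It only helps if torch's NumPy support initialized correctly, which the `Failed to initialize NumPy` warning in the log above suggests was not the case here.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, Dataset


class TensorPatchDataset(Dataset):
    """Hypothetical stand-in for a patch dataset (not topaz's class)."""

    def __init__(self, volume, patch_size):
        self.volume = volume
        self.patch_size = patch_size
        # Start offsets along the first axis, stored as a NumPy int64 array.
        self.starts = np.arange(0, volume.shape[0], patch_size, dtype=np.int64)

    def __len__(self):
        return len(self.starts)

    def __getitem__(self, i):
        start = self.starts[i]  # a numpy.int64 scalar, not a Python int
        patch = self.volume[start:start + self.patch_size]
        # Convert explicitly so the default collate function only sees
        # built-in ints and torch tensors, never NumPy scalar types.
        return int(start), torch.from_numpy(np.ascontiguousarray(patch))


# Toy data: an 8x4x4 float32 volume cut into slabs of 2 along the first axis.
volume = np.zeros((8, 4, 4), dtype=np.float32)
loader = DataLoader(TensorPatchDataset(volume, patch_size=2), batch_size=2)
index, x = next(iter(loader))
print(index, x.shape)
```

With these toy sizes, the first batch holds the start offsets 0 and 2 and a float32 tensor of shape (2, 2, 4, 4).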

DarnellGranberry commented 10 months ago

This actually doesn't appear to be an issue with later versions of pytorch (1.13.1) and numpy (1.24.3). Also, when I try to install those in a new environment, conda tells me that your numpy version is too low for your pytorch version.

Could you update numpy (and pytorch if that doesn't work) and try it again?
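Before and after updating, it can help to confirm which versions the Python interpreter in the active environment actually imports. A minimal sketch, using only the standard library so that it also runs on Python 3.6; the helper name `report_versions` is made up for this example:

```python
import importlib


def report_versions(names=("numpy", "torch")):
    """Map each package name to its __version__, or None if it cannot be imported."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None  # package missing or broken in this environment
        else:
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


for name, version in report_versions().items():
    print(name, version)
```

Run it with the same interpreter that the `topaz` entry point uses, otherwise it may report a different environment.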

rdrighetto commented 10 months ago

OK, the issue was that I was using python=3.6 (as per the installation instructions). After switching to python=3.8, I was able to get newer versions of pytorch and numpy, among other things, as well as the GPU builds of the pytorch libraries, which somehow had not been installed before (the old environment above lists the CPU build, pytorch 1.10.2 py3.6_cpu_0). In summary, this is what did the trick:

mamba remove torchaudio
mamba install python==3.8
mamba install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
mamba install topaz -c tbepler -c pytorch
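One way to sanity-check the rebuilt environment before rerunning `topaz denoise3d` is sketched below. It assumes that numpy and torch both import, and `numpy_bridge_works` is a hypothetical helper name. It repeats the conversion that failed in the traceback (`torch.as_tensor` on a `numpy.int64`) and reports whether torch can see a GPU.

```python
import numpy as np
import torch


def numpy_bridge_works():
    """Return True if torch can build a tensor from a NumPy scalar."""
    try:
        return torch.as_tensor(np.int64(3)).item() == 3
    except (RuntimeError, TypeError):
        # The failure in this thread surfaced as a RuntimeError.
        return False


print("torch", torch.__version__, "| numpy", np.__version__)
print("NumPy scalar -> tensor:", numpy_bridge_works())
print("CUDA available:", torch.cuda.is_available())
```

If the second line prints False, torch still cannot convert NumPy scalars and the original error is likely to come back. If the third prints False, either a CPU-only build of pytorch is still installed or no GPU is visible on the current node.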

Thank you!

tbepler / topaz

RuntimeError: Could not infer dtype of numpy.int64 #175