nyukat / mammography_metarepository

Meta-repository of screening mammography classifiers
https://arxiv.org/abs/2108.04800
BSD 2-Clause "Simplified" License

Reproduce GMIC's result #17

Closed cyh-0 closed 3 years ago

cyh-0 commented 3 years ago

Hi,

I am trying to reproduce GMIC's results on the CMMD dataset using the checkpoint "sample_model_1.p". I could not complete the evaluation with the Docker file, so I wrote my own data loader (copying a few functions from the GMIC repo, such as Crop_mammogram, get_optimal_center, ...) to run the evaluation. All the DICOM files are converted to 16-bit PNG and flipped according to the side.

I failed to get the same AUC (82.5) as reported in the paper; my reproduction gives 77.25. I am not sure what went wrong in my implementation, so I have posted the image loader file below. Looking forward to hearing from you.
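When comparing AUC numbers like these, it can help to rule out the metric computation itself. The following is a minimal, dependency-free sketch of AUC using the rank-sum (Mann-Whitney U) formulation; the function name `roc_auc` is hypothetical and this is not the metarepository's scoring code. It returns a value in [0, 1], so a figure such as 82.5 would correspond to 0.825.

```python
def roc_auc(labels, scores):
    """AUC via the rank-sum formulation; tied scores share an average rank.

    labels: iterable of 0/1 ground-truth values.
    scores: iterable of predicted scores (higher means more likely positive).
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n_pos = sum(1 for _, label in pairs if label == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")

    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```

If this and the evaluation script agree on the same prediction file, the gap is more likely in preprocessing than in scoring.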

I am also wondering whether you used the average over 10 random data augmentations (crops) mentioned in the GMIC paper for the evaluation?

Cheers

import os
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
from src.cropping.crop_mammogram import crop_mammogram_one_image, crop_img_from_largest_connected, image_orientation
from src.optimal_centers.get_optimal_centers import extract_center
from src.data_loading.loading import process_image
from multiprocessing import Pool
import multiprocessing

def crop_mammogram_one_image_mod(image, scan, num_iterations, buffer_size):
    """
    Crops a mammogram that is already loaded as an array and returns the cropped image together with the following cropping information:
        - window_location: location of cropping window w.r.t. original dicom image so that segmentation
           map can be cropped in the same way for training.
        - rightmost_points: rightmost nonzero pixels after correctly being flipped
        - bottommost_points: bottommost nonzero pixels after correctly being flipped
        - distance_from_starting_side: number of zero columns between the start of the image and start of
           the largest connected component w.r.t. original dicom image.
    """

    try:
        # error detection using erosion. Also get cropping information for this image.
        cropping_info = crop_img_from_largest_connected(
            image,
            image_orientation(scan['horizontal_flip'], scan['side']),
            True,
            num_iterations,
            buffer_size,
            1 / 3
        )
    except Exception as error:
        print("\n\tFailed to crop image because image is invalid.", str(error))
        # Re-raise so callers do not silently receive None and fail when unpacking.
        raise
    else:
        top, bottom, left, right = cropping_info[0]
        cropped_img = image[top:bottom, left:right]

        return cropping_info, cropped_img

def crop_mammogram_one_image_short_path(scan, input_data_folder, output_data_folder,
                                        num_iterations, buffer_size):
    """
    Crops a mammogram from a short_file_path

    See: crop_mammogram_one_image
    """
    full_input_file_path = os.path.join(input_data_folder, scan['short_file_path'] + '.png')
    full_output_file_path = os.path.join(output_data_folder, scan['short_file_path'].split("/")[-1] + '.png')
    # crop_mammogram_one_image_mod expects a loaded image, not file paths.
    image = np.asarray(Image.open(full_input_file_path))
    result = crop_mammogram_one_image_mod(
        image=image,
        scan=scan,
        num_iterations=num_iterations,
        buffer_size=buffer_size,
    )
    if result is None:
        raise ValueError("Cropping failed for scan: {}".format(scan))
    # Note: the cropped image is not written to full_output_file_path here.
    cropping_info, _cropped_img = result
    return list(zip([scan['short_file_path']] * 4, cropping_info))

def check_view(fn):
    view = fn.split("_")[1].split("-")[-1]
    side = fn.split("_")[2]
    if view == "1" or view == "3":
        return "CC", side
    elif view == "2" or view == "4":
        return "MLO", side
    else:
        raise ValueError("Unknown view {}".format(view))

root = "/mnt/HD/Dataset/CMMD/processed/IMAGE_FLIP"
fn = "D1-0001_1-1_R_16bit"

def read_data(path):
    # img = os.path.join(root, "{}.png".format(fn))
    fn = path.split("/")[-1][:-4]
    img = np.asarray(Image.open(path))

    view, side = check_view(fn)

    scan = {"side":side, "horizontal_flip":"NO"}

    cropped_image_info, cropped_img = crop_mammogram_one_image_mod(
            image=img,
            scan=scan,
            num_iterations=100,
            buffer_size=50,
        )

    datum = {
        "view"                       : view,
        "full_view"                  : "{}-{}".format(side, view), 
        "window_location"            : cropped_image_info[0],
        "rightmost_points"           : cropped_image_info[1],
        "bottommost_points"          : cropped_image_info[2],
        "distance_from_starting_side": cropped_image_info[3],
    }
    opt_center = extract_center(datum, cropped_img)
    train_img = process_image(cropped_img.astype(np.float64), datum["full_view"], opt_center, norm=True)
    return train_img

jwitos commented 3 years ago

Hey @Ch1kara, of course we can look into the pasted code, but before we do, I think it's a good idea to investigate why the Docker approach failed. Our goal was to make sure that the Docker approach works universally so that no one has to re-implement anything. Do you mind sharing what went wrong, what your setup is, etc.? Happy to help with troubleshooting.

Also,

> I am also wondering did you use the average of 10 random data augmentation (crop) method that is mentioned in the GMIC's paper to do the evaluation?

No, test-time augmentations are not implemented in the GMIC code we use for inference.
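For reference only, averaging predictions over several random crops at test time could be sketched as below. This is not part of the inference code used here; `random_crop`, `predict_with_tta`, and the `model` callable are hypothetical names, and images are plain nested lists to keep the sketch dependency-free.

```python
import random


def random_crop(image, crop_h, crop_w, rng):
    """Return a crop_h x crop_w window taken at a random position."""
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - crop_h)    # randint is inclusive on both ends
    left = rng.randint(0, width - crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]


def predict_with_tta(model, image, crop_h, crop_w, n_aug=10, seed=0):
    """Average the model's score over n_aug randomly cropped copies of image.

    model is any callable mapping a cropped image to a single float score.
    """
    rng = random.Random(seed)  # fixed seed keeps the evaluation repeatable
    scores = [model(random_crop(image, crop_h, crop_w, rng)) for _ in range(n_aug)]
    return sum(scores) / len(scores)
```

Because the released inference path makes a single deterministic pass, results obtained this way would not be directly comparable with the numbers it produces.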

cyh-0 commented 3 years ago

I forget exactly what the issue was (maybe it was related to the pickle file I made or some package error). I can re-run the whole process from the beginning to see if it works this time. Thank you for your feedback!

Cheers

cyh-0 commented 3 years ago
Installing collected packages: numpy, h5py, pillow, imageio, opencv-python, pytz, python-dateutil, pandas, scipy, torch, tqdm, torchvision, pyparsing, cycler, kiwisolver, matplotlib
  Running setup.py install for pillow: started
    Running setup.py install for pillow: finished with status 'error'
    Complete output from command /usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-d58guygh/pillow/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-dkwb1x0d-record/install-record.txt --single-version-externally-managed --compile:
    /usr/lib/python3.6/distutils/dist.py:261: UserWarning: Unknown distribution option: 'long_description_content_type'
      warnings.warn(msg)
    running install
    running build
    running build_py
    creating build
    creating build/lib.linux-x86_64-3.6
    creating build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/ImageMorph.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/Hdf5StubImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/FtexImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/features.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/ImageMath.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/PcfFontFile.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/BmpImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/GimpPaletteFile.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/IcnsImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/ImageSequence.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/GbrImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/TiffTags.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/ImageTk.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/PngImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/ExifTags.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/PdfImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/XVThumbImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/ImageWin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/MicImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/ImageStat.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/BlpImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/GribStubImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/XbmImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/ImageShow.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/JpegImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/FliImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/ImageColor.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/MspImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/WalImageFile.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/PaletteFile.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/ImageCms.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/GimpGradientFile.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/PSDraw.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/XpmImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/ImageGrab.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/ImageEnhance.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/_tkinter_finder.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/__main__.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/EpsImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/PsdImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/_util.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/McIdasImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/MpegImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/FontFile.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/WebPImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/DdsImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/_binary.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/ImageDraw2.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/ImageFilter.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/PcdImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/SunImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/PpmImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/ContainerIO.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/ImageFile.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/TgaImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/FitsStubImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/TiffImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/ImageDraw.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/ImagePalette.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/PyAccess.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/PixarImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/WmfImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/PdfParser.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/BdfFontFile.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/ImageOps.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/DcxImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/BufrStubImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/FpxImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/ImImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/ImageTransform.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/JpegPresets.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/ImageFont.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/PcxImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/TarIO.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/SpiderImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/ImtImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/ImageMode.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/MpoImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/Image.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/SgiImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/_version.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/GifImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/PalmImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/CurImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/ImagePath.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/ImageChops.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/IptcImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/__init__.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/ImageQt.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/GdImageFile.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/IcoImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    copying src/PIL/Jpeg2KImagePlugin.py -> build/lib.linux-x86_64-3.6/PIL
    running egg_info
    writing src/Pillow.egg-info/PKG-INFO
    writing dependency_links to src/Pillow.egg-info/dependency_links.txt
    writing top-level names to src/Pillow.egg-info/top_level.txt
    reading manifest file 'src/Pillow.egg-info/SOURCES.txt'
    reading manifest template 'MANIFEST.in'
    warning: no files found matching '*.c'
    warning: no files found matching '*.h'
    warning: no files found matching '*.sh'
    warning: no previously-included files found matching '.appveyor.yml'
    warning: no previously-included files found matching '.clang-format'
    warning: no previously-included files found matching '.coveragerc'
    warning: no previously-included files found matching '.editorconfig'
    warning: no previously-included files found matching '.readthedocs.yml'
    warning: no previously-included files found matching 'codecov.yml'
    warning: no previously-included files matching '.git*' found anywhere in distribution
    warning: no previously-included files matching '*.pyc' found anywhere in distribution
    warning: no previously-included files matching '*.so' found anywhere in distribution
    no previously-included directories found matching '.ci'
    writing manifest file 'src/Pillow.egg-info/SOURCES.txt'
    running build_ext

    The headers or library files could not be found for zlib,
    a required dependency when compiling Pillow from source.

    Please see the install instructions at:
       https://pillow.readthedocs.io/en/latest/installation.html

    Traceback (most recent call last):
      File "/tmp/pip-build-d58guygh/pillow/setup.py", line 1024, in <module>
        zip_safe=not (debug_build() or PLATFORM_MINGW),
      File "/usr/lib/python3/dist-packages/setuptools/__init__.py", line 129, in setup
        return distutils.core.setup(**attrs)
      File "/usr/lib/python3.6/distutils/core.py", line 148, in setup
        dist.run_commands()
      File "/usr/lib/python3.6/distutils/dist.py", line 955, in run_commands
        self.run_command(cmd)
      File "/usr/lib/python3.6/distutils/dist.py", line 974, in run_command
        cmd_obj.run()
      File "/usr/lib/python3/dist-packages/setuptools/command/install.py", line 61, in run
        return orig.install.run(self)
      File "/usr/lib/python3.6/distutils/command/install.py", line 589, in run
        self.run_command('build')
      File "/usr/lib/python3.6/distutils/cmd.py", line 313, in run_command
        self.distribution.run_command(command)
      File "/usr/lib/python3.6/distutils/dist.py", line 974, in run_command
        cmd_obj.run()
      File "/usr/lib/python3.6/distutils/command/build.py", line 135, in run
        self.run_command(cmd_name)
      File "/usr/lib/python3.6/distutils/cmd.py", line 313, in run_command
        self.distribution.run_command(command)
      File "/usr/lib/python3.6/distutils/dist.py", line 974, in run_command
        cmd_obj.run()
      File "/usr/lib/python3/dist-packages/setuptools/command/build_ext.py", line 78, in run
        _build_ext.run(self)
      File "/usr/lib/python3.6/distutils/command/build_ext.py", line 339, in run
        self.build_extensions()
      File "/tmp/pip-build-d58guygh/pillow/setup.py", line 790, in build_extensions
        raise RequiredDependencyException(f)
    __main__.RequiredDependencyException: zlib

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-build-d58guygh/pillow/setup.py", line 1037, in <module>
        raise RequiredDependencyException(msg)
    __main__.RequiredDependencyException:

    The headers or library files could not be found for zlib,
    a required dependency when compiling Pillow from source.

    Please see the install instructions at:
       https://pillow.readthedocs.io/en/latest/installation.html

    ----------------------------------------
Command "/usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-7dqaewmq/pillow/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-eet3g51z-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-7dqaewmq/pillow/
The command '/bin/sh -c pip3 install --no-cache-dir --trusted-host pypi.python.org h5py==2.8.0     imageio==2.4.1     numpy==1.14.3     opencv-python==3.4.2.17     pandas==0.22.0     scipy==1.0.0     torch==1.1.0     torchvision==0.2.2     tqdm==4.19.8     matplotlib==3.0.2' returned a non-zero code: 1
Model extra arguments:
 - MODEL_INDEX=1
 - NUM_PROCESSES=10
 - PREPROCESS_FLAG=True

Starting docker container for nyu_gmic model.
Unable to find image 'nyu_gmic:latest' locally
docker: Error response from daemon: pull access denied for nyu_gmic, repository does not exist or may require 'docker login': denied: requested access to the resource is denied.
See 'docker run --help'.

Running predict.sh for nyu_gmic model.
Error response from daemon: Container ce0338906b15f1ed3eb9c2aa590c2b71e85d7d27a814525a59df1779141698db is not running
Error: No such container:path: ce0338906b15:/home/predictions/nyu_gmic_experiment01_predictions.csv
ce0338906b15
ce0338906b15

Evaluating.
Traceback (most recent call last):
  File "/home/chikara/Documents/Published/mammogram/mammography_metarepository/./evaluation/score.py", line 191, in <module>
    main(sys.argv[1], sys.argv[2], sys.argv[3])
  File "/home/chikara/Documents/Published/mammogram/mammography_metarepository/./evaluation/score.py", line 180, in main
    breast_or_image = breast_or_image_level(prediction_file)
  File "/home/chikara/Documents/Published/mammogram/mammography_metarepository/./evaluation/score.py", line 15, in breast_or_image_level
    df = pd.read_csv(prediction_file, header=0)
  File "/home/chikara/.local/lib/python3.9/site-packages/pandas/util/_decorators.py", line 311, in wrapper
    return func(*args, **kwargs)
  File "/home/chikara/.local/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 586, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/home/chikara/.local/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 482, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/home/chikara/.local/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 811, in __init__
    self._engine = self._make_engine(self.engine)
  File "/home/chikara/.local/lib/python3.9/site-packages/pandas/io/parsers/readers.py", line 1040, in _make_engine
    return mapping[engine](self.f, **self.options)  # type: ignore[call-arg]
  File "/home/chikara/.local/lib/python3.9/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 51, in __init__
    self._open_handles(src, kwds)
  File "/home/chikara/.local/lib/python3.9/site-packages/pandas/io/parsers/base_parser.py", line 222, in _open_handles
    self.handles = get_handle(
  File "/home/chikara/.local/lib/python3.9/site-packages/pandas/io/common.py", line 702, in get_handle
    handle = open(
FileNotFoundError: [Errno 2] No such file or directory: '/home/chikara/Documents/Published/mammogram/mammography_metarepository/predictions/nyu_gmic_experiment01_predictions.csv'

jwitos commented 3 years ago

thanks @Ch1kara, can you also let us know your setup params? E.g. your OS and version, Docker version and Python version?

cc @chledowski

cyh-0 commented 3 years ago

bash run.sh nyu_gmic experiment01 /mnt/HD/Dataset/CMMD/processed/IMAGE_16 sample_data/preprocessed_images/ sample_data/cmmd_datalist.pkl predictions/ gpu 0 no_bootstrap

My OS version is Ubuntu 20.04, Python 3.9.6, Docker 20.10.10

I think I may have created an empty .pkl for the data, let me try to do it again.
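A quick way to rule out an empty or unreadable data list before launching the container is sketched below. The helper name `load_exam_list` is hypothetical, and the sketch only assumes the pickle holds a non-empty sized container (for example a list of exam entries); it does not validate the fields the metarepository expects.

```python
import pickle


def load_exam_list(path):
    """Load a pickled data list and fail early if it is empty or truncated."""
    try:
        with open(path, "rb") as handle:
            data = pickle.load(handle)
    except EOFError:
        # A zero-byte or truncated file makes pickle.load raise EOFError.
        raise ValueError("{} is empty or truncated".format(path))
    if not hasattr(data, "__len__") or len(data) == 0:
        raise ValueError("{} contains no entries".format(path))
    return data
```

For example, `load_exam_list("sample_data/cmmd_datalist.pkl")` would raise a `ValueError` right away instead of failing later inside the pipeline.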

cyh-0 commented 3 years ago

Yeah, I have tried again and the issues remain the same.

chledowski commented 3 years ago

Hi!

It looks like zlib1g-dev and libjpeg-dev were missing from the Dockerfiles, even though they were not needed previously.
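Once the image builds, one way to confirm from inside the container that Pillow was compiled with the relevant codecs is a check like the following. This is a sketch: `pillow_codec_support` is a hypothetical helper, and it assumes a Pillow version that provides `PIL.features.check_codec`.

```python
def pillow_codec_support(codecs=("zlib", "jpg")):
    """Report which codecs the installed Pillow build supports.

    Returns a dict mapping codec name to True/False,
    or None when Pillow is not importable at all.
    """
    try:
        from PIL import features
    except ImportError:
        return None
    return {name: features.check_codec(name) for name in codecs}


if __name__ == "__main__":
    print(pillow_codec_support())
```

A `False` entry for `zlib` or `jpg` would suggest the development headers were still not visible when Pillow was compiled.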

Please check if using branch fix-pillow (https://github.com/nyukat/mammography_metarepository/tree/fix-pillow) solves the problem :)

cyh-0 commented 3 years ago

Thanks for helping! It works, but now I have a CUDA error...

Running predict.sh for nyu_gmic model.
Stage 1: Crop Mammograms
Stage 2: Extract Centers
Stage 3: Run Classifier
  0%|          | 0/4 [00:00<?, ?it/s]
THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=383 error=8 : invalid device function

Traceback (most recent call last):
  File "src/scripts/run_model.py", line 298, in <module>
    main()
  File "src/scripts/run_model.py", line 294, in main
    turn_on_visualization=args.visualization_flag,
  File "src/scripts/run_model.py", line 254, in start_experiment
    output_df = run_single_model(single_model_path, data_path, parameters, turn_on_visualization)
  File "src/scripts/run_model.py", line 219, in run_single_model
    output_df = run_model(model, exam_list, parameters, turn_on_visualization)
  File "src/scripts/run_model.py", line 182, in run_model
    output = model(tensor_batch)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/gmic/GMIC/src/modeling/gmic.py", line 116, in forward
    h_g, self.saliency_map = self.global_network.forward(x_original)
  File "/home/gmic/GMIC/src/modeling/modules.py", line 289, in forward
    last_feature_map = self.downsampling_branch.forward(x)
  File "/home/gmic/GMIC/src/modeling/modules.py", line 156, in forward
    h = layer(h)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/gmic/GMIC/src/modeling/modules.py", line 56, in forward
    out = self.relu(out)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/activation.py", line 99, in forward
    return F.relu(input, inplace=self.inplace)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py", line 941, in relu
    result = torch.relu_(input)
RuntimeError: CUDA error: no kernel image is available for execution on the device
Copying results into the output csv file.
cp: cannot stat '/home/preprocessed_images/nyu_gmic_experiment01_sample_output/predictions.csv': No such file or directory
Error: No such container:path: 7aa61462aee5:/home/predictions/nyu_gmic_experiment01_predictions.csv
7aa61462aee5
7aa61462aee5

cyh-0 commented 3 years ago

Do you think this could be related to the CUDA version? I am currently using 11.3

chledowski commented 3 years ago

If I understand correctly, CUDA is backward compatible. So if you have 11.3 and the code uses 10.1, you should be able to run the code.

Can you check if it works on CPU?

jwitos commented 3 years ago

@chledowski @Ch1kara I checked the GMIC model on an AWS instance that was running CUDA 11.3, Docker 20.10.9 and Ubuntu 20. It worked fine with sample data, so no issues with CUDA/Docker/OS versioning here.

@Ch1kara my suggestion is that you try to investigate your docker+gpu setup locally. It seems like a local installation problem, but I'll keep this ticket open for a while in case you have further issues.

@chledowski I will merge #19 as well.

cyh-0 commented 3 years ago

I can run the code on CPU now, but on GPU the CUDA error remains. Yes, the CUDA error is related to the torch version: I need a torch version > 1.8.0 to run on my 3090. I can successfully reproduce the result for GMIC. Thank you guys again for helping :smiley: .
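The "no kernel image is available" error typically means the installed PyTorch binary contains no compiled kernels for the GPU's architecture. A rough check is sketched below; `cubin_runs_on` is a hypothetical helper that applies the CUDA binary-compatibility rule (code built for `sm_XY` runs on devices of the same major version with an equal or higher minor version) and deliberately ignores PTX entries, which can sometimes be JIT-compiled for newer GPUs. The RTX 3090 has compute capability 8.6; the architecture lists in the comments are illustrative, not taken from a specific PyTorch release.

```python
def cubin_runs_on(device_capability, arch_list):
    """Return True if any compiled 'sm_XY' entry can run on the device.

    device_capability: (major, minor) tuple, e.g. (8, 6) for an RTX 3090.
    arch_list: strings such as 'sm_70' or 'compute_70' (PTX entries are skipped).
    """
    major, minor = device_capability
    for arch in arch_list:
        if not arch.startswith("sm_"):
            continue  # PTX ('compute_XY') is ignored in this simplified check
        digits = "".join(ch for ch in arch[3:] if ch.isdigit())
        if len(digits) < 2:
            continue
        arch_major, arch_minor = int(digits[:-1]), int(digits[-1])
        if arch_major == major and arch_minor <= minor:
            return True
    return False


if __name__ == "__main__":
    # Illustrative lists only: an older build without any sm_8x kernels fails.
    print(cubin_runs_on((8, 6), ["sm_35", "sm_50", "sm_60", "sm_70"]))
    print(cubin_runs_on((8, 6), ["sm_70", "sm_80", "sm_86"]))
    try:
        import torch
    except ImportError:
        torch = None
    # get_arch_list is assumed to exist only in reasonably recent PyTorch builds.
    if torch is not None and torch.cuda.is_available() and hasattr(torch.cuda, "get_arch_list"):
        print(cubin_runs_on(torch.cuda.get_device_capability(0), torch.cuda.get_arch_list()))
```

This would explain why the CPU path works while the GPU path fails with the pinned torch version in the image.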

jwitos commented 3 years ago

great news @Ch1kara! One last thing: I noticed you said you flipped some CMMD images. I don't think any of the images should be flipped there, but let me know if you think otherwise? If you have this or other issues with CMMD data set pls feel free to open a new ticket. Thanks!

cyh-0 commented 3 years ago

Yeah, I also noticed that flipping would cause trouble. Anyway, thank you and have a nice day :smiley: !