UMEssen / Body-and-Organ-Analysis

BOA is a segmentation tool for CT scans developed by the SHIP-AI group (https://ship-ai.ikim.nrw/). Combining the TotalSegmentator and the Body Composition Analysis, this tool can analyze medical images and identify the different structures within the human body, including bones, muscles, organs, and blood vessels.
Apache License 2.0

Failed to download weights due to Zenodo connection #12

Closed: yayapa closed this issue 7 months ago

yayapa commented 8 months ago

Hello, thank you for the great work!

I experience a download error (for the totalsegmentator weights) after running docker run:

INFO:body_organ_analysis.commands:Image loaded and retrieved: DONE in 0.00003s
INFO:body_organ_analysis.compute.inference:Input image: /image.nii.gz
INFO:body_organ_analysis.compute.inference:Image size: (122, 101, 112)
INFO:body_organ_analysis.compute.inference:Image dtype: int32
INFO:body_organ_analysis.compute.inference:Voxel spacing: (3.0, 3.0, 3.0)
INFO:body_organ_analysis.compute.inference:Input Axcodes: ('R', 'A', 'S')
WARNING:body_organ_analysis.compute.inference:Unexpected CT values found in input image: got -1207.0-3382.0, expected -1024-3071. The values have been clipped to the expected range. Please check the segmentations to ensure that everything is correct.
INFO:body_organ_analysis.compute.inference:Computing segmentations for task total
INFO:totalsegmentator.libs:Downloading pretrained weights for Task 251...
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/app/body_organ_analysis/__main__.py", line 4, in <module>
    run()
  File "/usr/local/lib/python3.8/dist-packages/body_organ_analysis/cli.py", line 168, in run
    analyze_ct(
  File "/usr/local/lib/python3.8/dist-packages/body_organ_analysis/commands.py", line 76, in analyze_ct
    compute_all_models(
  File "/usr/local/lib/python3.8/dist-packages/body_organ_analysis/compute/inference.py", line 123, in compute_all_models
    download_pretrained_weights(tid)
  File "/usr/local/lib/python3.8/dist-packages/body_organ_analysis/_external/totalsegmentator/libs.py", line 192, in download_pretrained_weights
    download_url_and_unpack(WEIGHTS_URL, config_dir)
  File "/usr/local/lib/python3.8/dist-packages/body_organ_analysis/_external/totalsegmentator/libs.py", line 77, in download_url_and_unpack
    raise e
  File "/usr/local/lib/python3.8/dist-packages/body_organ_analysis/_external/totalsegmentator/libs.py", line 64, in download_url_and_unpack
    r.raise_for_status()
  File "/usr/local/lib/python3.8/dist-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://zenodo.org/record/6802342/files/Task251_TotalSegmentator_part1_organs_1139subj.zip?download=1

I checked the link and it works. This can happen due to an unstable connection on the server side and a timeout being reached. One possible solution would be to increase the response timeout. Alternatively, could you please provide the file structure you expect in the weights directory, so that I can download the weights manually?
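
In the meantime, something like the following works for me as a manual workaround (a sketch, assuming curl and unzip are available and the Zenodo URL from the traceback still resolves; the destination directory is a placeholder):

#!/bin/bash

# Manual download: follow redirects (-L), fail on HTTP errors (-f),
# and retry transient failures a few times.
WEIGHTS_URL="https://zenodo.org/record/6802342/files/Task251_TotalSegmentator_part1_organs_1139subj.zip?download=1"
curl -fL --retry 5 --retry-delay 10 \
    -o Task251_TotalSegmentator_part1_organs_1139subj.zip \
    "$WEIGHTS_URL"

# Placeholder destination; the archive has to be unpacked wherever
# the tool expects the weights.
unzip Task251_TotalSegmentator_part1_organs_1139subj.zip -d ./boa_weights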

Thank you in advance.

giuliabaldini commented 8 months ago

Hey there!

the file structure should be as follows:

weights/
    nnunet/
        2d/
        3d_fullres/
            Task251_TotalSegmentator_part1_organs_1139subj/
                nnUNetTrainerV2_ep4000_nomirror__nnUNetPlansv2.1/
            ...
    nnUNet_cropped_data/
    nnUNet_raw_data/

You can specify the weights path with LOCAL_WEIGHTS_PATH in .env. The nnUNet_cropped_data, nnUNet_raw_data and 2d folders are empty but they should be there. Please let me know if that works out.
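
If it helps, the skeleton can be created in one go, for example like this (a sketch; run it from wherever your LOCAL_WEIGHTS_PATH points):

# Create the expected directory skeleton; the 2d, nnUNet_cropped_data
# and nnUNet_raw_data folders stay empty, but they must exist.
mkdir -p weights/nnunet/2d \
         weights/nnunet/3d_fullres/Task251_TotalSegmentator_part1_organs_1139subj/nnUNetTrainerV2_ep4000_nomirror__nnUNetPlansv2.1 \
         weights/nnUNet_cropped_data \
         weights/nnUNet_raw_data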

If you can wait until Friday though, I also plan to release already built versions of the docker images.

yayapa commented 7 months ago

Hi @giuliabaldini! Thank you for the fast answer! Yes, I could wait until Friday. [Q1]: Will the new image include the weights already? If not, this will not solve my problem, I suppose.

I actually already built my own image, but without the weights. The error happens when I start docker run with the following script, saved as run.sh:

#!/bin/bash

# Define variables
INPUT_FILE=/home/dmitrii/GitHub/Body-and-Organ-Analysis/image.nii.gz
WORKING_DIR=/home/dmitrii/GitHub/Body-and-Organ-Analysis
LOCAL_WEIGHTS_PATH=/home/dmitrii/GitHub/Body-and-Organ-Analysis

# Check if the input file exists
if [ ! -f "$INPUT_FILE" ]; then
    echo "Error: Input file does not exist."
    exit 1
fi

# Check if the directories exist
if [ ! -d "$WORKING_DIR" ] || [ ! -d "$LOCAL_WEIGHTS_PATH" ]; then
    echo "Error: One or more directories do not exist."
    exit 1
fi

# Docker run command
docker run \
    --rm \
    -it \
    -v "$INPUT_FILE":/image.nii.gz \
    -v "$WORKING_DIR":/workspace/ \
    -v "$LOCAL_WEIGHTS_PATH":/weights \
    --gpus all \
    --env NVIDIA_VISIBLE_DEVICES=all \
    --network host \
    --shm-size=8g --ulimit memlock=-1 --ulimit stack=67108864 \
    --entrypoint /bin/sh \
    ship-ai/boa-cli \
    -c \
    "python body_organ_analysis --input-image /image.nii.gz --output-dir /workspace/ --models total+bca --verbose"

So, for testing purposes, I downloaded the weights for the first task (251) and saved them in the file structure you provided:

[screenshot of the local weights directory structure]

Unfortunately, I am still experiencing the same error, although I expected the next error, corresponding to the next task, to occur instead. [Q2]: Am I wrong, and should I download the weights for the remaining tasks so that it works properly? [Q3]: Or are some checks missing in the code, so that the problem cannot be solved manually as I am trying to do?

The error message:
[screenshot of the same 404 error as above]

I will be happy to get any help from you. Thank you!

yayapa commented 7 months ago

I also tried to use the external script you provide in body_organ_analysis/_external/totalsegmentator/download_pretrained_weights.py.

I experimented with different solutions:

  1. The timeout, allow_redirects and headers parameters, but, unfortunately, that did not help.
  2. I tested different internet connections; as said above, I can download the file manually, but not with Python.
  3. I checked the status with curl -v -O https://zenodo.org/record/6802342/files/Task251_TotalSegmentator_part1_organs_1139subj.zip?download=1. The curl output indicates that the server at zenodo.org responds with a 301 MOVED PERMANENTLY status code, meaning the URL has been permanently moved to a new location. I tried replacing the URL with the new location, but that did not work either (see the sketch after this list).
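
For anyone reproducing this, the redirect can be inspected with a HEAD request (a sketch; -I prints only the status line and headers, and the Location header shows where Zenodo now hosts the file):

curl -sI "https://zenodo.org/record/6802342/files/Task251_TotalSegmentator_part1_organs_1139subj.zip?download=1" \
    | grep -iE "^(HTTP|location)"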

So, currently, I do not have any ideas on how to get the download to work programmatically. I hope this information helps you as well.

giuliabaldini commented 7 months ago

Hi!

[Q1] Yes, the docker images will already contain the weights!

I don't think you are doing anything wrong with the weights; I think the paths are not set up properly, so the weights cannot be found. Can you show me how you set LOCAL_WEIGHTS_PATH? Is there a .env file?

yayapa commented 7 months ago

Now I realize that I do not have a .env file and that I built my Docker image without one (I only had .env_sample when building it). Is that critical?

On the other hand, I thought I could provide this in the run configuration, as shown above in the run.sh script: LOCAL_WEIGHTS_PATH=/home/dmitrii/GitHub/Body-and-Organ-Analysis and then -v $LOCAL_WEIGHTS_PATH:/weights.

giuliabaldini commented 7 months ago

I think it should be LOCAL_WEIGHTS_PATH=/home/dmitrii/GitHub/Body-and-Organ-Analysis/weights :)

yayapa commented 7 months ago

I tried it out in run.sh, and it still tries to download the weights: INFO:totalsegmentator.libs:Downloading pretrained weights for Task 251...
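
A quick sanity check (a sketch, reusing the image name and variables from my run.sh above) to see whether the container sees the weights at all:

docker run --rm \
    -v "$LOCAL_WEIGHTS_PATH":/weights \
    --entrypoint /bin/sh \
    ship-ai/boa-cli \
    -c "ls -R /weights | head -n 20"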

giuliabaldini commented 7 months ago

Sorry, I should have looked at your script more carefully.

#!/bin/bash

# Define variables
INPUT_FILE=/home/dmitrii/GitHub/Body-and-Organ-Analysis/image.nii.gz
WORKING_DIR=/home/dmitrii/GitHub/Body-and-Organ-Analysis
LOCAL_WEIGHTS_PATH=/home/dmitrii/GitHub/Body-and-Organ-Analysis/weights

# Check if the input file exists
if [ ! -f "$INPUT_FILE" ]; then
  echo "Error: Input file does not exist."
  exit 1
fi

# Check if the directories exist
if [ ! -d "$WORKING_DIR" ] || [ ! -d "$LOCAL_WEIGHTS_PATH" ]; then
  echo "Error: One or more directories do not exist."
  exit 1
fi

# Docker run command
docker run \
  --rm \
  -it \
  -v "$INPUT_FILE":/image.nii.gz \
  -v "$WORKING_DIR":/workspace/ \
  -v "$LOCAL_WEIGHTS_PATH":/app/weights \
  --gpus all \
  --env NVIDIA_VISIBLE_DEVICES=all \
  --network host \
  --shm-size=8g --ulimit memlock=-1 --ulimit stack=67108864 \
  --entrypoint /bin/sh \
  ship-ai/boa-cli \
  -c \
  "python body_organ_analysis --input-image /image.nii.gz --output-dir /workspace/ --models total+bca --verbose"

The mount $LOCAL_WEIGHTS_PATH:/weights should be $LOCAL_WEIGHTS_PATH:/app/weights; that is where the folder is mounted in the image.

giuliabaldini commented 7 months ago

Also, if you do it like this, the outputs will be generated directly in your Body-and-Organ-Analysis folder. It would probably be better to use something like python body_organ_analysis --input-image /image.nii.gz --output-dir /workspace/output --models total+bca --verbose, so that the folder is created for you and you will find your outputs there.

yayapa commented 7 months ago

Super! This has worked! Thank you very much! (So, in the end, I downloaded the weights manually.) And maybe one last question: is there any possibility of processing a batch of CTs (NIfTIs/DICOMs)? Otherwise I can certainly write a script on top, but maybe I have overlooked something.

giuliabaldini commented 7 months ago

I have added a couple of scripts of how we do it for multiple folders: https://github.com/UMEssen/Body-and-Organ-Analysis/tree/main/example_scripts

You will have to adapt the process_file.sh script to fit your output folder structure (whether you want the outputs stored in a folder with the same name as the input image, or in one with the same name as the folder the image is stored in). You will also have to add your weights path. In process_lib.sh you just have to put in the paths to the folders.

Tell me if that works!

mingrisch commented 7 months ago

Thank you for this effort, Giulia! We can confirm this issue: apparently the Zenodo URLs have changed and now redirect, and the weight download fails with an HTTP 404.

giuliabaldini commented 7 months ago

Thank you for reporting, I'll look at this tomorrow and fix it!

yayapa commented 7 months ago

Thanks! It was easier for me to write my own script for the particular folder structure I have (a similar structure was used in the BCA algorithm). So, suppose you have the dataset in $BASE_DIR, with inputs in batch_input, and want the outputs in batch_output. Illustration:

|-BASE_DIR
| |-batch_input
| | |-0
| | | |-image.nii.gz
| | |-1
| | | |-image.nii.gz
...
| |-batch_output
| | |-0
| | | |-output files
| | |-1
| | | |-output files

You can use the following script (maybe it will help somebody else):

#!/bin/bash

# Base directories
BASE_DIR="/path/to/dataset/"
INPUT_DIR="$BASE_DIR/batch_input/"
WORKING_DIR="$BASE_DIR/batch_output/"
LOCAL_WEIGHTS_PATH="/path/to/weights"

# Check if the directories exist
if [ ! -d "$INPUT_DIR" ] || [ ! -d "$WORKING_DIR" ] || [ ! -d "$LOCAL_WEIGHTS_PATH" ]; then
  echo "Error: One or more directories do not exist."
  exit 1
fi

# Loop through each subdirectory in INPUT_DIR
for subdir in "$INPUT_DIR"/*; do
    if [ -d "$subdir" ]; then
        SUBDIR_NAME=$(basename "$subdir")
        INPUT_FILE="$subdir/image.nii.gz"
        OUTPUT_SUBDIR="$WORKING_DIR/$SUBDIR_NAME"

        # Check if the input file exists
        if [ ! -f "$INPUT_FILE" ]; then
            echo "Error: Input file $INPUT_FILE does not exist."
            continue
        fi

        # Create the output directory if it doesn't exist
        mkdir -p "$OUTPUT_SUBDIR"

        # Docker run command
        docker run \
          --rm \
          -it \
          -v "$INPUT_FILE":/input/image.nii.gz \
          -v "$OUTPUT_SUBDIR":/workspace/outputs/ \
          -v "$LOCAL_WEIGHTS_PATH":/app/weights \
          --gpus all \
          --env NVIDIA_VISIBLE_DEVICES=all \
          --network host \
          --shm-size=8g --ulimit memlock=-1 --ulimit stack=67108864 \
          --entrypoint /bin/sh \
          ship-ai/boa-cli \
          -c \
          "python body_organ_analysis --input-image /input/image.nii.gz --output-dir /workspace/outputs/ --models total+bca --verbose --radiomics"
    fi
done

(I also use --gpus all instead of --runtime=nvidia, since I have a newer Docker version.)
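
For completeness, on Docker versions before 19.03 (which introduced the --gpus flag), the nvidia-docker2 runtime flag would be used instead; a sketch of a GPU sanity check with that flag:

# Older Docker with nvidia-docker2: --runtime=nvidia replaces --gpus all.
# Running nvidia-smi inside the container verifies the GPU is visible.
docker run --rm \
    --runtime=nvidia \
    --env NVIDIA_VISIBLE_DEVICES=all \
    --entrypoint /bin/sh \
    ship-ai/boa-cli \
    -c "nvidia-smi"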