RosettaCommons / RoseTTAFold

This package contains deep learning models and related scripts for RoseTTAFold
MIT License

Bus error (core dumped) while running predict_e2e.py #137

Open michaelhsieh42 opened 1 year ago

michaelhsieh42 commented 1 year ago

Hi, I am attempting to run run_e2e_ver.sh with a FASTA file, but it never gets past predict_e2e.py. It always throws a bus error on a GPU instance (AWS g5.2xlarge and g5.4xlarge). I used faulthandler to show a traceback before the core dump. The issue appears to lie in read_entry_lines in network/ffindex.py, while reading the templates. Has anyone experienced this and could provide some insight? Thanks.

+ python -X faulthandler /app/RoseTTAFold/network/predict_e2e.py -m /app/RoseTTAFold/weights -i /opt/ml/output/t000_.msa0.a3m -o /opt/ml/output/t000_.e2e --hhr /opt/ml/output/t000_.hhr --atab /opt/ml/output/t000_.atab --db /app/RoseTTAFold/pdb100_2021Mar03/pdb100_2021Mar03
DGL backend not selected or invalid.  Assuming PyTorch for now.
Setting the default backend to "pytorch". You can change it in the ~/.dgl/config.json file or export the DGLBACKEND environment variable.  Valid options are: pytorch, mxnet, tensorflow (all lowercase)
Using backend: pytorch
Fatal Python error: Bus error

Current thread 0x00007f9ff77d40c0 (most recent call first):
  File "/app/RoseTTAFold/network/ffindex.py", line 46 in read_entry_lines
  File "/app/RoseTTAFold/network/parsers.py", line 207 in parse_templates
  File "/app/RoseTTAFold/network/parsers.py", line 241 in read_templates
  File "/app/RoseTTAFold/network/predict_e2e.py", line 120 in predict
  File "/app/RoseTTAFold/network/predict_e2e.py", line 324 in <module>
./docker/run_rosettafold.sh: line 110:   557 Bus error               (core dumped) python -X faulthandler $PIPEDIR/network/predict_e2e.py -m $PIPEDIR/weights -i $WDIR/t000_.msa0.a3m -o $WDIR/t000_.e2e --hhr $WDIR/t000_.hhr --atab $WDIR/t000_.atab --db $DB
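
In case it helps anyone narrow this down: a bus error while slicing template data is consistent with a memory-mapped read failing. This can happen when the underlying storage returns an I/O error or the file changes size after it is mapped. The traceback alone does not confirm this. Below is a minimal sketch that walks the template index and reads every entry with plain file I/O, so a bad read surfaces as a Python exception rather than killing the process. It assumes the index is whitespace-separated `name offset length` lines. The function name `check_ffindex` is hypothetical, and the `_pdb.ffindex` / `_pdb.ffdata` suffixes in the usage comment are my assumption about how the `--db` prefix is expanded.

```python
import os
from typing import List, Tuple


def check_ffindex(index_path: str, data_path: str) -> List[Tuple[str, str]]:
    """Return (entry_name, problem) pairs found using ordinary reads (no mmap)."""
    problems = []
    data_size = os.path.getsize(data_path)
    with open(index_path) as index_fh, open(data_path, "rb") as data_fh:
        for line in index_fh:
            tokens = line.split()
            if len(tokens) != 3:
                continue  # skip blank or malformed index lines
            name, offset, length = tokens[0], int(tokens[1]), int(tokens[2])
            # An entry that points past the end of the data file suggests
            # a truncated or incomplete download.
            if offset + length > data_size:
                problems.append((name, "extends past end of data file"))
                continue
            try:
                data_fh.seek(offset)
                chunk = data_fh.read(length)
            except OSError as exc:
                # Storage-level failures show up here as exceptions.
                problems.append((name, f"read failed: {exc}"))
                continue
            if len(chunk) != length:
                problems.append((name, "short read"))
    return problems


# Hypothetical usage (file suffixes are an assumption, adjust to your database):
# db = "/app/RoseTTAFold/pdb100_2021Mar03/pdb100_2021Mar03"
# for name, problem in check_ffindex(db + "_pdb.ffindex", db + "_pdb.ffdata"):
#     print(name, problem)
```

An empty result would not rule out the storage layer entirely, but any reported entry would point at the database files rather than at the model code.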

This is the environment (Dockerfile) in which I run the script:

ARG CUDA=11.1.1
FROM nvidia/cuda:${CUDA}-cudnn8-runtime-ubuntu18.04

RUN apt-get update \
    && DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends -y \
        build-essential \
        git \
        tzdata \
        wget \
        unzip \
    && apt-get autoremove -y \
    && apt-get clean

## AWSCLI
RUN wget -q -P /opt/ https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip \
    && unzip /opt/awscli-exe-linux-x86_64.zip -d /opt/ \
    && /opt/aws/install \
    && rm -f /opt/awscli-exe-linux-x86_64.zip

RUN wget -q -P /tmp \
  https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh \
    && bash /tmp/Miniconda3-latest-Linux-x86_64.sh -b -p /opt/conda \
    && rm -f /tmp/Miniconda3-latest-Linux-x86_64.sh

# Make the conda command available.
ENV PATH="/opt/conda/bin:$PATH"

RUN git clone -b v1.1.0 --single-branch https://github.com/RosettaCommons/RoseTTAFold.git /app/RoseTTAFold

RUN conda env create -f /app/RoseTTAFold/RoseTTAFold-linux.yml

RUN wget https://files.ipd.uw.edu/pub/RoseTTAFold/weights.tar.gz \
    && tar xfz weights.tar.gz -C /app/RoseTTAFold/ \
    && rm -f weights.tar.gz

RUN bash /app/RoseTTAFold/install_dependencies.sh \
    && rm -f lddt.zip csblast-2.2.3.tar.gz

RUN mv /lddt/ /csblast-2.2.3/ /app/RoseTTAFold/