cyber-carpentry / breakfast_carpentry

Suggestions

Singularity for running deep learning codes on GPU #10

Open guchchcug opened 4 years ago

guchchcug commented 4 years ago

I'd like to learn how to use Singularity to run deep learning code on GPUs on XSEDE, and on multiple CPUs on Jetstream. I have at least some MNIST code ready for that testing.

agladstein commented 4 years ago

Hi @guchchcug, I've used multiple CPUs on Jetstream, transferred data to Bridges, and run deep learning on a GPU on Bridges in a Singularity container. I could come during lunch or an evening to go over it.

Do you have a GPU XSEDE resource to play on?

  1. I created a singularity image (tensorflow1.12.0-py3-cuda9.0-ubuntu16.04.simg) with the Singularity recipe:

    Bootstrap: docker
    From: nvidia/cuda:9.0-cudnn7-devel-ubuntu16.04

    %post
        # Updating and getting required packages
        apt-get -y update
        apt-get -y upgrade
        apt-get install -y wget git vim python3 python3-pip python3-tk
        ln -s /usr/bin/python3 /usr/bin/python

        # Download and install Anaconda
        CONDA_INSTALL_PATH="/usr/local/anaconda3-4.2.0"
        wget https://repo.continuum.io/archive/Anaconda3-4.2.0-Linux-x86_64.sh
        chmod +x Anaconda3-4.2.0-Linux-x86_64.sh
        ./Anaconda3-4.2.0-Linux-x86_64.sh -b -p $CONDA_INSTALL_PATH

        export LC_ALL=C

        # Install Tensorflow
        pip3 install tensorflow-gpu==1.12.0
        pip3 install tensorflow-probability

        # Install other python modules
        pip3 install scipy
        pip3 install matplotlib==2.2.4
        pip3 install sklearn
        pip3 install pandas
        pip3 install Pillow
        pip3 install livelossplot
        pip3 install hyperas
        pip3 install GPy
        pip3 install GPyOpt
        pip3 install blinker
        pip3 install psutil
        pip3 install spacy

        # Install Keras
        pip3 install keras

    # Run command defined in command line
    %runscript
        exec "$@"

  2. Transferred Singularity image to Bridges.

  3. Start interactive GPU-AI job https://www.psc.edu/bridges/user-guide/running-jobs#ai

    interact -p GPU-AI -A mc5phjp --gres=gpu:volta16:1

     where `mc5phjp` is my project ID

  4. Start environment

    module load cuda/9.0
    module load singularity/3.0.0
    singularity shell --nv -B $SCRATCH tensorflow1.9.0-py3-cuda9.0-ubuntu16.04.simg

  5. Test deep learning

    python my_deep_learning_code.py

Or alternatively, use similar steps to submit a job with slurm.

guchchcug commented 4 years ago

Hi Ariella,

Thank you for the information! It would be great if you could come and help!

I have GPU XSEDE resources on both Comet and Bridges. I'll go through the procedure myself first, and I hope to go over it with you when you come! Thank you! :)

Chen

guchchcug commented 4 years ago

Hi Ariella,

I got the following error when I ran the Singularity file under my Bridges account.

    [guchch@gpu051 ~]$ singularity shell --nv -B $SCRATCH my_first_tensorflow1.12.0-py3-cuda9.0-ubuntu16.04.simg
    FATAL: image format not recognized
    ERROR : Child exit with status 255

Chen

P.S. The my_first_tensorflow1.12.0-py3-cuda9.0-ubuntu16.04.simg file is as follows:

    Bootstrap: docker
    From: nvidia/cuda:9.0-cudnn7-devel-ubuntu16.04

    %post
        # Updating and getting required packages
        apt-get -y update
        apt-get -y upgrade
        apt-get install -y wget git vim python3 python3-pip python3-tk
        ln -s /usr/bin/python3 /usr/bin/python

        # Download and install Anaconda
        CONDA_INSTALL_PATH="/usr/local/anaconda3-4.2.0"
        wget https://repo.continuum.io/archive/Anaconda3-4.2.0-Linux-x86_64.sh
        chmod +x Anaconda3-4.2.0-Linux-x86_64.sh
        ./Anaconda3-4.2.0-Linux-x86_64.sh -b -p $CONDA_INSTALL_PATH

        export LC_ALL=C

        # Install Tensorflow
        pip3 install tensorflow-gpu==1.12.0
        pip3 install tensorflow-probability

        # Install other python modules
        pip3 install scipy
        pip3 install matplotlib==2.2.4
        pip3 install sklearn
        pip3 install pandas
        pip3 install Pillow
        pip3 install livelossplot
        pip3 install hyperas
        pip3 install GPy
        pip3 install GPyOpt
        pip3 install blinker
        pip3 install psutil
        pip3 install spacy

        # Install Keras
        pip3 install keras

    # Run command defined in command line
    %runscript
        exec "$@"

agladstein commented 4 years ago

Hmmm, I just double-checked on Bridges and it worked for me. Did you load the singularity module? What version of Singularity did you build the image with? Did you test the Singularity image wherever you originally built it? Also, I can just pass you my image...

guchchcug commented 4 years ago

Hi Ariella,

I used what you suggested:

    module load cuda/9.0
    module load singularity/3.0.0

Chen

agladstein commented 4 years ago

Did you test your image in the environment where you originally created it?

guchchcug commented 4 years ago

I'm not sure if I created an image. I just put all the following text in a file named "my_first_tensorflow1.12.0-py3-cuda9.0-ubuntu16.04.simg".

    Bootstrap: docker
    From: nvidia/cuda:9.0-cudnn7-devel-ubuntu16.04

    %post
        # Updating and getting required packages
        apt-get -y update
        apt-get -y upgrade
        apt-get install -y wget git vim python3 python3-pip python3-tk
        ln -s /usr/bin/python3 /usr/bin/python

        # Download and install Anaconda
        CONDA_INSTALL_PATH="/usr/local/anaconda3-4.2.0"
        wget https://repo.continuum.io/archive/Anaconda3-4.2.0-Linux-x86_64.sh
        chmod +x Anaconda3-4.2.0-Linux-x86_64.sh
        ./Anaconda3-4.2.0-Linux-x86_64.sh -b -p $CONDA_INSTALL_PATH

        export LC_ALL=C

        # Install Tensorflow
        pip3 install tensorflow-gpu==1.12.0
        pip3 install tensorflow-probability

        # Install other python modules
        pip3 install scipy
        pip3 install matplotlib==2.2.4
        pip3 install sklearn
        pip3 install pandas
        pip3 install Pillow
        pip3 install livelossplot
        pip3 install hyperas
        pip3 install GPy
        pip3 install GPyOpt
        pip3 install blinker
        pip3 install psutil
        pip3 install spacy

        # Install Keras
        pip3 install keras

    # Run command defined in command line
    %runscript
        exec "$@"

Chen

agladstein commented 4 years ago

Ah. Yes, that's why it doesn't think it's a Singularity image. You need to build the Singularity image using that recipe, just like we built a Docker image from a Dockerfile. I'm not at a computer right now, so I can't send you commands to build it. Ask somebody how to build a Singularity image on Jetstream, then copy your image to Bridges. Or I can send you my image just for you to try it.
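To illustrate the recipe versus image distinction, here is a minimal Python sketch, not a tested procedure for Jetstream or Bridges. `looks_like_recipe` is a hypothetical helper that only checks whether a file is plain text with a `Bootstrap:` header, which is what the "image format not recognized" error suggests happened here. `build_command` assembles a `singularity build <image> <recipe>` call; the `sudo` prefix assumes a machine where you have root, which building from a recipe typically needs. The recipe file name `Singularity` is also an assumption.

```python
import tempfile
from pathlib import Path


def looks_like_recipe(path):
    """Heuristic: True if the file is plain text with a 'Bootstrap:' header line."""
    with open(str(path), "rb") as handle:
        head = handle.read(4096)
    if b"\x00" in head:
        return False  # NUL bytes mean binary data, so not a plain-text recipe
    text = head.decode("utf-8", errors="replace")
    return any(line.strip().lower().startswith("bootstrap:") for line in text.splitlines())


def build_command(recipe, image):
    """Return the argv that builds `image` from `recipe` (assumes root via sudo)."""
    return ["sudo", "singularity", "build", str(image), str(recipe)]


# Demonstration on a throwaway file holding the first two lines of the recipe.
with tempfile.TemporaryDirectory() as tmp:
    recipe = Path(tmp) / "Singularity"  # hypothetical recipe file name
    recipe.write_text("Bootstrap: docker\nFrom: nvidia/cuda:9.0-cudnn7-devel-ubuntu16.04\n")
    is_recipe = looks_like_recipe(recipe)
    command = build_command(recipe.name, "tensorflow1.12.0-py3-cuda9.0-ubuntu16.04.simg")

print(is_recipe)
print(" ".join(command))
```

If `looks_like_recipe` returns True for the file you are passing to `singularity shell`, that file is still the recipe and needs to be built first; the printed command is what you would run on the build machine before copying the result to Bridges.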

guchchcug commented 4 years ago

Hi Ariella,

Thank you! I'll build my Singularity image. Could you also send me your image so that I can try that?

Chen

guchchcug commented 4 years ago

Hi Ariella,

I found the procedures for building my singularity image: https://github.com/guchchcug/container_camp_workshop_2019/blob/master/singularity/singularityadvanced.rst

Chen

agladstein commented 4 years ago

You can also try a singularity image from singularity hub. I just tried out a random image tagged tensorflow on singularity hub (https://singularity-hub.org/search)

on the bridges login node,

module load singularity/3.0.0

pull a tensorflow singularity image,

singularity pull shub://belledon/tensorflow-keras

start an interactive job

interact -p GPU-AI -A mc5phjp --gres=gpu:volta16:1

load necessary modules

module load singularity/3.0.0
module load cuda/9.0

Enter singularity image

singularity shell --nv -B $SCRATCH tensorflow-keras_latest.sif

Test if keras imports

Singularity tensorflow-keras_latest.sif:~> python
Python 3.5.2 (default, Nov 17 2016, 17:05:23)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import keras
Using TensorFlow backend.

I did not verify that it uses the gpu, but I suspect it will.
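One way to check the GPU from inside the container is sketched below; it has not been run on Bridges. `visible_gpus` is a hypothetical helper that calls `nvidia-smi -L`, assuming `nvidia-smi` is on the PATH inside the container (the `--nv` flag usually arranges this, but that is an assumption here). `tensorflow_devices` asks TensorFlow which devices it sees through `device_lib.list_local_devices()` and returns None when TensorFlow is not installed.

```python
import shutil
import subprocess


def visible_gpus():
    """Return the GPU lines printed by `nvidia-smi -L`, or [] if none are visible."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []  # nvidia-smi is not available in this environment
    try:
        result = subprocess.run(
            [exe, "-L"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        return []
    return [line for line in result.stdout.splitlines() if line.startswith("GPU ")]


def tensorflow_devices():
    """Return the device names TensorFlow reports, or None without TensorFlow."""
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    return [device.name for device in device_lib.list_local_devices()]


gpus = visible_gpus()
print("GPUs visible to nvidia-smi:", gpus)
```

Calling `tensorflow_devices()` in the same session and looking for a name containing `GPU` would show whether TensorFlow itself picked up the device, which is the stronger check.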

guchchcug commented 4 years ago

Hi Ariella,

I still get the following errors:

Singularity tensorflow-keras_latest.sif:/pylon5/ea5phhp/guchch/my_singularity> python MNIST-scratch.py
Using TensorFlow backend.
Downloading data from https://s3.amazonaws.com/img-datasets/mnist.npz
Traceback (most recent call last):
  File "/usr/lib/python3.5/urllib/request.py", line 1254, in do_open
    h.request(req.get_method(), req.selector, req.data, headers)
  File "/usr/lib/python3.5/http/client.py", line 1106, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python3.5/http/client.py", line 1151, in _send_request
    self.endheaders(body)
  File "/usr/lib/python3.5/http/client.py", line 1102, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python3.5/http/client.py", line 934, in _send_output
    self.send(msg)
  File "/usr/lib/python3.5/http/client.py", line 877, in send
    self.connect()
  File "/usr/lib/python3.5/http/client.py", line 1252, in connect
    super().connect()
  File "/usr/lib/python3.5/http/client.py", line 849, in connect
    (self.host,self.port), self.timeout, self.source_address)
  File "/usr/lib/python3.5/socket.py", line 711, in create_connection
    raise err
  File "/usr/lib/python3.5/socket.py", line 702, in create_connection
    sock.connect(sa)
OSError: [Errno 101] Network is unreachable

Chen

agladstein commented 4 years ago

Did you try just importing tensorflow or keras from python in the container? Did that work?

guchchcug commented 4 years ago

Hi Ariella,

That works. But I get the errors when I run my Python code directly: python MNIST-scratch.py.

Thank you!

Chen

agladstein commented 4 years ago

Ah okay! In that case, the container works for what I thought you wanted. It looks like it's not able to connect to the Amazon S3 bucket to get the MNIST data. I'm not sure why that's not working; I would need to look at your code. @julianpistorius probably has an idea.

julianpistorius commented 4 years ago

Figured it out. Looks like they block outbound network access from those nodes. Pre-downloading the mnist.npz into the ~/.keras/datasets directory worked.
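A rough sketch of that workaround in Python, using only the standard library. The URL is the one from the traceback earlier in this thread; `dataset_path` and `predownload` are hypothetical helper names, and running the download on a machine with outbound access (for example a login node) is an assumption, since the comment above does not say where it was done.

```python
import urllib.request
from pathlib import Path

# URL taken from the traceback earlier in this thread.
MNIST_URL = "https://s3.amazonaws.com/img-datasets/mnist.npz"


def dataset_path(filename, home=None):
    """Return <home>/.keras/datasets/<filename>; home defaults to the user's home."""
    base = Path(home) if home is not None else Path.home()
    return base / ".keras" / "datasets" / filename


def predownload(url=MNIST_URL, home=None):
    """Fetch `url` into the Keras dataset directory unless the file is already there."""
    target = dataset_path(url.rsplit("/", 1)[-1], home)
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(url, str(target))
    return target


print(dataset_path("mnist.npz"))
# predownload()  # assumption: run this where outbound network access is allowed
```

Once the file is in place, the script that tried to download it should find the cached copy on the compute node, provided the home directory is shared between nodes.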

guchchcug commented 4 years ago

Hi Julian and Ariella,

Thank you very much for helping me! :)

Chen
