oleweidner closed this issue 10 years ago
Could you paste the contents of STDERR, STDOUT and the shell script (radicalpilot*.sh) from the compute unit '542d47b04c917a07d710e4fe'?
There is nothing with 542d47b04c917a07d710e4fe on ARCHER, neither in /work/e290/e290/ebreitmo nor in my $HOME directory. I emailed you the output I get when I run extasy with export RADICAL_PILOT_VERBOSE=DEBUG.
There should be a radical.pilot.sandbox folder in /fs4/e290/e290/ebreitmo, I think. Could you check, please? If so, cd into the pilot-* folder that was created for this particular test. Inside it you will find a unit-542d47b04c917a07d710e4fe folder containing STDERR, STDOUT and the shell script.
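As a rough sketch of the lookup described above, the unit folder can be searched for by name instead of walking the pilot-* folders by hand. The helper name `find_unit_dir` is hypothetical and not part of RADICAL-Pilot; the search depth of 3 is an assumption that covers both a `pilot-*/unit-*` layout directly under the search directory and one nested inside a `radical.pilot.sandbox` folder.

```shell
# Hypothetical helper (not part of RADICAL-Pilot): print the sandbox
# directory of one compute unit.
#   $1 = directory to search, $2 = compute unit ID
# Assumption: the unit folder sits at most three levels below $1.
find_unit_dir() {
    find "$1" -maxdepth 3 -type d -name "unit-$2" 2>/dev/null
}

# Example usage with the IDs from this thread:
# unit_dir=$(find_unit_dir /work/e290/e290/ebreitmo 542d47b04c917a07d710e4fe)
# ls -l "$unit_dir"
# cat "$unit_dir/STDERR" "$unit_dir/STDOUT"
```

If the function prints nothing, the unit folder was probably never created under that directory, which is itself useful information for debugging.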
ls -lrt /work/e290/e290/ebreitmo/pilot-542d47364c917a07d710e4f4/unit-542d47b04c917a07d710e4fe/
total 44
-rw-r--r-- 1 ebreitmo e290 523 Oct 2 13:40 postexec.py
-rw-r--r-- 1 ebreitmo e290 3950 Oct 2 13:40 pycoco.py
-rw-r--r-- 1 ebreitmo e290 33603 Oct 2 13:40 penta.top
more AGENT.STDERR
--------------------------------------------------------------------------------
This is a private computing facility. Access to this system is limited to those
who have been granted access by the operating service provider on behalf of the
issuing authority and use is restricted to the purposes for which access was
granted. All access and usage are governed by the terms and conditions of access
agreed to by all registered users and are thus subject to the provisions of the
Computer Misuse Act, 1990 under which unauthorised use is a criminal offence.
If you are not authorised to use this service you must disconnect immediately.
--------------------------------------------------------------------------------
Permission denied (publickey,keyboard-interactive).
kill: 18097: No such process
more AGENT.STDOUT
--------------------------------------------------------------------------------
*** ebreitmo Job: 590235.sdb started: 02/10/14 13:39:11 host: mom5 ***
*** ebreitmo Job: 590235.sdb started: 02/10/14 13:39:11 host: mom5 ***
*** ebreitmo Job: 590235.sdb started: 02/10/14 13:39:11 host: mom5 ***
*** ebreitmo Job: 590235.sdb started: 02/10/14 13:39:11 host: mom5 ***
--------------------------------------------------------------------------------
################################################################################
## Bootstrapper running on host: nid01221.
################################################################################
## Environment of bootstrapper process:
PE_LIBSCI_VOLATILE_PRGENV=CRAY GNU INTEL
PE_LIBSCI_GENCOMPS_CRAY_sandybridge=81
CRAY_BINUTILS_BIN=/opt/cray/cce/8.2.6/cray-binutils/x86_64-unknown-linux-gnu/bin
MODULE_VERSION_STACK=3.2.6.7
LESSKEY=/etc/lesskey.bin
PE_CXX_PKGCONFIG_LIBS=mpichcxx
NNTPSERVER=news
INFODIR=/usr/local/info:/usr/share/info:/usr/info
MANPATH=/work/y07/y07/cse/python/2.7.6/share/man:/opt/cray/mpt/6.3.1/gni/man/mpich2:/opt/pbs/12.1.400.132424/man:/opt/cray/atp/1.7.2/man:/opt/cray/libsci/12.2.0/man:/opt/cray/cce/8.2.6/man:/opt/cray/cce/8.2.6/craylibs/man:/opt/cray/cce/8.2.6/CC/man:/opt/cray/cce/8.2.6/cftn/man:/opt/cray/craype/2.1.1/man:/opt/cray/llm/default/man:/opt/cray/lustre-cray_ari_s/2.4_3.0.80_0.5.1_1.0501.7664.13.1-1.0501.15783.26.1/man:/opt/cray/alps/5.1.1-2.0501.8507.1.1.ari/share/man:/opt/modules/3.2.6.7/man:/usr/local/man:/usr/share/man:/usr/man:/opt/cray/share/man:/opt/intel/mic/share/man
PE_TRILINOS_DEFAULT_GENCOMPS_CRAY_x86_64=82
PE_LIBSCI_DEFAULT_GENCOMPS_INTEL_sandybridge=130
GCC_X86_64=/opt/gcc/4.4.4/snos
CRAY_UDREG_INCLUDE_OPTS=-I/opt/cray/udreg/2.3.2-1.0501.7914.1.13.ari/include
HOSTNAME=mom5
PE_TRILINOS_DEFAULT_VOLATILE_PKGCONFIG_PATH=/opt/cray/trilinos/11.6.1.0/@PRGENV@/@PE_TRILINOS_DEFAULT_GENCOMPS@/@PE_TRILINOS_DEFAULT_TARGET@/lib/pkgconfig
PE_PARALLEL_NETCDF_DEFAULT_VOLATILE_PKGCONFIG_PATH=/opt/cray/parallel-netcdf/1.4.0/@PRGENV@/@PE_PARALLEL_NETCDF_DEFAULT_GENCOMPS@/lib/pkgconfig
PE_NETCDF_DEFAULT_VOLATILE_PKGCONFIG_PATH=/opt/cray/netcdf/4.3.1/@PRGENV@/@PE_NETCDF_DEFAULT_GENCOMPS@/lib/pkgconfig
RCLOCAL_BASEOPTS=true
LIBRARYMODULES=acml:alps:apprentice2:atp:cray-fftw:cray-libsci:cray-mpich2:cray-petsc:cray-petsc-complex:cray-shmem:cray-tpsl:cray-trilinos:cudatoolkit:fftw:ga:hdf5:hdf5-parallel:iobuf:lgdb:libfast:libsci_acc:mpich1:mpich2:mrnet:netcdf:netcdf-hdf5parallel:netcdf-nofsync:netcdf-nofsync-hdf5parallel:ntk:onesided:papi:parallel-netcdf:petsc:petsc-complex:pmi:shmem:tpsl:trilinos:xt-atp:xt-lgdb:xt-libsci:xt-mpt:xt-papi:/etc/opt/cray/modules/site_librarymodules
CRAY_SITE_LIST_DIR=/etc/opt/cray/modules
XKEYSYMDB=/usr/share/X11/XKeysymDB
PE_SMA_VOLATILE_PKGCONFIG_PATH=/opt/cray/mpt/6.3.1/gni/sma@PE_SMA_DIR_DEFAULT64@/lib64/pkgconfig
PE_LIBSCI_DEFAULT_GENCOMPS_GNU_interlagos=48 47
PE_ENV=CRAY
CRAY_FTN_VERSION=8.2.6
CRAY_BINUTILS_ROOT=/opt/cray/cce/8.2.6/cray-binutils
PBS_ACCOUNT=e290
PKGCONFIG_ENABLED=1
ASSEMBLER_X86_64=/opt/cray/cce/8.2.6/cray-binutils/x86_64-unknown-linux-gnu/bin/as
HOST=mom5
SHELL=/bin/bash
XTOS_VERSION=5.1.29
PE_PETSC_DEFAULT_GENCOMPS_CRAY_sandybridge=81
PE_LIBSCI_GENCOMPS_GNU_interlagos=48 47
PROFILEREAD=true
HISTSIZE=1000
PE_TRILINOS_DEFAULT_VOLATILE_PRGENV=CRAY GNU INTEL
PE_TRILINOS_DEFAULT_GENCOMPS_CRAY_interlagos=82
PE_TPSL_DEFAULT_REQUIRED_PRODUCTS=PE_MPICH:PE_LIBSCI
PE_TPSL_DEFAULT_GENCOMPS_GNU_sandybridge=48 47
PE_PARALLEL_NETCDF_DEFAULT_VOLATILE_PRGENV=GNU
PE_NETCDF_DEFAULT_VOLATILE_PRGENV=GNU
FORTRAN_SYSTEM_MODULE_NAMES=ftn_lib_definitions
CRAYPE_DIR=/opt/cray/craype/2.1.1
CRAY_XPMEM_POST_LINK_OPTS=-L/opt/cray/xpmem/0.1-2.0501.48424.3.3.ari/lib64
CRAY_UGNI_POST_LINK_OPTS=-L/opt/cray/ugni/5.0-1.0501.8253.10.22.ari/lib64
TMPDIR=/tmp/pbs.590235.sdb
PBS_JOBNAME=SAGA-Python-PBS
CRAY_MPICH2_DIR=/opt/cray/mpt/6.3.1/gni/mpich2-cray/81
PE_PETSC_DEFAULT_GENCOMPS_CRAY_interlagos=81
PE_NETCDF_HDF5PARALLEL_DEFAULT_VOLATILE_PKGCONFIG_PATH=/opt/cray/netcdf-hdf5parallel/4.3.1/@PRGENV@/@PE_NETCDF_HDF5PARALLEL_DEFAULT_GENCOMPS@/lib/pkgconfig
PE_HDF5_PARALLEL_DEFAULT_VOLATILE_PKGCONFIG_PATH=/opt/cray/hdf5-parallel/1.8.12/@PRGENV@/@PE_HDF5_PARALLEL_DEFAULT_GENCOMPS@/lib/pkgconfig
PE_HDF5_DEFAULT_VOLATILE_PRGENV=GNU
PE_FFTW_DEFAULT_VOLATILE_PKGCONFIG_PATH=/opt/fftw/3.3.0.4/@PE_FFTW_DEFAULT_TARGET@/lib/pkgconfig
PE_LIBSCI_DEFAULT_GENCOMPS_GNU_x86_64=48 47
PE_GA_DEFAULT_VOLATILE_PRGENV=GNU
PE_MPICH_GENCOMPS_GNU=48 47
PE_TRILINOS_DEFAULT_GENCOMPS_INTEL_interlagos=130
PE_TPSL_DEFAULT_GENCOMPS_INTEL_x86_64=130
PE_PKGCONFIG_PRODUCTS=PE_MPICH:PE_LIBSCI
MORE=-sl
PBS_ENVIRONMENT=PBS_BATCH
PE_PETSC_DEFAULT_REQUIRED_PRODUCTS=PE_MPICH:PE_LIBSCI:PE_TPSL
PE_PAPI_DEFAULT_ACCEL_LIBS_nvidia35=-lcupti -lcudart -lcuda
QTDIR=/usr/lib/qt3
PE_CRAY_DEFAULT_FIXED_PKGCONFIG_PATH=/opt/cray/hdf5/1.8.12/CRAY/81/lib/pkgconfig:/opt/cray/ga/5.1.0.4/CRAY/81/lib/pkgconfig:/opt/cray/netcdf/4.3.1/CRAY/81/lib/pkgconfig:/opt/cray/parallel-netcdf/1.4.0/CRAY/81/lib/pkgconfig:/opt/cray/hdf5-parallel/1.8.12/CRAY/81/lib/pkgconfig:/opt/cray/netcdf-hdf5parallel/4.3.1/CRAY/81/lib/pkgconfig
PE_FORTRAN_PKGCONFIG_LIBS=mpichf90
PE_PETSC_DEFAULT_GENCOMPS_CRAY_x86_64=81
PBS_O_WORKDIR=/work/e290/e290/ebreitmo/pilot-542d47364c917a07d710e4f4
PE_TRILINOS_DEFAULT_GENCOMPS_GNU_interlagos=48 47
CRAY_PRGENVCRAY=loaded
CRAY_BINUTILS_VERSION=/opt/cray/cce/8.2.6
NCPUS=1
PE_TRILINOS_DEFAULT_GENCOMPS_INTEL_x86_64=130
PE_LIBSCI_GENCOMPS_CRAY_interlagos=81
NODE_COUNT=1
PE_SMA_DIR_CRAY_DEFAULT64=64
BUILD_OPTS=/opt/cray/craype/2.1.1/bin/build-opts
JRE_HOME=/usr/lib64/jvm/jre
PBS_TASKNUM=1
USER=ebreitmo
LD_LIBRARY_PATH=/work/y07/y07/cse/python/2.7.6/lib
PE_TPSL_DEFAULT_GENCOMPS_CRAY_x86_64=81
PE_LIBSCI_DEFAULT_VOLATILE_PRGENV=CRAY GNU INTEL
PE_FFTW_DEFAULT_TARGET_interlagos=interlagos
LS_COLORS=
PE_MPICH_FIXED_PRGENV=INTEL
PE_PKGCONFIG_LIBS=mpich:AtpSigHandler:sci_mpi_mp:sci_mp
PE_PETSC_DEFAULT_VOLATILE_PRGENV=CRAY GNU INTEL
CRAY_RCA_POST_LINK_OPTS=-L/opt/cray/rca/1.0.0-2.0501.48090.7.46.ari/lib64 -lrca
PBS_O_HOME=/home/e290/e290/ebreitmo
PE_PETSC_DEFAULT_GENCOMPS_INTEL_sandybridge=130
PE_PETSC_DEFAULT_GENCOMPS_INTEL_interlagos=130
PE_PETSC_DEFAULT_GENCOMPS_GNU_sandybridge=48 47
PE_PETSC_DEFAULT_GENCOMPS_GNU_interlagos=48 47
FTN_X86_64=/opt/cray/cce/8.2.6/cftn/x86-64
XNLSPATH=/usr/share/X11/nls
MPICH_DIR=/opt/cray/mpt/6.3.1/gni/mpich2-cray/81
PE_PAPI_DEFAULT_PKGCONFIG_VARIABLES=PE_PAPI_ACCEL_LIBS_@accelerator@
PE_LIBSCI_DEFAULT_GENCOMPS_CRAY_x86_64=81
MPICH_ABORT_ON_ERROR=1
ENV=/etc/bash.bashrc
NUM_PES=24
PE_FFTW_DEFAULT_TARGET_sandybridge=sandybridge
PE_FFTW_DEFAULT_REQUIRED_PRODUCTS=PE_MPICH
ATP_POST_LINK_OPTS=-Wl,-L/opt/cray/atp/1.7.2/lib/
HOSTTYPE=x86_64
RCLOCAL_PRGENV=true
PE_TRILINOS_DEFAULT_GENCOMPS_CRAY_sandybridge=82
PE_TPSL_DEFAULT_GENCOMPS_GNU_interlagos=48 47
PE_LIBSCI_GENCOMPS_INTEL_x86_64=130
PE_PRODUCT_LIST=CRAYPE_IVYBRIDGE:CRAY_RCA:CRAY_PMI:CRAY_LIBSCI:CRAYPE:CRAY:CRAY_LLM:CRAY_XPMEM:CRAY_DMAPP:CRAY_UGNI:CRAY_UDREG:CRAY_ALPS
FROM_HEADER=
PBS_MOMPORT=15003
PE_LIBSCI_GENCOMPS_GNU_sandybridge=48 47
FFTW_SYSTEM_WISDOM_DIR=/opt/cray/libsci/12.2.0
PAGER=less
CSHEDIT=emacs
PE_TPSL_DEFAULT_GENCOMPS_CRAY_sandybridge=81
PE_MPICH_TARGET_VAR_nvidia20=-lcudart
PE_MPICH_DEFAULT_VOLATILE_PRGENV=CRAY GNU
PE_LIBSCI_GENCOMPS_CRAY_x86_64=81
XDG_CONFIG_DIRS=/etc/xdg
PBS_O_QUEUE=standard
NUM_PPN=24
PE_PARALLEL_NETCDF_DEFAULT_GENCOMPS_GNU=48 47
PE_NETCDF_DEFAULT_GENCOMPS_GNU=48 47
PE_LIBSCI_PKGCONFIG_LIBS=sci_mpi_mp:sci_mp
NLSPATH=/opt/cray/cce/8.2.6/CC/x86-64/nls/En/%N.cat:/opt/cray/cce/8.2.6/craylibs/x86-64/nls/En/%N.cat:/opt/cray/cce/8.2.6/cftn/x86-64/nls/En/%N.cat
CRAY_LIBSCI_DIR=/opt/cray/libsci/12.2.0
CRAY_LIBSCI_BASE_DIR=/opt/cray/libsci/12.2.0
MAPP_INCLUDE_OPTS=-I/opt/cray/dmapp/7.0.1-1.0501.8315.8.4.ari/include -I/opt/cray/gni-headers/3.0-1.0501.8317.12.1.ari/include
USERMODULES=acml:alps:apprentice2:atp:blcr:cce:chapel:cray-fftw:cray-libsci:cray-mpich2:craypat:craype:cray-petsc:cray-petsc-complex:cray-shmem:cray-tpsl:cray-trilinos:cudatoolkit:ddt:fftw:ga:gcc:hdf5:hdf5-parallel:intel:iobuf:java:lgdb:libfast:libsci_acc:mpich1:mrnet:netcdf:netcdf-hdf5parallel:netcdf-nofsync:netcdf-nofsync-hdf5parallel:ntk:onesided:papi:parallel-netcdf:pathscale:perftools:petsc:petsc-complex:pgi:pmi:PrgEnv-cray:PrgEnv-gnu:PrgEnv-intel:PrgEnv-pathscale:PrgEnv-pgi:stat:totalview:tpsl:trilinos:xt-asyncpe:xt-craypat:xt-lgdb:xt-libsci:xt-mpich2:xt-mpt:xt-papi:xt-shmem:xt-totalview:/etc/opt/cray/modules/site_usermodules
MINICOM=-c on
PE_TPSL_DEFAULT_GENCOMPS_CRAY_interlagos=81
PE_PKGCONFIG_DEFAULT_PRODUCTS=PE_HDF5:PE_TPSL:PE_GA:PE_NETCDF:PE_PARALLEL_NETCDF:PE_TRILINOS:PE_HDF5_PARALLEL:PE_NETCDF_HDF5PARALLEL:PE_FFTW:PE_LIBSCI:PE_MPICH:PE_PETSC
PE_HDF5_DEFAULT_VOLATILE_PKGCONFIG_PATH=/opt/cray/hdf5/1.8.12/@PRGENV@/@PE_HDF5_DEFAULT_GENCOMPS@/lib/pkgconfig
MODULE_VERSION=3.2.6.7
MAIL=/var/spool/mail/ebreitmo
PBS_O_LOGNAME=ebreitmo
PATH=/work/y07/y07/cse/python/2.7.6/bin:/opt/pbs/12.1.400.132424/bin:/opt/cray/atp/1.7.2/bin:/opt/cray/rca/1.0.0-2.0501.48090.7.46.ari/bin:/opt/cray/pmi/5.0.3-1.0000.9981.128.2.ari/bin:/opt/cray/cce/8.2.6/cray-binutils/x86_64-unknown-linux-gnu/bin:/opt/cray/cce/8.2.6/craylibs/x86-64/bin:/opt/cray/cce/8.2.6/cftn/bin:/opt/cray/cce/8.2.6/CC/bin:/opt/cray/craype/2.1.1/bin:/opt/cray/llm/default/bin:/opt/cray/llm/default/etc:/opt/cray/xpmem/0.1-2.0501.48424.3.3.ari/bin:/opt/cray/dmapp/7.0.1-1.0501.8315.8.4.ari/bin:/opt/cray/ugni/5.0-1.0501.8253.10.22.ari/bin:/opt/cray/udreg/2.3.2-1.0501.7914.1.13.ari/bin:/opt/cray/lustre-cray_ari_s/2.4_3.0.80_0.5.1_1.0501.7664.13.1-1.0501.15783.26.1/sbin:/opt/cray/lustre-cray_ari_s/2.4_3.0.80_0.5.1_1.0501.7664.13.1-1.0501.15783.26.1/bin:/opt/cray/MySQL/5.0.64-1.0000.7096.23.2/sbin:/opt/cray/MySQL/5.0.64-1.0000.7096.23.2/bin:/opt/cray/alps/5.1.1-2.0501.8507.1.1.ari/sbin:/opt/cray/alps/5.1.1-2.0501.8507.1.1.ari/bin:/opt/cray/sdb/1.0-1.0501.48084.4.48.ari/bin:/opt/cray/nodestat/2.2-1.0501.47138.1.78.ari/bin:/opt/modules/3.2.6.7/bin:/home/e290/e290/ebreitmo/bin:/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/X11R6/bin:/usr/games:/usr/lib64/jvm/jre/bin:/usr/lib/mit/bin:/usr/lib/mit/sbin:.:/usr/lib/qt3/bin:/opt/cray/bin
PE_LIBSCI_DEFAULT_GENCOMPS_CRAY_sandybridge=81
XTPE_NETWORK_TARGET=aries
CPU=x86_64
PBS_O_LANG=en_GB.UTF-8
PE_NETCDF_HDF5PARALLEL_DEFAULT_GENCOMPS_GNU=48 47
PE_NETCDF_HDF5PARALLEL_DEFAULT_FIXED_PRGENV=CRAY INTEL
PE_HDF5_PARALLEL_DEFAULT_GENCOMPS_GNU=48 47
PE_HDF5_PARALLEL_DEFAULT_FIXED_PRGENV=CRAY INTEL
CRAY_PE_TARGET=x86-64
JAVA_BINDIR=/usr/lib64/jvm/jre/bin
PBS_JOBCOOKIE=00000000011F59F4000000000B0A874C
PE_TPSL_DEFAULT_GENCOMPS_INTEL_interlagos=130
CRAY_UDREG_POST_LINK_OPTS=-L/opt/cray/udreg/2.3.2-1.0501.7914.1.13.ari/lib64
PE_MPICH_VOLATILE_PRGENV=CRAY GNU
PE_TRILINOS_DEFAULT_GENCOMPS_GNU_sandybridge=48 47
CRAYPE_VERSION=2.1.1
CRAY_ALPS_POST_LINK_OPTS=-L/opt/cray/alps/5.1.1-2.0501.8507.1.1.ari/lib64
INPUTRC=/etc/inputrc
PWD=/work/e290/e290/ebreitmo/pilot-542d47364c917a07d710e4f4
PE_MPICH_DEFAULT_GENCOMPS_CRAY=81
INCLUDE_PATH_X86_64=/opt/cray/cce/8.2.6/craylibs/x86-64/include
_LMFILES_=/opt/modulefiles/modules/3.2.6.7:/opt/cray/ari/modulefiles/nodestat/2.2-1.0501.47138.1.78.ari:/opt/cray/ari/modulefiles/sdb/1.0-1.0501.48084.4.48.ari:/opt/cray/ari/modulefiles/alps/5.1.1-2.0501.8507.1.1.ari:/opt/cray/modulefiles/MySQL/5.0.64-1.0000.7096.23.2:/opt/cray/modulefiles/lustre-cray_ari_s/2.4_3.0.80_0.5.1_1.0501.7664.13.1-1.0501.15783.26.1:/opt/cray/ari/modulefiles/udreg/2.3.2-1.0501.7914.1.13.ari:/opt/cray/ari/modulefiles/ugni/5.0-1.0501.8253.10.22.ari:/opt/cray/ari/modulefiles/gni-headers/3.0-1.0501.8317.12.1.ari:/opt/cray/ari/modulefiles/dmapp/7.0.1-1.0501.8315.8.4.ari:/opt/cray/ari/modulefiles/xpmem/0.1-2.0501.48424.3.3.ari:/opt/modulefiles/hss-llm/7.1.0:/opt/modulefiles/Base-opts/1.0.2-1.0501.47945.4.2.ari:/opt/cray/craype/default/modulefiles/craype-network-aries:/opt/cray/moduefiles/Base-opts/1.0.2-1.0501.47945.4.2.ari:/opt/cray/craype/default/modulefiles/craype-network-aries:/opt/cray/modulefiles/craype/2.1.1:/opt/modulefiles/cce/8.2.6:/opt/cray/modulefiles/cray-libsci/12.2.0:/opt/cray/ari/modulefiles/pmi/5.0.3-1.0000.9981.128.2.ari:/opt/cray/ari/modulefiles/rca/1.0.0-2.0501.48090.7.46.ari:/opt/cray/modulefiles/atp/1.7.2:/opt/cray/modulefiles/PrgEnv-cray/5.1.29:/opt/modulefiles/pbs/12.1.400.132424:/opt/cray/craype/default/modulefiles/craype-ivybridge:/opt/cray/modulefiles/cray-mpich/6.3.1:/opt/modulefiles/packages-archer:/opt/modules/packages-archer/python/2.7.6:/opt/modules/packages-archer/cse-compute-defaults/1.0
TARGETMODULES=craype-abudhabi:craype-abudhabi-cu:craype-accel-nvidia20:craype-accel-nvidia30:craype-accel-nvidia35:craype-barcelona:craype-hugepages128K:craype-hugepages128M:craype-hugepages16M:craype-hugepages256M:craype-hugepages2M:craype-hugepages512K:craype-hugepages512M:craype-hugepages64M:craype-hugepages8M:craype-interlagos:craype-interlagos-cu:craype-istanbul:craype-ivybridge:craype-knc:craype-mc12:craype-mc8:craype-network-aries:craype-network-gemini:craype-network-seastar:craype-sandybridge:craype-shanghai:craype-target-compute_node:craype-target-local_host:craype-target-native:craype-target-petest:craype-xeon:xtpe-barcelona:xtpe-interlagos:xtpe-interlagos-cu:xtpe-istanbul:xtpe-mc12:xtpe-mc8:xtpe-network-gemini:xtpe-network-seastar:xtpe-shanghai:xtpe-target-native:xtpe-xeon:/etc/opt/cray/modules/site_targetmodules
JAVA_HOME=/usr/lib64/jvm/jre
PE_LIBSCI_GENCOMPS_INTEL_interlagos=130
PE_LIBSCI_DEFAULT_GENCOMPS_GNU_sandybridge=48 47
PE_INTEL_FIXED_PKGCONFIG_PATH=/opt/cray/mpt/6.3.1/gni/mpich2-intel/130/lib/pkgconfig
LANG=en_US.UTF-8
PBS_NODENUM=0
PE_MPICH_VOLATILE_PKGCONFIG_PATH=/opt/cray/mpt/6.3.1/gni/mpich2-@PRGENV@@PE_MPICH_DIR_DEFAULT64@/@PE_MPICH_GENCOMPS@/lib/pkgconfig
PE_MPICH_NV_LIBS_nvidia20=-lcudart
PE_LIBSCI_DEFAULT_GENCOMPS_INTEL_interlagos=130
MODULEPATH=/opt/cray/craype/default/modulefiles:/opt/cray/ari/modulefiles:/opt/cray/modulefiles:/opt/modulefiles:/opt/modules/packages-archer
PYTHONSTARTUP=/etc/pythonstart
SHMEM_ABORT_ON_ERROR=1
LOADEDMODULES=modules/3.2.6.7:nodestat/2.2-1.0501.47138.1.78.ari:sdb/1.0-1.0501.48084.4.48.ari:alps/5.1.1-2.0501.8507.1.1.ari:MySQL/5.0.64-1.0000.7096.23.2:lustre-cray_ari_s/2.4_3.0.80_0.5.1_1.0501.7664.13.1-1.0501.15783.26.1:udreg/2.3.2-1.0501.7914.1.13.ari:ugni/5.0-1.0501.8253.10.22.ari:gni-headers/3.0-1.0501.8317.12.1.ari:dmapp/7.0.1-1.0501.8315.8.4.ari:xpmem/0.1-2.0501.48424.3.3.ari:hss-llm/7.1.0:Base-opts/1.0.2-1.0501.47945.4.2.ari:craype-network-aries:craype/2.1.1:cce/8.2.6:cray-libsci/12.2.0:pmi/5.0.3-1.0000.9981.128.2.ari:rca/1.0.0-2.0501.48090.7.46.ari:atp/1.7.2:PrgEnv-cray/5.1.29:pbs/12.1.400.132424:craype-ivybridge:cray-mpich/6.3.1:packages-archer:python/2.7.6:cse-compute-defaults/1.0
PBS_JOBDIR=/home/e290/e290/ebreitmo
NUM_DEPTH=1
TZ=Europe/London
CRAY_DMAPP_POST_LINK_OPTS=-L/opt/cray/dmapp/7.0.1-1.0501.8315.8.4.ari/lib64
PE_SMA_DIR_PGI_DEFAULT64=64
CRAY_RCA_INCLUDE_OPTS=-I/opt/cray/rca/1.0.0-2.0501.48090.7.46.ari/include -I/opt/cray-hss-devel/7.1.0/include -I/opt/cray/krca/1.0.0-2.0501.47640.3.70.ari/include
PBS_O_SHELL=/bin/bash
PE_MPICH_PKGCONFIG_VARIABLES=PE_MPICH_NV_LIBS_@accelerator@
PE_LIBSCI_DEFAULT_GENCOMPS_INTEL_x86_64=130
PBS_JOBID=590235.sdb
CRAY_MPICH2_VER=6.3.1
PE_TPSL_DEFAULT_VOLATILE_PKGCONFIG_PATH=/opt/cray/tpsl/1.4.0/@PRGENV@/@PE_TPSL_DEFAULT_GENCOMPS@/@PE_TPSL_DEFAULT_TARGET@/lib/pkgconfig
PE_HDF5_DEFAULT_FIXED_PRGENV=CRAY INTEL
CRAY_PMI_POST_LINK_OPTS=-L/opt/cray/pmi/5.0.3-1.0000.9981.128.2.ari/lib64
CRAY_CC_VERSION=8.2.6
ENVIRONMENT=BATCH
PYTHON_INCLUDE_OPTS=-I /work/y07/y07/cse/python/2.7.6/include
PE_PARALLEL_NETCDF_DEFAULT_FIXED_PRGENV=CRAY INTEL
...
CRAYOS_VERSION=5.1.29
CC_X86_64=/opt/cray/cce/8.2.6/CC/x86-64
CRAY_LD_LIBRARY_PATH=/opt/cray/mpt/6.3.1/gni/mpich2-cray/81/lib:/opt/cray/rca/1.0.0-2.0501.48090.7.46.ari/lib64:/opt/cray/pmi/5.0.3-1.0000.9981.128.2.ari/lib64:/opt/cray/libsci/12.2.0/CRAY/81/x86_64/lib:/opt/cray/cce/8.2.6/CC/x86-64/lib/x86-64:/opt/cray/cce/8.2.6/craylibs/x86-64:/opt/cray/xpmem/0.1-2.0501.48424.3.3.ari/lib64:/opt/cray/dmapp/7.0.1-1.0501.8315.8.4.ari/lib64:/opt/cray/ugni/5.0-1.0501.8253.10.22.ari/lib64:/opt/cray/udreg/2.3.2-1.0501.7914.1.13.ari/lib64:/opt/cray/alps/5.1.1-2.0501.8507.1.1.ari/lib64
G_BROKEN_FILENAMES=1
PBS_NODEFILE=/var/spool/PBS/aux/590235.sdb
PE_PETSC_DEFAULT_GENCOMPS_INTEL_x86_64=130
PE_PETSC_DEFAULT_GENCOMPS_GNU_x86_64=48 47
PE_MPICH_DEFAULT_DIR_CRAY_DEFAULT64=64
JAVA_ROOT=/usr/lib64/jvm/jre
COLORTERM=1
PBS_O_PATH=/usr/local/packages/cse/imagemagick/6.8.8-2/bin:/home/y07/y07/cse/nano/2.2.6/bin:/home/y07/y07/cse/tkdiff/4.2:/work/y07/y07/cse/python/2.7.6/bin:/usr/local/packages/cse/serialJobs:/usr/local/packages/cse/bolt/bin:/usr/local/packages/cse/checkDisk:/usr/local/packages/cse/checkQueue:/usr/local/packages/cse/checkScript:/usr/local/packages/cse/budgets:/opt/pbs/12.1.400.132424/bin:/opt/cray/atp/1.7.2/bin:/opt/cray/rca/1.0.0-2.0501.48090.7.46.ari/bin:/opt/cray/alps/5.1.1-2.0501.8507.1.1.ari/sbin:/opt/cray/alps/5.1.1-2.0501.8507.1.1.ari/bin:/opt/cray/dvs/2.4_0.9.0-1.0501.1672.2.122.ari/bin:/opt/cray/csa/3.0.0-1_2.0501.47112.1.91.ari/sbin:/opt/cray/csa/3.0.0-1_2.0501.47112.1.91.ari/bin:/opt/cray/job/1.5.5-0.1_2.0501.48066.2.43.ari/bin:/opt/cray/xpmem/0.1-2.0501.48424.3.3.ari/bin:/opt/cray/dmapp/7.0.1-1.0501.8315.8.4.ari/bin:/opt/cray/pmi/5.0.3-1.0000.9981.128.2.ari/bin:/opt/cray/ugni/5.0-1.0501.8253.10.22.ari/bin:/opt/cray/udreg/2.3.2-1.0501.7914.1.13.ari/bin:/opt/cray/cce/8.2.6/cray-binutils/x86_64-unknown-linux-gnu/bin:/opt/cray/cce/8.2.6/craylibs/x86-64/bin:/opt/cray/cce/8.2.6/cftn/bin:/opt/cray/cce/8.2.6/CC/bin:/opt/cray/craype/2.1.1/bin:/opt/cray/switch/1.0-1.0501.47124.1.93.ari/bin:/opt/cray/eslogin/eswrap/1.1.0-1.010400.915.0/bin:/opt/modules/3.2.6.7/bin:/home/e290/e290/ebreitmo/bin:/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/X11R6/bin:/usr/games:/usr/lib64/jvm/jre/bin:/usr/lib/mit/bin:/usr/lib/mit/sbin:/sbin:/usr/sbin:.:/usr/lib/qt3/bin:/opt/cray/bin
_=/usr/bin/printenv
################################################################################
## Running pre-bootstrapping command
## CMDLINE: module load python
################################################################################
## Setting up forward tunnel for MongoDB to 10.60.0.52.
################################################################################
## Searching for available TCP port for tunnel in range 23000..23100.
## Found available port: 23000
################################################################################
## Launching radical-pilot-agent for 24 cores.
## CMDLINE: python radical-pilot-agent.py -b 0 -c 24 -d 50 -j APRUN -k APRUN -l PBSPRO -m 127.0.0.1:23000 -n radicalpilot -p 542d47364c917a07d710e4f4 -s 542d47334c917a07d710e4f2 -t 60 -v 0.20
CLEANUP:
--------------------------------------------------------------------------------
Resources requested: ncpus=24,place=free,walltime=01:00:00
Resources allocated: cpupercent=0,cput=00:00:01,mem=14964kb,ncpus=24,vmem=105956kb,walltime=00:00:15
*** ebreitmo Job: 590235.sdb ended: 02/10/14 13:39:25 queue: standard ***
*** ebreitmo Job: 590235.sdb ended: 02/10/14 13:39:25 queue: standard ***
*** ebreitmo Job: 590235.sdb ended: 02/10/14 13:39:25 queue: standard ***
*** ebreitmo Job: 590235.sdb ended: 02/10/14 13:39:25 queue: standard ***
--------------------------------------------------------------------------------
I updated my ssh keys.
Running coco/lsdmap from my mac on ARCHER gives errors like
2014:10:06 12:27:43 radical.pilot.MainProcess: [ERROR ] Output transfer failed: unexpected EOF (6378011b0/unit-5432760b4c917a06378011b5/mdshort.in
//Users/elenabreitmoser/coam-on-archer/mdshor 0% 0 0.0KB/s --:-- ETA//Users/elenabreitmoser/coam-on-archer/mdshor 100% 216 0.2KB/s 00:00
sftp> sftp> Couldn't send packet: Broken pipe
) (/Users/elenabreitmoser/020114/lib/python2.7/site-packages/saga/utils/pty_process.py +565 (read) : raise se.NoSuccess ("unexpected EOF (%s)" % self.tail))
Traceback (most recent call last):
  File "/Users/elenabreitmoser/020114/lib/python2.7/site-packages/radical/pilot/controller/output_file_transfer_worker.py", line 156, in run
    output_file.copy(saga.Url(abs_target), flags=copy_flags)
  File "/Users/elenabreitmoser/020114/lib/python2.7/site-packages/saga/namespace/entry.py", line 276, in copy
    ret = self._adaptor.copy_self (tgt_url, flags, ttype=ttype)
  File "/Users/elenabreitmoser/020114/lib/python2.7/site-packages/saga/adaptors/cpi/decorators.py", line 51, in wrap_function
    return sync_function (self, *args, **kwargs)
  File "/Users/elenabreitmoser/020114/lib/python2.7/site-packages/saga/adaptors/shell/shell_file.py", line 1126, in copy_self
    files_copied = self.shell.obj.stage_from_remote (src.path, tgt.path, rec_flag)
  File "/Users/elenabreitmoser/020114/lib/python2.7/site-packages/saga/utils/pty_shell.py", line 878, in stage_from_remote
    raise ptye.translate_exception (e)
NoSuccess: unexpected EOF (6378011b0/unit-5432760b4c917a06378011b5/mdshort.in
//Users/elenabreitmoser/coam-on-archer/mdshor 0% 0 0.0KB/s --:-- ETA//Users/elenabreitmoser/coam-on-archer/mdshor 100% 216 0.2KB/s 00:00
sftp> sftp> Couldn't send packet: Broken pipe
) (/Users/elenabreitmoser/020114/lib/python2.7/site-packages/saga/utils/pty_process.py +565 (read) : raise se.NoSuccess ("unexpected EOF (%s)" % self.tail))
//Users/elenabreitmoser/coam-on-archer/mdshor 0% 0 0.0KB/s --:-- ETA//Users/elenabreitmoser/coam-on-archer/mdshor 100% 216 0.2KB/s 00:00
sftp> sftp> Couldn't send packet: Broken pipe
Uh, that's an error on a layer beneath radical.pilot and saga: the ssh/sftp connection broke. Hmmm. I'll try to reproduce this on ARCHER, but right now I have no idea what could cause this. Is the problem repeatable? If so, would you mind running under export SAGA_VERBOSE=DEBUG again, and putting the (longish) output somewhere, like in a gist?
Thanks!
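A minimal sketch of how such a debug run could be captured for a gist, assuming a POSIX shell on the submitting machine. The helper name `run_with_debug_log` and the log file name are hypothetical; `SAGA_VERBOSE` and `RADICAL_PILOT_VERBOSE` are the two variables mentioned in this thread, and the extasy command line in the usage comment is the one used elsewhere in this issue.

```shell
# Hypothetical helper: run a command with SAGA and RADICAL-Pilot debug
# logging switched on, and keep a copy of everything it prints.
#   $1 = log file, remaining arguments = the command to run
run_with_debug_log() {
    log_file=$1
    shift
    # 2>&1 merges stderr into stdout so that tee records both streams.
    SAGA_VERBOSE=DEBUG RADICAL_PILOT_VERBOSE=DEBUG "$@" 2>&1 | tee "$log_file"
}

# Example usage:
# run_with_debug_log extasy-debug.log extasy --RPconfig archer.rcfg --Kconfig cocoamber.wcfg
```

The variables are set only for the one command, so later runs in the same shell are not affected. Note that the function's exit status is that of `tee`, not of the command being logged.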
Hello,
Please find the output in: https://gist.github.com/ebreitmo/d469e54b73a7ac9a076a
I set up my ssh connection from my Mac(A) to ARCHER(B) like this:
a@A:~> ssh-keygen -t rsa
a@A:~> ssh b@B mkdir -p .ssh
b@B's password:
a@A:~> cat .ssh/id_rsa.pub | ssh b@B 'cat >> .ssh/authorized_keys'
b@B's password:
Thanks, Elena
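One way to check that a key setup like the one in this message took effect, sketched under assumptions: the helper name `check_key_login` is hypothetical, `b@B` is the same placeholder as in the commands above, and the 10-second timeout is an arbitrary choice. `BatchMode` and `ConnectTimeout` are standard OpenSSH client options. With `BatchMode=yes`, ssh fails instead of prompting for a password, which is the situation an unattended process is in when it reports "Permission denied (publickey,keyboard-interactive)".

```shell
# Hypothetical helper: succeed only if non-interactive (key-based) login works.
#   $1 = user@host
check_key_login() {
    # BatchMode=yes disables password prompts; ConnectTimeout bounds the wait.
    ssh -o BatchMode=yes -o ConnectTimeout=10 "$1" true
}

# Example usage:
# check_key_login b@B && echo "key-based login works"
```

If the key is protected by a passphrase and no ssh-agent is running, this check is expected to fail even though an interactive login succeeds.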
Hi Elena, can you please check whether the sftp error above was a temporary glitch or if you get this consistently?
I ran it again today. Nothing showed up in red when I set RADICAL_PILOT_VERBOSE=DEBUG and SAGA_VERBOSE=DEBUG. The output printed for a run without the DEBUG flags is below, and 'ls' shows the following new files locally:
ls -lrt
total 2568
-rw-r--r-- 1 elenabreitmoser staff 216 8 Oct 10:29 mdshort.in
-rw-r--r-- 1 elenabreitmoser staff 225 8 Oct 10:30 min.in
-rw-r--r-- 1 elenabreitmoser staff 2165 8 Oct 10:30 penta.crd
-rw-r--r-- 1 elenabreitmoser staff 33603 8 Oct 10:30 penta.top
-rw------- 1 elenabreitmoser staff 71680 8 Oct 10:43 md_0_1.ncdf
-rw------- 1 elenabreitmoser staff 71680 8 Oct 10:44 md_0_3.ncdf
-rw------- 1 elenabreitmoser staff 71680 8 Oct 10:44 md_0_2.ncdf
-rw------- 1 elenabreitmoser staff 71680 8 Oct 10:44 md_0_5.ncdf
-rw------- 1 elenabreitmoser staff 71680 8 Oct 10:45 md_0_4.ncdf
-rw------- 1 elenabreitmoser staff 71680 8 Oct 10:45 md_0_0.ncdf
-rw------- 1 elenabreitmoser staff 71680 8 Oct 10:45 md_0_6.ncdf
-rw------- 1 elenabreitmoser staff 71680 8 Oct 10:45 md_0_7.ncdf
-rw------- 1 elenabreitmoser staff 2174 8 Oct 10:47 min12.crd
-rw------- 1 elenabreitmoser staff 2174 8 Oct 10:47 min11.crd
-rw------- 1 elenabreitmoser staff 2174 8 Oct 10:47 min10.crd
-rw------- 1 elenabreitmoser staff 2174 8 Oct 10:47 min15.crd
-rw------- 1 elenabreitmoser staff 2174 8 Oct 10:47 min14.crd
-rw------- 1 elenabreitmoser staff 2174 8 Oct 10:47 min13.crd
-rw------- 1 elenabreitmoser staff 2174 8 Oct 10:47 min17.crd
-rw------- 1 elenabreitmoser staff 2174 8 Oct 10:47 min16.crd
-rw------- 1 elenabreitmoser staff 71680 8 Oct 10:48 md_1_3.ncdf
-rw------- 1 elenabreitmoser staff 71680 8 Oct 10:48 md_1_0.ncdf
-rw------- 1 elenabreitmoser staff 71680 8 Oct 10:48 md_1_2.ncdf
-rw------- 1 elenabreitmoser staff 71680 8 Oct 10:49 md_1_4.ncdf
-rw------- 1 elenabreitmoser staff 71680 8 Oct 10:49 md_1_1.ncdf
-rw------- 1 elenabreitmoser staff 71680 8 Oct 10:49 md_1_7.ncdf
-rw------- 1 elenabreitmoser staff 71680 8 Oct 10:49 md_1_6.ncdf
-rw------- 1 elenabreitmoser staff 71680 8 Oct 10:50 md_1_5.ncdf
-rw------- 1 elenabreitmoser staff 2174 8 Oct 10:51 min22.crd
-rw------- 1 elenabreitmoser staff 2174 8 Oct 10:51 min21.crd
-rw------- 1 elenabreitmoser staff 2174 8 Oct 10:51 min20.crd
-rw------- 1 elenabreitmoser staff 2174 8 Oct 10:51 min25.crd
-rw------- 1 elenabreitmoser staff 2174 8 Oct 10:51 min24.crd
-rw------- 1 elenabreitmoser staff 2174 8 Oct 10:51 min23.crd
-rw------- 1 elenabreitmoser staff 2174 8 Oct 10:51 min27.crd
-rw------- 1 elenabreitmoser staff 2174 8 Oct 10:51 min26.crd
extasy --RPconfig archer.rcfg --Kconfig cocoamber.wcfg
Session UID: 543504454c917a02e9ccb4ae
Pilot UID : 543504484c917a02e9ccb4b0
Loading kernel configurations from /Users/elenabreitmoser/081014/lib/python2.7/site-packages/radical/ensemblemd/mdkernels/configs/amber.json
Loading kernel configurations from /Users/elenabreitmoser/081014/lib/python2.7/site-packages/radical/ensemblemd/mdkernels/configs/coco.json
Loading kernel configurations from /Users/elenabreitmoser/081014/lib/python2.7/site-packages/radical/ensemblemd/mdkernels/configs/gromacs.json
Loading kernel configurations from /Users/elenabreitmoser/081014/lib/python2.7/site-packages/radical/ensemblemd/mdkernels/configs/lsdmap.json
Loading kernel configurations from /Users/elenabreitmoser/081014/lib/python2.7/site-packages/radical/ensemblemd/mdkernels/configs/mmpbsa.json
Loading kernel configurations from /Users/elenabreitmoser/081014/lib/python2.7/site-packages/radical/ensemblemd/mdkernels/configs/namd.json
Loading kernel configurations from /Users/elenabreitmoser/081014/lib/python2.7/site-packages/radical/ensemblemd/mdkernels/configs/sleep.json
Loading kernel configurations from /Users/elenabreitmoser/081014/lib/python2.7/site-packages/radical/ensemblemd/mdkernels/configs/test.json
Cycle 0
Starting Simulation ...
Total Simulation Time : 211.780575037
Simulation Execution Time : 134.262
Starting Analysis
Submitting COCO Compute Unit
[Callback]: ComputeUnit '5435051c4c917a02e9ccb4ba' state changed to PendingInputStaging.
[Callback]: ComputeUnit '5435051c4c917a02e9ccb4ba' state changed to StagingInput.
[Callback]: ComputeUnit '5435051c4c917a02e9ccb4ba' state changed to Scheduling.
[Callback]: ComputeUnit '5435051c4c917a02e9ccb4ba' state changed to Executing.
[Callback]: ComputeUnit '5435051c4c917a02e9ccb4ba' state changed to StagingOutput.
[Callback]: ComputeUnit '5435051c4c917a02e9ccb4ba' state changed to Done.
Analysis Execution time : 74.131
Cycle 1
Starting Simulation ...
Total Simulation Time : 140.431694031
Simulation Execution Time : 130.825
Starting Analysis
Submitting COCO Compute Unit
[Callback]: ComputeUnit '543506044c917a02e9ccb4c3' state changed to PendingInputStaging.
[Callback]: ComputeUnit '543506044c917a02e9ccb4c3' state changed to StagingInput.
[Callback]: ComputeUnit '543506044c917a02e9ccb4c3' state changed to PendingExecution.
[Callback]: ComputeUnit '543506044c917a02e9ccb4c3' state changed to Scheduling.
[Callback]: ComputeUnit '543506044c917a02e9ccb4c3' state changed to Executing.
[Callback]: ComputeUnit '543506044c917a02e9ccb4c3' state changed to StagingOutput.
[Callback]: ComputeUnit '543506044c917a02e9ccb4c3' state changed to Done.
Analysis Execution time : 41.965
[Callback]: ComputePilot '543504484c917a02e9ccb4b0' state changed to Canceled.
I assume this implies Coco/Amber work now and this issue can be closed?!
Yes, this looks all good! :)
From Elena: