ratt-ru / Stimela-classic

Containerized radio interferometry scripting framework -- NB: Classic version is no longer in active development, use stimela 2! See README for details.
GNU General Public License v2.0

CASA 4.7 cabs are failing in singularity #580

Closed bennahugo closed 4 years ago

bennahugo commented 4 years ago

The latest changes break the CASA 4.7 cabs when they run under Singularity:

2020-04-22 20:07:16 STIMELA.calibrate_calcycle_0_Gp_0 INFO: Starting container [604]. Timeout set to -1. The container ID is printed below.
# running singularity run --workdir /stimela_mount/output --home /home/bhugo:/stimela_home  --bind /data3/bhugo/CLUSTER_SURVEY_TAKE2/stimela_parameter_files/calibrate_calcycle_0_Gp_0-14064542293396015875786092469532.json:/stimela_mount/configfile:ro --bind /data3/bhugo/CLUSTER_SURVEY_TAKE2/stimela/stimela/cargo/cab/casa47_gaincal/src:/stimela_mount/code:ro --bind /data3/bhugo/CLUSTER_SURVEY_TAKE2/venvstimela/bin/stimela_runscript:/singularity:ro --bind /data3/bhugo/CLUSTER_SURVEY_TAKE2/msdir:/stimela_mount/msdir:rw --bind /data3/bhugo/CLUSTER_SURVEY_TAKE2/input:/stimela_mount/input:ro --bind /data3/bhugo/CLUSTER_SURVEY_TAKE2/output:/stimela_mount/output:rw --bind /data3/bhugo/CLUSTER_SURVEY_TAKE2/output/tmp:/stimela_mount/output/tmp:rw /home/bhugo/.stimela_images/stimela_casa_0.3.0.img /singularity
# perl: warning: Setting locale failed.
# perl: warning: Please check that your locale settings:
#       LANGUAGE = "en_ZA:en",
#       LC_ALL = (unset),
#       LANG = "en_ZA.UTF-8"
#     are supported and installed on your system.
# perl: warning: Falling back to the standard locale ("C").
# 
# =========================================
# The start-up time of CASA may vary
# depending on whether the shared libraries
# are cached or not.
# =========================================
# 
# setgpid( ) failed: Operation not permitted
#                    processes may be left dangling...
# CASA Version 4.7.0-REL (r38335)
#   Compiled on: Wed 2016/09/28 11:50:32 UTC
# Traceback (most recent call last):
#   File "/casa-release-4.7.0-el6/lib/python2.7/casapy.py", line 475, in <module>
#     from taskinit import *
#   File "/casa-release-4.7.0-el6/lib/python2.7/taskinit.py", line 1, in <module>
#     import pCASA
#   File "/casa-release-4.7.0-el6/lib/python2.7/pCASA.py", line 51, in <module>
#     import parallel_go
#   File "/casa-release-4.7.0-el6/lib/python2.7/parallel_go.py", line 2, in <module>
#     if not MPIEnvironment.is_mpi_enabled: from IPython.kernel import client
#   File "/casa-release-4.7.0-el6/lib/python2.7/site-packages/IPython/kernel/client.py", line 38, in <module>
#     from IPython.kernel.clientconnector import ClientConnector
#   File "/casa-release-4.7.0-el6/lib/python2.7/site-packages/IPython/kernel/clientconnector.py", line 22, in <module>
#     from IPython.kernel.config import config_manager as kernel_config_manager
#   File "/casa-release-4.7.0-el6/lib/python2.7/site-packages/IPython/kernel/config/__init__.py", line 28, in <module>
#     security_dir = get_security_dir()
#   File "/casa-release-4.7.0-el6/lib/python2.7/site-packages/IPython/genutils.py", line 1015, in get_security_dir
#     os.mkdir(security_dir, 0700)
# OSError: [Errno 13] Permission denied: '/root/.casa/ipython/security'
# Reloaded configuration
# Traceback (most recent call last):
#   File "/stimela_mount/code/run.py", line 10, in <module>
#     casa = drivecasa.Casapy(log2term=True, echo_to_stdout=True, timeout=24*3600*10)
#   File "/usr/local/lib/python2.7/dist-packages/drivecasa/interface.py", line 146, in __init__
#     self.child.expect(self.prompt, timeout=60)
#   File "/usr/local/lib/python2.7/dist-packages/pexpect/spawnbase.py", line 321, in expect
#     timeout, searchwindowsize, async)
#   File "/usr/local/lib/python2.7/dist-packages/pexpect/spawnbase.py", line 345, in expect_list
#     return exp.expect_loop(timeout)
#   File "/usr/local/lib/python2.7/dist-packages/pexpect/expect.py", line 105, in expect_loop
#     return self.eof(e)
#   File "/usr/local/lib/python2.7/dist-packages/pexpect/expect.py", line 50, in eof
#     raise EOF(msg)
# pexpect.exceptions.EOF: End Of File (EOF). Braindead platform.
# <pexpect.pty_spawn.spawn object at 0x7f82d1005910>
# command: /casa-release-4.7.0-el6/bin/casa
# args: ['/casa-release-4.7.0-el6/bin/casa', '--nologger', '--nogui', '--colors=NoColor', '--log2term']
# buffer (last 100 chars): ''
# before (last 100 chars): "00)\r\nOSError: [Errno 13] Permission denied: '/root/.casa/ipython/security'\r\nReloaded configuration\r\n"
# after: <class 'pexpect.exceptions.EOF'>
# match: None
# match_index: None
# exitstatus: 1
# flag_eof: True
# pid: 13679
# child_fd: 5
# closed: False
# timeout: 864000
# delimiter: <class 'pexpect.exceptions.EOF'>
# logfile: None
# logfile_read: <open file '<stdout>', mode 'w' at 0x7f82d3185150>
# logfile_send: None
# maxread: 2000
# ignorecase: False
# searchwindowsize: None
# delaybeforesend: 0.05
# delayafterclose: 0.1
# delayafterterminate: 0.1
# searcher: searcher_re:
#     0: re.compile("CASA <[0-9]+>:")
2020-04-22 20:07:19 STIMELA.calibrate_calcycle_0_Gp_0 ERROR: singularity returns error code 1
2020-04-22 20:07:19 STIMELA.calibrate_calcycle_0_Gp_0 ERROR: job failed at 2020-04-22 20:07:19.698919 after 0:00:02.876932
2020-04-22 20:07:19 STIMELA.calibrate_calcycle_0_Gp_0 ERROR: Traceback (most recent call last):
2020-04-22 20:07:19 STIMELA.calibrate_calcycle_0_Gp_0 ERROR:   File "/data3/bhugo/CLUSTER_SURVEY_TAKE2/stimela/stimela/recipe.py", line 574, in run
2020-04-22 20:07:19 STIMELA.calibrate_calcycle_0_Gp_0 ERROR:     job.run_job()
2020-04-22 20:07:19 STIMELA.calibrate_calcycle_0_Gp_0 ERROR:   File "/data3/bhugo/CLUSTER_SURVEY_TAKE2/stimela/stimela/recipe.py", line 302, in run_job
2020-04-22 20:07:19 STIMELA.calibrate_calcycle_0_Gp_0 ERROR:     self.job.run()
2020-04-22 20:07:19 STIMELA.calibrate_calcycle_0_Gp_0 ERROR:   File "/data3/bhugo/CLUSTER_SURVEY_TAKE2/stimela/stimela/singularity.py", line 120, in run
2020-04-22 20:07:19 STIMELA.calibrate_calcycle_0_Gp_0 ERROR:     logfile=self.logfile)
2020-04-22 20:07:19 STIMELA.calibrate_calcycle_0_Gp_0 ERROR:   File "/data3/bhugo/CLUSTER_SURVEY_TAKE2/stimela/stimela/utils/xrun_poll.py", line 187, in xrun
2020-04-22 20:07:19 STIMELA.calibrate_calcycle_0_Gp_0 ERROR:     raise StimelaCabRuntimeError("{} returns error code {}".format(command_name, status))
2020-04-22 20:07:19 STIMELA.calibrate_calcycle_0_Gp_0 ERROR: stimela.utils.StimelaCabRuntimeError: singularity returns error code 1
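
Reading the log above, the pexpect EOF looks like a downstream symptom: the CASA child exits with status 1 right after it fails to create '/root/.casa/ipython/security'. A minimal sketch for pulling such paths out of a cab log is given here, assuming the log is available as plain text. The helper name `unwritable_paths` is made up for illustration and is not part of stimela.

```python
import re

# Matches "Permission denied" errors in the form they take in the log above.
PERMISSION_DENIED = re.compile(r"\[Errno 13\] Permission denied: '([^']+)'")


def unwritable_paths(log_text):
    """Return the distinct paths reported as 'Permission denied', in first-seen order."""
    seen = []
    for match in PERMISSION_DENIED.finditer(log_text):
        path = match.group(1)
        if path not in seen:
            seen.append(path)
    return seen


# Two lines copied from the log in this issue.
sample = (
    "# OSError: [Errno 13] Permission denied: '/root/.casa/ipython/security'\n"
    "# Reloaded configuration\n"
)
print(unwritable_paths(sample))
```

Any path this reports is a candidate for a missing writable bind or a home directory that does not resolve to the mounted one inside the container; which of those applies here is not established by the log alone.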
bennahugo commented 4 years ago

Action list

We need a more rigorous testing framework;
otherwise we will keep releasing broken code.
We need Singularity images that test the CASA 4.7 calibration and the 5.x calibration, in a way similar to the MeerKAT test I just submitted for tricolour and cubical.
I would say duplicate that test and replace cubical with the two CASA tasks,
plus add a plotms step just to check that the thing exports.
https://github.com/ratt-ru/Stimela/blob/master/stimela/tests/unit_tests/test-containertech.py#L34
Do it like that in the test setup,
and delete it in the teardown.
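
The setup and teardown pattern in the list above could look roughly like the skeleton below. This is only a sketch under assumptions: `run_casa_cab` is a hypothetical stand-in for building and running a stimela recipe under Singularity, the cab names other than `casa47_gaincal` are illustrative, and the thing created in setup and deleted in teardown is assumed to be a scratch directory.

```python
import os
import shutil
import tempfile
import unittest


def run_casa_cab(cab_name, workdir):
    """Hypothetical placeholder for running one cab under Singularity.

    A real test would build and run a stimela recipe here. This stub only
    writes a marker file so that the skeleton stays runnable on its own.
    """
    marker = os.path.join(workdir, cab_name + ".done")
    with open(marker, "w") as handle:
        handle.write("ok\n")
    return marker


class TestCasaCabs(unittest.TestCase):
    # Illustrative names: one CASA 4.7 task, one CASA 5.x task, one plotms export.
    CABS = ["casa47_gaincal", "casa_gaincal", "casa_plotms"]

    def setUp(self):
        # Create the scratch directory in the test setup.
        self.workdir = tempfile.mkdtemp(prefix="stimela_test_")

    def tearDown(self):
        # Delete it again in the teardown.
        shutil.rmtree(self.workdir, ignore_errors=True)

    def test_cabs_produce_output(self):
        # Each cab is expected to leave an output product behind.
        for cab in self.CABS:
            marker = run_casa_cab(cab, self.workdir)
            self.assertTrue(os.path.exists(marker))
```

Saved to a file, this can be run with `python -m unittest <file>`. Swapping the stub for a real recipe run would turn it into the container test described above.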
bennahugo commented 4 years ago

Last known good version: https://github.com/ratt-ru/stimela.git@28457504c4a5d3c39612e13b4339a3d0b496c150

SpheMakh commented 4 years ago

Fixed