Closed SpheMakh closed 4 years ago
Running singularity version 2.6.0-dist on stills and it works for me Possible cause for the issue. I've seen an API change from @KshitijT's trace
Double checking with my pipeline just now
Pulling
stimela pull --force -s -im stimela/cubical:1.3.1
Works, Python 2.7:
2020-03-23 10:19:27,793 - STIMELA - INFO - STEP 1 :: de_calibrate_decal_3_0
INFO:STIMELA:STEP 1 :: de_calibrate_decal_3_0
INFO:root:Validating parameters... CAB = cubical
INFO:root:Parameters validated and saved. Parameter file is: stimela_parameter_files/de_calibrate_decal_3_0-140445341571856158495155783.json
2020-03-23 10:19:27,808 de_calibrate_decal_3_0 - Instantiating container [643]. Timeout set to -1. The container ID is printed below.
Running: singularity instance.start --bind /home/hugo/CLUSTER_SURVEY/A1300/stimela/stimela:/scratch/stimela:ro --bind /home/hugo/CLUSTER_SURVEY/A1300.dbg/stimela_parameter_files/de_calibrate_decal_3_0-140445341571856158495155783.json:/scratch/configfile:ro --bind /home/hugo/CLUSTER_SURVEY/A1300/stimela/stimela/cargo/cab/cubical/src:/scratch/code:ro --bind /home/hugo/CLUSTER_SURVEY/A1300/stimela/stimela/cargo/cab/singularity_run:/singularity:rw --bind /home/hugo/CLUSTER_SURVEY/A1300.dbg/msdir:/scratch/msdir:rw --bind /home/hugo/CLUSTER_SURVEY/A1300.dbg/input:/scratch/input:ro --bind /home/hugo/CLUSTER_SURVEY/A1300.dbg/output/.currentjob.log:/scratch/logfile:rw --bind /home/hugo/CLUSTER_SURVEY/A1300.dbg/output:/scratch/output:rw --bind /home/hugo/CLUSTER_SURVEY/A1300.dbg/output/tmp:/scratch/output/tmp:rw -c /home/hugo/.stimela_images/stimela_cubical_1.3.1.img 6432020-03-23 10:19:29,817 de_calibrate_decal_3_0 - Starting container [643]. Timeout set to -1. The container ID is printed below.
2020-03-23 10:19:29,818 de_calibrate_decal_3_0 - Running: singularity run instance://643
Starting virtual X frame buffer: Xvfb.
2020-03-23 10:19:30,429 de_calibrate_decal_3_0 - tch/output/J1131-1955_decal_3.DicoModel@/scratch/output/decal_3.0.dE.reg:/scratch/output/J1131-1955_decal_3.DicoModel@/scratch/output/decal_3.0.dE.reg --model-ddes auto --weight-column WEIGHT --flags-apply FLAG --flags-auto-init legacy --madmax-enable 0 --madmax-global-threshold 0,0 --madmax-threshold 0,0,10 --sol-stall-quorum 0.95 --sol-term-iters 50,90,50,90 --sol-min-bl 110.0 --sol-max-bl 0 --bbc-save-to {out[name]}-BBC-field_{sel[field]}-ddid_{sel[ddid]}.parmdb --dist-max-chunks 20 --dist-nthread 20 --out-name /scratch/output/J1131-19551584951557.83 --out-overwrite 1 --out-mode sr --out-column DE_DATA --out-model-column MODEL_OUT --out-subtract-dirs 1: --out-casa-gaintables 0 --out-plots 1 --dist-nworker 20 --log-verbose solver=2 --g-time-int 32 --g-freq-int 0 --g-clip-low 0 --g-clip-high 0 --g-type complex-diag --g-update-type phase-diag --g-max-prior-error 0.35 --g-max-post-error 0.35 --g-max-iter 100 --dd-dd-term 1 --dd-time-int 32 --dd-freq-int 64 --dd-clip-low 0 --dd-clip-high 0 --dd-type complex-diag --dd-fix-dirs 0 --dd-max-prior-error 0.35 --dd-max-post-error 0.35 --dd-max-iter 200 --degridding-OverS 11 --degridding-Support 7 --degridding-Nw 100 --degridding-wmax 0 --degridding-Padding 1.7 --degridding-NDegridBand 15 --degridding-MaxFacetSize 0.15 --degridding-MinNFacetPerAxis 1
2020-03-23 10:19:31,432 de_calibrate_decal_3_0 - /usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py:935: UserWarning: Attempting to work in a virtualenv. If you encounter problems, please install IPython inside the virtualenv.
2020-03-23 10:19:31,432 de_calibrate_decal_3_0 - warn("Attempting to work in a virtualenv. If you encounter problems, please "
2020-03-23 10:19:35,045 de_calibrate_decal_3_0 - INFO - 08:19:35 - solver [0.2/0.2 5.2/5.2 0.0Gb] set console verbosity level solver=2
2020-03-23 10:19:35,045 de_calibrate_decal_3_0 - INFO - 08:19:35 - main [0.2/0.2 5.2/5.2 0.0Gb] started /usr/local/bin/gocubical --sol-jones g,dd --data-ms /scratch/msdir/A1300.1GC.dbg.ms --data-column CORRECTED_DATA --data-time-chunk 32 --data-freq-chunk 0 --model-list MODEL_DATA+-/scratch/output/J1131-1955_decal_3.DicoModel@/scratch/output/decal_3.0.dE.reg:/scratch/output/J1131-1955_decal_3.DicoModel@/scratch/output/decal_3.0.dE.reg --model-ddes auto --weight-column WEIGHT --flags-apply FLAG --flags-auto-init legacy --madmax-enable 0 --madmax-global-threshold 0,0 --madmax-threshold 0,0,10 --sol-stall-quorum 0.95 --sol-term-iters 50,90,50,90 --sol-min-bl 110.0 --sol-max-bl 0 --bbc-save-to {out[name]}-BBC-field_{sel[field]}-ddid_{sel[ddid]}.parmdb --dist-max-chunks 20 --dist-nthread 20 --out-name /scratch/output/J1131-19551584951557.83 --out-overwrite 1 --out-mode sr --out-column DE_DATA --out-model-column MODEL_OUT --out-subtract-dirs 1: --out-casa-gaintables 0 --out-plots 1 --dist-nworker 20 --log-verbose solver=2 --g-time-int 32 --g-freq-int 0 --g-clip-low 0 --g-clip-high 0 --g-type complex-diag --g-update-type phase-diag --g-max-prior-error 0.35 --g-max-post-error 0.35 --g-max-iter 100 --dd-dd-term 1 --dd-time-int 32 --dd-freq-int 64 --dd-clip-low 0 --dd-clip-high 0 --dd-type complex-diag --dd-fix-dirs 0 --dd-max-prior-error 0.35 --dd-max-post-error 0.35 --dd-max-iter 200 --degridding-OverS 11 --degridding-Support 7 --degridding-Nw 100 --degridding-wmax 0 --degridding-Padding 1.7 --degridding-NDegridBand 15 --degridding-MaxFacetSize 0.15 --degridding-MinNFacetPerAxis 1
2020-03-23 10:19:35,045 de_calibrate_decal_3_0 - INFO - 08:19:35 - main [0.2/0.2 5.2/5.2 0.0Gb] output basename explicitly set to /scratch/output/J1131-19551584951557.83, --out-dir setting ignored
2020-03-23 10:19:35,045 de_calibrate_decal_3_0 - INFO - 08:19:35 - main [0.2/0.2 5.2/5.2 0.0Gb] Selected Options:
2020-03-23 10:19:35,046 de_calibrate_decal_3_0 - INFO - 08:19:35 - main [0.2/0.2 5.2/5.2 0.0Gb] [data] Visibility data options
2020-03-23 10:19:35,046 de_calibrate_decal_3_0 - INFO - 08:19:35 - main [0.2/0.2 5.2/5.2 0.0Gb] - ms ................................................ = /scratch/msdir/A1300.1GC.dbg.ms
2020-03-23 10:19:35,046 de_calibrate_decal_3_0 - INFO - 08:19:35 - main [0.2/0.2 5.2/5.2 0.0Gb] - column ............................................ = CORRECTED_DATA
Works Python 3.6.8 (after unrelated residual python 3 fixes to the npyscreen package):
INFO:STIMELA:STEP 1 :: de_calibrate_decal_3_0
INFO:root:Validating parameters... CAB = cubical
INFO:root:Parameters validated and saved. Parameter file is: stimela_parameter_files/de_calibrate_decal_3_0-1401676048785121584952204901566.json
Instantiating container [685]. Timeout set to -1. The container ID is printed below.
Running: singularity instance.start --bind /home/hugo/CLUSTER_SURVEY/A1300/stimela/stimela:/scratch/stimela:ro --bind /home/hugo/CLUSTER_SURVEY/A1300.dbg/stimela_parameter_files/de_calibrate_decal_3_0-1401676048785121584952204901566.json:/scratch/configfile:ro --bind /home/hugo/CLUSTER_SURVEY/A1300/stimela/stimela/cargo/cab/cubical/src:/scratch/code:ro --bind /home/hugo/CLUSTER_SURVEY/A1300/stimela/stimela/cargo/cab/singularity_run:/singularity:rw --bind /home/hugo/CLUSTER_SURVEY/A1300.dbg/msdir:/scratch/msdir:rw --bind /home/hugo/CLUSTER_SURVEY/A1300.dbg/input:/scratch/input:ro --bind /home/hugo/CLUSTER_SURVEY/A1300.dbg/output/.currentjob.log:/scratch/logfile:rw --bind /home/hugo/CLUSTER_SURVEY/A1300.dbg/output:/scratch/output:rw --bind /home/hugo/CLUSTER_SURVEY/A1300.dbg/output/tmp:/scratch/output/tmp:rw -c /home/hugo/.stimela_images/stimela_cubical_1.3.1.img 685
Starting container [685]. Timeout set to -1. The container ID is printed below.
Running: singularity run instance://685
Starting virtual X frame buffer: Xvfb.
tch/output/J1131-1955_decal_3.DicoModel@/scratch/output/decal_3.0.dE.reg:/scratch/output/J1131-1955_decal_3.DicoModel@/scratch/output/decal_3.0.dE.reg --model-ddes auto --weight-column WEIGHT --flags-apply FLAG --flags-auto-init legacy --madmax-enable 0 --madmax-global-threshold 0,0 --madmax-threshold 0,0,10 --sol-stall-quorum 0.95 --sol-term-iters 50,90,50,90 --sol-min-bl 110.0 --sol-max-bl 0 --bbc-save-to {out[name]}-BBC-field_{sel[field]}-ddid_{sel[ddid]}.parmdb --dist-max-chunks 20 --dist-nthread 20 --out-name /scratch/output/J1131-19551584952204.9014888 --out-overwrite 1 --out-mode sr --out-column DE_DATA --out-model-column MODEL_OUT --out-subtract-dirs 1: --out-casa-gaintables 0 --out-plots 1 --dist-nworker 20 --log-verbose solver=2 --g-time-int 32 --g-freq-int 0 --g-clip-low 0 --g-clip-high 0 --g-type complex-diag --g-update-type phase-diag --g-max-prior-error 0.35 --g-max-post-error 0.35 --g-max-iter 100 --dd-dd-term 1 --dd-time-int 32 --dd-freq-int 64 --dd-clip-low 0 --dd-clip-high 0 --dd-type complex-diag --dd-fix-dirs 0 --dd-max-prior-error 0.35 --dd-max-post-error 0.35 --dd-max-iter 200 --degridding-OverS 11 --degridding-Support 7 --degridding-Nw 100 --degridding-wmax 0 --degridding-Padding 1.7 --degridding-NDegridBand 15 --degridding-MaxFacetSize 0.15 --degridding-MinNFacetPerAxis 1
/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py:935: UserWarning: Attempting to work in a virtualenv. If you encounter problems, please install IPython inside the virtualenv.
warn("Attempting to work in a virtualenv. If you encounter problems, please "
INFO - 08:30:17 - solver [0.2/0.2 5.2/5.2 0.0Gb] set console verbosity level solver=2
INFO - 08:30:17 - main [0.2/0.2 5.2/5.2 0.0Gb] started /usr/local/bin/gocubical --sol-jones g,dd --data-ms /scratch/msdir/A1300.1GC.dbg.ms --data-column CORRECTED_DATA --data-time-chunk 32 --data-freq-chunk 0 --model-list MODEL_DATA+-/scratch/output/J1131-1955_decal_3.DicoModel@/scratch/output/decal_3.0.dE.reg:/scratch/output/J1131-1955_decal_3.DicoModel@/scratch/output/decal_3.0.dE.reg --model-ddes auto --weight-column WEIGHT --flags-apply FLAG --flags-auto-init legacy --madmax-enable 0 --madmax-global-threshold 0,0 --madmax-threshold 0,0,10 --sol-stall-quorum 0.95 --sol-term-iters 50,90,50,90 --sol-min-bl 110.0 --sol-max-bl 0 --bbc-save-to {out[name]}-BBC-field_{sel[field]}-ddid_{sel[ddid]}.parmdb --dist-max-chunks 20 --dist-nthread 20 --out-name /scratch/output/J1131-19551584952204.9014888 --out-overwrite 1 --out-mode sr --out-column DE_DATA --out-model-column MODEL_OUT --out-subtract-dirs 1: --out-casa-gaintables 0 --out-plots 1 --dist-nworker 20 --log-verbose solver=2 --g-time-int 32 --g-freq-int 0 --g-clip-low 0 --g-clip-high 0 --g-type complex-diag --g-update-type phase-diag --g-max-prior-error 0.35 --g-max-post-error 0.35 --g-max-iter 100 --dd-dd-term 1 --dd-time-int 32 --dd-freq-int 64 --dd-clip-low 0 --dd-clip-high 0 --dd-type complex-diag --dd-fix-dirs 0 --dd-max-prior-error 0.35 --dd-max-post-error 0.35 --dd-max-iter 200 --degridding-OverS 11 --degridding-Support 7 --degridding-Nw 100 --degridding-wmax 0 --degridding-Padding 1.7 --degridding-NDegridBand 15 --degridding-MaxFacetSize 0.15 --degridding-MinNFacetPerAxis 1
INFO - 08:30:17 - main [0.2/0.2 5.2/5.2 0.0Gb] output basename explicitly set to /scratch/output/J1131-19551584952204.9014888, --out-dir setting ignored
INFO - 08:30:17 - main [0.2/0.2 5.2/5.2 0.0Gb] Selected Options:
INFO - 08:30:17 - main [0.2/0.2 5.2/5.2 0.0Gb] [data] Visibility data options
INFO - 08:30:17 - main [0.2/0.2 5.2/5.2 0.0Gb] - ms ................................................ = /scratch/msdir/A1300.1GC.dbg.ms
INFO - 08:30:17 - main [0.2/0.2 5.2/5.2 0.0Gb] - column ............................................ = CORRECTED_DATA
INFO - 08:30:17 - main [0.2/0.2 5.2/5.2 0.0Gb] - time-chunk ........................................ = 32
INFO - 08:30:17 - main [0.2/0.2 5.2/5.2 0.0Gb] - freq-chunk ........................................ = 0
INFO - 08:30:17 - main [0.2/0.2 5.2/5.2 0.0Gb] - rebin-time ........................................ = 1
INFO - 08:30:17 - main [0.2/0.2 5.2/5.2 0.0Gb] - rebin-freq ........................................ = 1
INFO - 08:30:17 - main [0.2/0.2 5.2/5.2 0.0Gb] - chunk-by .......................................... = SCAN_NUMBER
Parameters used:
INFO - 08:30:17 - main [0.2/0.2 5.2/5.2 0.0Gb] started /usr/local/bin/gocubical --sol-jones g,dd --data-ms /scratch/msdir/A1300.1GC.dbg.ms --data-column CORRECTED_DATA --data-time-chunk 32 --data-freq-chunk 0 --model-list MODEL_DATA+-/scratch/output/J1131-1955_decal_3.DicoModel@/scratch/output/decal_3.0.dE.reg:/scratch/output/J1131-1955_decal_3.DicoModel@/scratch/output/decal_3.0.dE.reg --model-ddes auto --weight-column WEIGHT --flags-apply FLAG --flags-auto-init legacy --madmax-enable 0 --madmax-global-threshold 0,0 --madmax-threshold 0,0,10 --sol-stall-quorum 0.95 --sol-term-iters 50,90,50,90 --sol-min-bl 110.0 --sol-max-bl 0 --bbc-save-to {out[name]}-BBC-field_{sel[field]}-ddid_{sel[ddid]}.parmdb --dist-max-chunks 20 --dist-nthread 20 --out-name /scratch/output/J1131-19551584952204.9014888 --out-overwrite 1 --out-mode sr --out-column DE_DATA --out-model-column MODEL_OUT --out-subtract-dirs 1: --out-casa-gaintables 0 --out-plots 1 --dist-nworker 20 --log-verbose solver=2 --g-time-int 32 --g-freq-int 0 --g-clip-low 0 --g-clip-high 0 --g-type complex-diag --g-update-type phase-diag --g-max-prior-error 0.35 --g-max-post-error 0.35 --g-max-iter 100 --dd-dd-term 1 --dd-time-int 32 --dd-freq-int 64 --dd-clip-low 0 --dd-clip-high 0 --dd-type complex-diag --dd-fix-dirs 0 --dd-max-prior-error 0.35 --dd-max-post-error 0.35 --dd-max-iter 200 --degridding-OverS 11 --degridding-Support 7 --degridding-Nw 100 --degridding-wmax 0 --degridding-Padding 1.7 --degridding-NDegridBand 15 --degridding-MaxFacetSize 0.15 --degridding-MinNFacetPerAxis 1
Can't reproduce the problem above.
@bennahugo the initial post of this issue is that the container fails before cubical is invoked. I'm happy to hear that you don't have that problem, but I'm not sure we've found a solution that will work for everybody yet. Can we keep this issue open until it is clear what is causing the failure, and how all users can prevent it from happening?
Again, is it possible to get a diagnose for future reference? I am not sure I understand what the core of the issue was.
The problem is it works straight out of the box for me on master using singularity 2.6.0-dist. Can you upgrade your version of singularity because as @paoloserra says it is failing before it even starts indicating that the container is never started.
Nothing fundamentally wrong in the python as far as I can tell. Can you please read output/.current_log after failure to see if nothing is printed there? Thanks!
After much digging, I finally got the error from the cubical run:
meerkathi - 2020-03-23 21:57:24,866 CRITICAL - model-list MODEL_DATA --model-ddes auto --montblanc-dtype float --weight-column WEIGHT --madmax-enable 1 --madmax-estimate corr --madmax-plot 1 --madmax-threshold 0,10 --sol-term-iters 5,0 --bbc-save-to /scratch/output/continuum/selfcal_products/bbc-gains-1-1557766852_32chan-CGCG021_063_corr.parmdb --dist-ncpu 5 --dist-max-chunks 4 --out-name /scratch/output/continuum/selfcal_products/mypipelinerun_1557766852_32chan-CGCG021_063_corr_1_cubical --out-overwrite 1 --out-mode sc --out-casa-gaintables 1 --out-plots 1 --dist-nworker 0 --log-boring 1 --g-time-int 120 --g-freq-int 0 --g-clip-low 0.5 --g-clip-high 2.0 --g-solvable 1 --g-type phase-diag --g-save-to /scratch/output/continuum/selfcal_products/g-gains-1-1557766852_32chan-CGCG021_063_corr.parmdb --degridding-OverS 11 --degridding-Support 7 --degridding-Nw 100 --degridding-wmax 0 --degridding-Padding 1.7 --degridding-NDegridBand 16 --degridding-MaxFacetSize 0.25 --degridding-MinNFacetPerAxis 1
meerkathi - 2020-03-23 21:57:25,971 CRITICAL - /usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py:935: UserWarning: Attempting to work in a virtualenv. If you encounter problems, please install IPython inside the virtualenv.
meerkathi - 2020-03-23 21:57:25,971 CRITICAL - warn("Attempting to work in a virtualenv. If you encounter problems, please "
meerkathi - 2020-03-23 21:57:27,076 CRITICAL - Traceback (most recent call last):
meerkathi - 2020-03-23 21:57:27,077 CRITICAL - File "/usr/local/bin/gocubical", line 5, in <module>
meerkathi - 2020-03-23 21:57:27,077 CRITICAL - from cubical.main import main
meerkathi - 2020-03-23 21:57:27,077 CRITICAL - File "/usr/local/lib/python3.6/dist-packages/cubical/main.py", line 93, in <module>
meerkathi - 2020-03-23 21:57:27,078 CRITICAL - from cubical.data_handler.ms_data_handler import MSDataHandler
meerkathi - 2020-03-23 21:57:27,078 CRITICAL - File "/usr/local/lib/python3.6/dist-packages/cubical/data_handler/ms_data_handler.py", line 24, in <module>
meerkathi - 2020-03-23 21:57:27,078 CRITICAL - from cubical.data_handler.ms_tile import RowChunk, MSTile
meerkathi - 2020-03-23 21:57:27,078 CRITICAL - File "/usr/local/lib/python3.6/dist-packages/cubical/data_handler/ms_tile.py", line 22, in <module>
meerkathi - 2020-03-23 21:57:27,079 CRITICAL - from .DicoSourceProvider import DicoSourceProvider
meerkathi - 2020-03-23 21:57:27,079 CRITICAL - File "/usr/local/lib/python3.6/dist-packages/cubical/data_handler/DicoSourceProvider.py", line 3, in <module>
meerkathi - 2020-03-23 21:57:27,079 CRITICAL - from DDFacet.Imager import ModModelMachine as MMM
meerkathi - 2020-03-23 21:57:27,079 CRITICAL - File "/usr/local/lib/python3.6/dist-packages/DDFacet/__init__.py", line 23, in <module>
meerkathi - 2020-03-23 21:57:27,080 CRITICAL - __version__ = pkg_resources.require("DDFacet")[0].version
meerkathi - 2020-03-23 21:57:27,080 CRITICAL - File "/usr/local/lib/python3.6/dist-packages/pkg_resources/__init__.py", line 901, in require
meerkathi - 2020-03-23 21:57:27,080 CRITICAL - needed = self.resolve(parse_requirements(requirements))
meerkathi - 2020-03-23 21:57:27,080 CRITICAL - File "/usr/local/lib/python3.6/dist-packages/pkg_resources/__init__.py", line 792, in resolve
meerkathi - 2020-03-23 21:57:27,081 CRITICAL - raise VersionConflict(dist, req).with_context(dependent_req)
meerkathi - 2020-03-23 21:57:27,081 CRITICAL - pkg_resources.ContextualVersionConflict: (scipy 1.4.1 (/usr/local/lib/python3.6/dist-packages), Requirement.parse('scipy<=1.3.3'), {'DDFacet'})
meerkathi - 2020-03-23 21:57:27,081 CRITICAL - If you suspect this is an IPython 7.12.0 bug, please report it at:
meerkathi - 2020-03-23 21:57:27,081 CRITICAL - https://github.com/ipython/ipython/issues
meerkathi - 2020-03-23 21:57:27,082 CRITICAL - or send an email to the mailing list at ipython-dev@python.org
meerkathi - 2020-03-23 21:57:27,082 CRITICAL - You can print a more detailed traceback right now with "%tb", or use "%debug"
meerkathi - 2020-03-23 21:57:27,082 CRITICAL - to interactively debug it.
meerkathi - 2020-03-23 21:57:27,082 CRITICAL - Extra-detailed tracebacks for bug-reporting purposes can be enabled via:
meerkathi - 2020-03-23 21:57:27,083 CRITICAL - %config Application.verbose_crash=True
meerkathi - 2020-03-23 21:57:28,891 CRITICAL - Traceback (most recent call last):
meerkathi - 2020-03-23 21:57:28,891 CRITICAL - File "/scratch/code/run.py", line 55, in <module>
meerkathi - 2020-03-23 21:57:28,891 CRITICAL - utils.xrun(cab['binary'], parset+opts)
meerkathi - 2020-03-23 21:57:28,892 CRITICAL - File "/scratch/stimela/utils/__init__.py", line 181, in xrun
meerkathi - 2020-03-23 21:57:28,892 CRITICAL - '%s: returns errr code %d' % (command, process.returncode))
meerkathi - 2020-03-23 21:57:28,892 CRITICAL - utils.StimelaCabRuntimeError: gocubical: returns errr code 1
Looks like a broken DDF dependency
May need to teak the DDF Dockerfile a little bit
This is in the stimela master branch?
Yup, but only for singularity. Docker works just fine
Please do not modify the images. I think I know what is going on: I just tested cubical in Python2.7 and 3 with singularity and the very latest image in master as I've shown - both on stills.ru and my desktop.
Three things:
I find singularity a very poor substitute for docker. Maybe we should drop support in favour of podman
I'm running the same test again myself with the steps outlined above. Give me a few minutes.
Confirmed problem is your setup. Please check:
WARNING: pull for Docker Hub is not guaranteed to produce the
WARNING: same image on repeated pull. Use Singularity Registry
WARNING: (shub://) to pull exactly equivalent images.
Docker image path: index.docker.io/stimela/cubical:1.3.1
Cache folder set to /home/hugo/.singularity/docker
Remove the cache folder, remove the image and remove .local. Repull the image
INFO:STIMELA:STEP 1 :: de_calibrate_decal_3_0
INFO:root:Validating parameters... CAB = cubical
INFO:root:Parameters validated and saved. Parameter file is: stimela_parameter_files/de_calibrate_decal_3_0-1397541405695201585030427110587.json
Instantiating container [85e]. Timeout set to -1. The container ID is printed below.
Running: singularity instance.start --bind /home/hugo/CLUSTER_SURVEY/A1300/stimela/stimela:/scratch/stimela:ro --bind /home/hugo/CLUSTER_SURVEY/A1300.dbg/stimela_parameter
_files/de_calibrate_decal_3_0-1397541405695201585030427110587.json:/scratch/configfile:ro --bind /home/hugo/CLUSTER_SURVEY/A1300/stimela/stimela/cargo/cab/cubical/src:/scra
tch/code:ro --bind /home/hugo/CLUSTER_SURVEY/A1300/stimela/stimela/cargo/cab/singularity_run:/singularity:rw --bind /home/hugo/CLUSTER_SURVEY/A1300.dbg/msdir:/scratch/msdir
:rw --bind /home/hugo/CLUSTER_SURVEY/A1300.dbg/input:/scratch/input:ro --bind /home/hugo/CLUSTER_SURVEY/A1300.dbg/output/.currentjob.log:/scratch/logfile:rw --bind /home/hu
go/CLUSTER_SURVEY/A1300.dbg/output:/scratch/output:rw --bind /home/hugo/CLUSTER_SURVEY/A1300.dbg/output/tmp:/scratch/output/tmp:rw -c /home/hugo/.stimela_images/stimela_cub
ical_1.3.1.img 85e
Starting container [85e]. Timeout set to -1. The container ID is printed below.
Running: singularity run instance://85e
Starting virtual X frame buffer: Xvfb.
tch/output/J1131-1955_decal_3.DicoModel@/scratch/output/decal_3.0.dE.reg:/scratch/output/J1131-1955_decal_3.DicoModel@/scratch/output/decal_3.0.dE.reg --model-ddes auto --w
eight-column WEIGHT --flags-apply FLAG --flags-auto-init legacy --madmax-enable 0 --madmax-global-threshold 0,0 --madmax-threshold 0,0,10 --sol-stall-quorum 0.95 --sol-term
-iters 50,90,50,90 --sol-min-bl 110.0 --sol-max-bl 0 --bbc-save-to {out[name]}-BBC-field_{sel[field]}-ddid_{sel[ddid]}.parmdb --dist-max-chunks 20 --dist-nthread 20 --out-n
ame /scratch/output/J1131-19551585030427.1105072 --out-overwrite 1 --out-mode sr --out-column DE_DATA --out-model-column MODEL_OUT --out-subtract-dirs 1: --out-casa-gaintab
les 0 --out-plots 1 --dist-nworker 20 --log-verbose solver=2 --g-time-int 32 --g-freq-int 0 --g-clip-low 0 --g-clip-high 0 --g-type complex-diag --g-update-type phase-diag
--g-max-prior-error 0.35 --g-max-post-error 0.35 --g-max-iter 100 --dd-dd-term 1 --dd-time-int 32 --dd-freq-int 64 --dd-clip-low 0 --dd-clip-high 0 --dd-type complex-diag -
-dd-fix-dirs 0 --dd-max-prior-error 0.35 --dd-max-post-error 0.35 --dd-max-iter 200 --degridding-OverS 11 --degridding-Support 7 --degridding-Nw 100 --degridding-wmax 0 --d
egridding-Padding 1.7 --degridding-NDegridBand 15 --degridding-MaxFacetSize 0.15 --degridding-MinNFacetPerAxis 1
^[[B^[[B^[[B/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py:935: UserWarning: Attempting to work in a virtualenv. If you encounter problems, please
install IPython inside the virtualenv.
warn("Attempting to work in a virtualenv. If you encounter problems, please "
INFO - 06:13:57 - solver [0.2/0.2 5.2/5.2 0.0Gb] set console verbosity level solver=2
INFO - 06:13:57 - main [0.2/0.2 5.2/5.2 0.0Gb] started /usr/local/bin/gocubical --sol-jones g,dd --data-ms /scratch/msdir/A1300.1GC.dbg.ms --data-column
CORRECTED_DATA --data-time-chunk 32 --data-freq-chunk 0 --model-list MODEL_DATA+-/scratch/output/J1131-1955_decal_3.DicoModel@/scratch/output/decal_3.0.dE.reg:/scratch/out
put/J1131-1955_decal_3.DicoModel@/scratch/output/decal_3.0.dE.reg --model-ddes auto --weight-column WEIGHT --flags-apply FLAG --flags-auto-init legacy --madmax-enable 0 --m
admax-global-threshold 0,0 --madmax-threshold 0,0,10 --sol-stall-quorum 0.95 --sol-term-iters 50,90,50,90 --sol-min-bl 110.0 --sol-max-bl 0 --bbc-save-to {out[name]}-BBC-fi
eld_{sel[field]}-ddid_{sel[ddid]}.parmdb --dist-max-chunks 20 --dist-nthread 20 --out-name /scratch/output/J1131-19551585030427.1105072 --out-overwrite 1 --out-mode sr --ou
t-column DE_DATA --out-model-column MODEL_OUT --out-subtract-dirs 1: --out-casa-gaintables 0 --out-plots 1 --dist-nworker 20 --log-verbose solver=2 --g-time-int 32 --g-freq
-int 0 --g-clip-low 0 --g-clip-high 0 --g-type complex-diag --g-update-type phase-diag --g-max-prior-error 0.35 --g-max-post-error 0.35 --g-max-iter 100 --dd-dd-term 1 --dd
-time-int 32 --dd-freq-int 64 --dd-clip-low 0 --dd-clip-high 0 --dd-type complex-diag --dd-fix-dirs 0 --dd-max-prior-error 0.35 --dd-max-post-error 0.35 --dd-max-iter 200 -
-degridding-OverS 11 --degridding-Support 7 --degridding-Nw 100 --degridding-wmax 0 --degridding-Padding 1.7 --degridding-NDegridBand 15 --degridding-MaxFacetSize 0.15 --de
gridding-MinNFacetPerAxis 1
INFO - 06:13:57 - main [0.2/0.2 5.2/5.2 0.0Gb] output basename explicitly set to /scratch/output/J1131-19551585030427.1105072, --out-dir setting ignored
INFO - 06:13:57 - main [0.2/0.2 5.2/5.2 0.0Gb] Selected Options:
I am sitting on Python 3.6.8:
commit a62cfbf839b96fc97e770259b49fb925ec373e73 (HEAD -> master, origin/master, origin/HEAD)
Merge: accc93d d901c59
Author: Sphesihle Makhathini <sphemakh@gmail.com>
Date: Sun Mar 22 20:56:24 2020 +0200
Merge pull request #533 from ratt-ru/log2file
revive step file logging
Testing this now...
I used MeerKATHI master with Stimela master, crashed when executing cubical in singularity, all info below.
@SpheMakh please see the test results: System information: master_sysinfo.txt Script executed: master.sh.txt Config file: carateConfig_singularity.yml.txt Resulting log: log-meerkathi_test_carateConfig_singularity.txt
The first run using docker ran through all right.
@SpheMakh I asked you not to install dependencies into the image. I do not think it is the image itself is at fault. Have you tested removing the singularity cache? Somewhere there is is a SHA mismatch coming in because this image works correctly in Docker (I'm using that on stevie). Please do not merge this yet.
I request that we revert this, because I want to get to the root cause and not just put in a temporary kludge plate to fix this. If it is a core bug in sigularity we need to try raise the issue upstream
@bennahugo , I think this is a Stimela issue. If true, could you discuss in Stimela github, such that we have a clear separation (not that @SpheMakh would not listen at both sites)?
@gigjozsa did you follow these instructions https://github.com/ska-sa/meerkathi/issues/850#issuecomment-603038752 ? If you did it should work
@gigjozsa did you follow these instructions #850 (comment) ? If you did it should work
Ignore this, I confused this with a different issue.
The issue here is not cubical or stimela, it's that you machine uses an outdated version of singularity. If you upgrade to at least version 2.6.0-dist the error will go away because it's been fixed in that version of singularity.
The container fails before cubical is invoked, so this either a problem with the singularity installation or stimela.
OS Description: Ubuntu 16.04.5 LTS Release: 16.04 Codename: xenial
Container tech Singularity 2.4-dist