caracal-pipeline / caracal

Containerized Automated Radio Astronomy Calibration (CARACal) pipeline
GNU General Public License v2.0

Error at the end of A&P calibration of selfcal (and at the beginning of DDcal) #1528

Closed MarcoBalboni closed 1 year ago

MarcoBalboni commented 1 year ago

Hi,

I am performing amplitude and phase self-calibration, and all goes well until the very end, where the log reports this error (a few more follow):

# INFO:    Converting SIF file to temporary sandbox...
# INFO:    Cleaning up image...
#   File "/stimela_mount/code/run.py", line 62, in <module>
#     subprocess.check_call(shlex.split(_runc))
#   File "/usr/lib/python2.7/subprocess.py", line 190, in check_call
#     raise CalledProcessError(retcode, cmd)
# subprocess.CalledProcessError: Command '['tigger-convert', '--append', '/stimela_mount/output/continuum/image_2/A1300_selfcal_A1300_2-pybdsm.lsm.html', '--force', '--append-type', 'auto', '--rename', '/stimela_mount/output/continuum/image_1/A1300_selfcal_A1300_1-pybdsm.lsm.html', '/stimela_mount/output/continuum/image_2/A1300_selfcal_A1300_final-pybdsm.lsm.html']' returned non-zero exit status 1
2023-07-13 00:00:37 CARACal.Stimela.create-final_lsm-1-2 ERROR: cd /local/work/lofaruser4/A1300/.stimela_workdir-16890584258721247 && singularity run --userns --workdir /local/work/lofaruser4/A1300/.stimela_workdir-16890584258721247 --containall returns error code 1

the final lines report

2023-07-13 00:00:37 CARACal ERROR: stimela.exceptions.PipelineException: Job 'create-final_lsm-1-2:: Combined models' failed: cd /local/work/lofaruser4/A1300/.stimela_workdir-16890584258721247 && singularity run --userns --workdir /local/work/lofaruser4/A1300/.stimela_workdir-16890584258721247 --containall returns error code 1
2023-07-13 00:00:37 CARACal INFO: exiting with error code 1

However, all the images are correctly produced, as are the .ms files. I had also run a few cycles of phase-only selfcal beforehand, and those completed without errors. A very similar error occurs in the early steps when running ddcal on the data produced by these selfcal cycles.

I also attached the full log file.

Thanks,
Marco

log-caracal.txt

Athanaseus commented 1 year ago

Hi @MarcoBalboni, sorry for the delayed response.

The problem is happening here:

restore_model:
  enable:                       True
  model:                        1+2
  clean_model:                  3

The currently selected calibration mode is vis_only. If you need to combine models using this option, the mode must be pybdsm_only or pybdsm_vis. The idea is that this enables source finding in your image1 and image2 (the latter being the one you clean deeper) and combines the resulting models.

In the current case, the image2 (A1300_selfcal_A1300_2) that you get is the result of the amplitude and phase self-calibration that you performed. So just disable restore_model and it should complete successfully.
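As a rough sketch of the two consistent combinations described above (not a verified configuration): the failing `tigger-convert` call in the log appends `*-pybdsm.lsm.html` source catalogues, which presumably only exist when a pybdsm mode is selected. The placement of `cal_model_mode` directly under the `selfcal` worker is an assumption based on the name used in this thread; check the selfcal documentation for the exact key and nesting in your CARACal version.

```yaml
# Sketch only: key placement under `selfcal` is assumed, not confirmed by this thread.
selfcal:
  # Option A: keep visibility-only models and skip model combination.
  cal_model_mode: vis_only
  restore_model:
    enable: False

  # Option B (alternative): enable source finding so that models 1 and 2 can be combined.
  # cal_model_mode: pybdsm_vis
  # restore_model:
  #   enable: True
  #   model: 1+2
  #   clean_model: 3
```

Only one of the two options should be active at a time; Option B is left commented out so the fragment stays valid YAML.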

Hope this helps

MarcoBalboni commented 1 year ago

Hi @Athanaseus, thank you for the suggestion. Unfortunately, if I disable restore_model another error occurs, this time involving SoFiA:

stimela.exceptions.PipelineException: Job 'make-sofia_mask-field0-iter0:: Make SoFiA mask' failed: cd /local/work/lofaruser4/A1300/.stimela_workdir-16902486927050576 && singularity returns error code 1
2023-07-25 07:28:27 CARACal INFO: exiting with error code 1

What I am trying now is to re-run the whole thing specifying cal_model_mode = 'pybdsm_vis'.

Do you think it will be ok?

Thank you.

(The log file is attached below.)

log-caracal.txt

Athanaseus commented 1 year ago

Looking at the previous logs, the job appears to have still been at the image clean-up step when the error occurred. Please check which SoFiA products are available (or look at the SoFiA-specific log in the logs directory). Does the error persist when you re-run?

# 
# --- SoFiA 1.3.2: Removing unreliable sources ---------------------------------
#     Elapsed time: 00:04:25.99 h
# 
# Reloading data cube for parameterisation
# Loading cube /stimela_mount/input/continuum/image_0/A1300_selfcal_A1300_0-MFS-image.fits
# The data cube has been loaded.
# 
# --- SoFiA 1.3.2: Writing mask cube -------------------------------------------
#     Elapsed time: 00:04:26.65 h
# 
# 
# --- SoFiA 1.3.2: Adding WCS position to catalogue ----------------------------
#     Elapsed time: 00:04:26.90 h
# 
# WCS coordinates added to catalogue.
# 
# --- SoFiA 1.3.2: Writing output catalogue ------------------------------------
#     Elapsed time: 00:04:29.52 h
# 
# 
# --- SoFiA 1.3.2: Pipeline finished -------------------------------------------
#     Elapsed time: 00:04:29.52 h
# 
# INFO:    Cleaning up image...

The log also indicates that you are still using vis_only mode. Yes, you can also try running with pybdsm_vis. Check the documentation here: https://caracal.readthedocs.io/en/latest/manual/workers/selfcal/index.html#cal-model-mode

MarcoBalboni commented 1 year ago

Yes, when I re-ran with restore_model disabled I left vis_only on purpose. However, even when I run it with pybdsm_vis the latter error still occurs. Where can I find the SoFiA products? The SoFiA-specific log is attached below.

log-selfcal__ap-make-sofia_mask-field0-iter0-20230725-112932.txt

Thank you for your help.

Marco

Athanaseus commented 1 year ago

Hi @MarcoBalboni, the error is ambiguous and does not come directly from SoFiA, and I'm having trouble reproducing it on my end. Can you update stimela and see if it helps (pip install stimela==1.7.6)? The SoFiA products (in this case, a mask) should be in the output/masking directory.

MarcoBalboni commented 1 year ago

Hi, the new error was probably caused by Singularity being badly initialized, or something similar. I am now re-running the whole thing using the initial fix you provided (with restore_model disabled). I will keep you updated.

Thank you.

Marco

MarcoBalboni commented 1 year ago

Hi @Athanaseus, disabling restore_model worked: the selfcal finished without problems, and the ddcal also seems to work properly. Thank you again for your help.

Marco