keflavich opened this issue 2 years ago
Pipeline crashed with a large core dump:
2022-10-23 07:44:32 INFO: Executing tclean(vis=['uid___A002_Xfe90b7_X2ab_target.ms', 'uid___A002_Xfe90b7_X7016_target.ms'], field='Sgr_A_star',
spw=['25:85.9649139075~85.9675994544GHz;85.9697967200~85.9715057044GHz;85.9788299231~85.9812713294GHz;86.0105682044~86.0117889075GHz;86.0335174231~86.0342498450GHz;86.0850310950~86.0864959387GHz;86.1323943762~86.1338592200GHz;86.1348357825~86.1389861731GHz;86.1526580481~86.1543670325GHz;86.1773162512~86.1780486731GHz;86.2808318762~86.2832732825GHz;86.3865447669~86.3877654700GHz;86.3943572669~86.3963103919GHz;86.4107146887~86.4143767981GHz',
'25:85.9649233102~85.9676088570GHz;85.9698061227~85.9715151070GHz;85.9788393258~85.9812807320GHz;86.0105776070~86.0117983102GHz;86.0335268258~86.0342592477GHz;86.0850404977~86.0865053414GHz;86.1324037789~86.1338686227GHz;86.1348451852~86.1389955758GHz;86.1526674508~86.1543764352GHz;86.1773256539~86.1780580758GHz;86.2808412789~86.2832826852GHz;86.3865541695~86.3877748727GHz;86.3943666695~86.3963197945GHz;86.4107240914~86.4143862008GHz'],
antenna=['0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42&',
'0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43&'],
scan=['6,9,12,15,18', '6,9,12,15,18'], intent='OBSERVE_TARGET#ON_SOURCE', datacolumn='data',
imagename='uid___A001_X15a0_Xac.s8_0.Sgr_A_star_sci.spw25.mfs.I.iter0', imsize=[1600, 1536], cell=['0.28arcsec'],
phasecenter='ICRS 17:46:22.6098 -028.42.03.206', stokes='I', specmode='mfs', nchan=-1, outframe='LSRK',
perchanweightdensity=False, gridder='mosaic', mosweight=True, usepointing=False, pblimit=0.2, deconvolver='hogbom',
restoration=False, restoringbeam='common', pbcor=False, weighting='briggs', robust=0.0, npixels=0, niter=0,
threshold='0.0mJy', nsigma=0.0, interactive=0, usemask='auto-multithresh', sidelobethreshold=2.0, noisethreshold=4.25,
lownoisethreshold=1.5, negativethreshold=0.0, minbeamfrac=0.3, growiterations=75, dogrowprune=True,
minpercentchange=1.0, fastnoise=False, savemodel='none', parallel=True)
*** Error in `/blue/adamginsburg/adamginsburg/casa/casa-6.4.3-2-pipeline-2021.3.0.17/lib/py/bin/python3': double free or corruption (!prev): 0x0000000023030f20 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x816b9)[0x2aed22a866b9]
/blue/adamginsburg/adamginsburg/casa/casa-6.4.3-2-pipeline-2021.3.0.17/lib/py/lib/python3.6/site-packages/casatools/__casac__/lib/libcasatools.cpython-36m-x86_64-linux-gnu.so(_ZN9casa6core9LogOriginD1Ev+0x144)[0x2aed3e543dd4]
/blue/adamginsburg/adamginsburg/casa/casa-6.4.3-2-pipeline-2021.3.0.17/lib/py/lib/python3.6/site-packages/casatools/__casac__/lib/libcasatools.cpython-36m-x86_64-linux-gnu.so(_ZN9casa6core10LogMessageD1Ev+0x11)[0x2aed3e53a121]
/blue/adamginsburg/adamginsburg/casa/casa-6.4.3-2-pipeline-2021.3.0.17/lib/py/lib/python3.6/site-packages/casatools/__casac__/_logsink.cpython-36m-x86_64-linux-gnu.so(_ZN5casac7logsink11postLocallyERKSsS2_S2_+0x12c)[0x2aed4c911dac]
/blue/adamginsburg/adamginsburg/casa/casa-6.4.3-2-pipeline-2021.3.0.17/lib/py/lib/python3.6/site-packages/casatools/__casac__/_logsink.cpython-36m-x86_64-linux-gnu.so(+0x1344d)[0x2aed4c91a44d]
/blue/adamginsburg/adamginsburg/casa/casa-6.4.3-2-pipeline-2021.3.0.17/lib/py/bin/python3(PyCFunction_Call+0xc6)[0x4f35e6]
/blue/adamginsburg/adamginsburg/casa/casa-6.4.3-2-pipeline-2021.3.0.17/lib/py/bin/python3(_PyEval_EvalFrameDefault+0x59df)[0x54a17f]
/blue/adamginsburg/adamginsburg/casa/casa-6.4.3-2-pipeline-2021.3.0.17/lib/py/bin/python3[0x543cb7]
...
[c0712a-s28:48137] [26] /blue/adamginsburg/adamginsburg/casa/casa-6.4.3-2-pipeline-2021.3.0.17/lib/py/bin/python3[0x543cb7]
[c0712a-s28:48137] [27] /blue/adamginsburg/adamginsburg/casa/casa-6.4.3-2-pipeline-2021.3.0.17/lib/py/bin/python3[0x54d6eb]
[c0712a-s28:48137] [28] /blue/adamginsburg/adamginsburg/casa/casa-6.4.3-2-pipeline-2021.3.0.17/lib/py/bin/python3[0x54c61c]
[c0712a-s28:48137] [29] /blue/adamginsburg/adamginsburg/casa/casa-6.4.3-2-pipeline-2021.3.0.17/lib/py/bin/python3(_PyEval_EvalFrameDefault+0x2ca)[0x544a6a]
[c0712a-s28:48137] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 48137 on node c0712a-s28 exited on signal 6 (Aborted).
--------------------------------------------------------------------------
Overall this looks pretty good.
Assuming the PL can run without crashing, then this should be an easy pass once SPW 33 has been imaged.
The script generator for this one is looking for MSes that don't exist:
Sgr_A_st_c_03_TM1 tclean_cont_pars aggregate c TM1: ERROR: Files not found: ['uid___A002_Xfe90b7_X2abs.ms', 'uid___A002_Xfe90b7_X7016s.ms']
Sgr_A_st_c_03_TM1 tclean_cont_pars aggregate_high c TM1: ERROR: Files not found: ['uid___A002_Xfe90b7_X2abs.ms', 'uid___A002_Xfe90b7_X7016s.ms']
Sgr_A_st_c_03_TM1 tclean_cube_pars spw25 c TM1: ERROR: Files not found: ['uid___A002_Xfe90b7_X2abs_line.ms', 'uid___A002_Xfe90b7_X7016s_line.ms']
Sgr_A_st_c_03_TM1 tclean_cube_pars spw27 c TM1: ERROR: Files not found: ['uid___A002_Xfe90b7_X2abs_line.ms', 'uid___A002_Xfe90b7_X7016s_line.ms']
Sgr_A_st_c_03_TM1 tclean_cube_pars spw29 c TM1: ERROR: Files not found: ['uid___A002_Xfe90b7_X2abs_line.ms', 'uid___A002_Xfe90b7_X7016s_line.ms']
Sgr_A_st_c_03_TM1 tclean_cube_pars spw31 c TM1: ERROR: Files not found: ['uid___A002_Xfe90b7_X2abs_line.ms', 'uid___A002_Xfe90b7_X7016s_line.ms']
Sgr_A_st_c_03_TM1 tclean_cube_pars spw35 c TM1: ERROR: Files not found: ['uid___A002_Xfe90b7_X2abs_line.ms', 'uid___A002_Xfe90b7_X7016s_line.ms']
Sgr_A_st_c_03_TM1 tclean_cube_pars spw33 c TM1: ERROR: Files not found: ['uid___A002_Xfe90b7_X2abs_line.ms', 'uid___A002_Xfe90b7_X7016s_line.ms']
This is directly from the ALMA pipeline. Anyone know why the pipeline is using different filenames for this field?
"uid___A002_Xfe90b7_X2ab_targets_line.ms",
"uid___A002_Xfe90b7_X7016_targets_line.ms"
I guess the solution is to override those filenames?
These are the files that do exist:
uid___A002_Xfe90b7_X2ab.ms
uid___A002_Xfe90b7_X2ab_target.ms
uid___A002_Xfe90b7_X7016.ms
uid___A002_Xfe90b7_X7016_target.ms
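One low-tech workaround would be to translate the names the script generator expects into the files that actually exist. This is only a sketch: the mapping and the helper name are assumptions based on the listings in this thread, not actual pipeline or ACES code.

```python
# Hypothetical workaround sketch: map the MS names the script generator
# asks for onto the files that actually exist on disk. The mapping is an
# assumption based on the file listings in this thread.
import os

NAME_OVERRIDES = {
    "uid___A002_Xfe90b7_X2abs.ms": "uid___A002_Xfe90b7_X2ab_target.ms",
    "uid___A002_Xfe90b7_X7016s.ms": "uid___A002_Xfe90b7_X7016_target.ms",
}

def resolve_ms(name, overrides=NAME_OVERRIDES):
    """Return an existing MS name, applying overrides for renamed files."""
    if os.path.exists(name):
        return name  # the expected name is already present
    return overrides.get(name, name)
```

Anything not covered by the override dict just passes through unchanged, so the workaround is harmless for datasets that follow the expected convention.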
@keflavich from the PL guide:
After this step (uvcontsub), the original continuum + line emission is contained in the DATA column of the input MS called *_targets.ms, while the continuum-subtracted data are written to the DATA column of the new *_targets_line.ms.
So if you're using the latest PL version, then we will need to use the *_line.ms files for the cubes. But if you're using the previous PL version, then just overriding the names should hopefully work?
My understanding is that this new naming convention is in preparation for self-cal eventually being built into the PL. This way, they have separate MSes for cont-sub and non-cont-sub data, which leaves the CORRECTED column free to store self-calibrated data. So this change does serve a purpose, but for our data this inconsistency between cycles is a bit annoying.
This is odd; I haven't changed the pipeline version.
We can work around the name change easily enough if this really is just a name change. The problem is that the _line.ms files do not exist.
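If we do end up handling the rename on our side, a minimal fallback sketch could try the new-convention name first and fall back to the old one. The helper name and suffix list below are assumptions for illustration, not ACES code.

```python
# Sketch (assumed helper, not pipeline code): prefer the new-convention
# continuum-subtracted MS name, falling back to the older convention.
import os

def pick_line_ms(base, candidates=("_targets_line.ms", "_target.ms")):
    """Return the first existing MS for `base`, trying the newest naming first."""
    for suffix in candidates:
        path = base + suffix
        if os.path.exists(path):
            return path
    raise FileNotFoundError(f"no MS found for {base}")
```

This makes the missing-`_line.ms` case explicit: instead of tclean failing later, the script would fail fast with a clear message.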
These data look fine on disk now; the problems with the MS files are apparently resolved.
There are some divergence issues in the CS cube.
Thanks @keflavich. I can take on the recleaning of SPW 33 since I'm assigned to it. I'll set it running soon.
spw33+35 continuum need reimaging
@keflavich have you triggered the 33+35 re-imaging yet? If not, SPW 33 could do with a higher cyclefactor to avoid the aforementioned divergence. I started running this a while back on my end, but it looks like it failed for some reason. If you've already started re-imaging with the default cyclefactor value, then I can still re-do the SPW 33 imaging on my end.
Yes, it completed successfully.
Note that the reimaging was continuum-only; we still need to redo spw33 with higher cyclefactor.
Ah yeah, sorry I misinterpreted the 'spw33+35' in these cases. Thanks for clarifying.
@keflavich if I update the cleaning parameters, would it be possible to trigger the SPW 33 recleaning on hipergator? It's difficult for me to work on this at the moment because of the ongoing impact of the cyberattack.
Yes, just push the updated parameters.
@keflavich change added to PR #367. Once merged, please move the old SPW 33 data and re-run the imaging for this SPW.
Relevant directory is /rawdata/2021.1.00172.L/science_goal.uid___A001_X1590_X30a8/group.uid___A001_X1590_X30a9/member.uid___A001_X15a0_Xac/calibrated/working/
Thanks!
SPW 33 still diverged with cyclefactor=2.5 (~ chans 630 - 637). It's not as severe as previously, but still poor. @keflavich I'll bump this up in another PR and let you know so we can re-run.
@keflavich cyclefactor increased in PR #376. Once merged, could you please re-run for SPW 33? Again, the relevant directory is /rawdata/2021.1.00172.L/science_goal.uid___A001_X1590_X30a8/group.uid___A001_X1590_X30a9/member.uid___A001_X15a0_Xac/calibrated/working/
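For illustration, a per-SPW cyclefactor override like the one being PRed might be expressed roughly as below. The dict layout and the value 3.0 are assumptions for the sketch, not the actual ACES parameter files.

```python
# Illustrative sketch of per-SPW tclean parameter overrides. The structure
# and the cyclefactor value are assumptions, not the real ACES config.
tclean_defaults = {"cyclefactor": 1.0, "niter": 100000}

spw_overrides = {
    # assumed bump after divergence persisted at cyclefactor=2.5
    "spw33": {"cyclefactor": 3.0},
}

def tclean_pars(spw):
    """Merge the defaults with any SPW-specific overrides."""
    pars = dict(tclean_defaults)
    pars.update(spw_overrides.get(spw, {}))
    return pars
```

A higher cyclefactor triggers major cycles sooner, which tends to stabilize cleaning of bright or extended emission at the cost of more gridding passes.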
Rerun triggered. Old files in 20230815_spw33
@keflavich not seeing any new files for this one either?
Had one subset fail:
7312994_4 c_TM1_spw33_cube_arr astronomy-dept astronomy-dept-b FAILED 5097 128G 16 00:05:52 00:00:22 4-00:00:00 c25a-s[17-18]
Looks like manual intervention is needed 😦
$ less *7312994_4*
Log file is casa_log_line_c_TM1_spw33_cube_7313443_4_2023-08-22_18_18_27.log
Using configuration file ~/.casa/config.py
Using matplotlib backend: TkAgg
CASA 6.4.3.2 -- Common Astronomy Software Applications [6.4.3.2]
Log file name is casa_log_line_c_TM1_spw33_cube_7313443_4_2023-08-22_18_18_27.log
Traceback (most recent call last):
File "/blue/adamginsburg/adamginsburg/casa/casa-6.4.3-2-pipeline-2021.3.0.17/lib/py/lib/python3.6/site-packages/casashell/private/init_system.py", line 238, in __evprop__
exec(stmt)
File "<string>", line 1, in <module>
File "/blue/adamginsburg/adamginsburg/casa/casa-6.4.3-2-pipeline-2021.3.0.17/lib/py/lib/python3.6/site-packages/casashell/private/init_system.py", line 175, in execfile
newglob = run_path( filename, init_globals=globals )
File "/blue/adamginsburg/adamginsburg/casa/casa-6.4.3-2-pipeline-2021.3.0.17/lib/py/lib/python3.6/runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "/blue/adamginsburg/adamginsburg/casa/casa-6.4.3-2-pipeline-2021.3.0.17/lib/py/lib/python3.6/runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "/blue/adamginsburg/adamginsburg/casa/casa-6.4.3-2-pipeline-2021.3.0.17/lib/py/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/blue/adamginsburg/adamginsburg/ACES/workdir/c_spw33_cube_TM1_A001_X15a0_Xac/uid___A001_X15a0_Xac.s38_0.Sgr_A_star_sci.spw33.cube.I.iter1_parallel_script.py", line 48, in <module>
raise ValueError(f"{tclean_kwargs['imagename']}.residual exists. Current state unclear.")
ValueError: uid___A001_X15a0_Xac.s38_0.Sgr_A_star_sci.spw33.cube.I.iter1.0512.128.residual exists. Current state unclear.
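The guard that raises this ValueError, and the manual cleanup it implies, can be sketched roughly as follows. Function names are illustrative; the real check lives in the generated parallel script quoted in the traceback.

```python
# Sketch of a stale-products guard like the one in the traceback above,
# plus a cleanup helper. Names are illustrative, not the actual script.
import glob
import os
import shutil

def check_clean_state(imagename):
    """Refuse to restart tclean on top of partial products of unknown state."""
    if os.path.exists(imagename + ".residual"):
        raise ValueError(f"{imagename}.residual exists. Current state unclear.")

def clear_products(imagename):
    """Remove partial tclean products (directories) so a fresh run can start."""
    for ext in (".residual", ".model", ".psf", ".pb", ".sumwt", ".image"):
        # match chunked/multi-term variants, e.g. img.0512.128.residual
        for path in glob.glob(imagename + "*" + ext + "*"):
            shutil.rmtree(path, ignore_errors=True)
```

Running `clear_products` before resubmitting is the "manual intervention": it guarantees the restarted tclean does not inherit a half-written residual.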
HPG clean of SPW 33 has finished. Downloading now to inspect.
SPW 33 looks good now, setting this one to done.
It looks like the reprocessed SPW 33 has multiple beams and thus a varying resolution across the cube.
What should we do about this?
We can just re-convolve the model with a single beam - we've done this before. But I'll have to dig up the code and/or rewrite it; might be worth searching other issues for bread crumbs pointing to the solution.
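As a toy illustration of the idea, under the simplifying assumption of circular Gaussian beams, the common beam is just the largest per-channel FWHM, and the smoothing kernel needed to bring each channel to it follows from Gaussian convolution (FWHMs add in quadrature). Real per-channel beams are elliptical, so in practice a package like radio-beam/spectral-cube would compute the common beam; this sketch only shows the principle.

```python
# Toy model of common-beam convolution, assuming circular Gaussian beams.
# Real cubes have elliptical beams; use radio-beam's common_beam for those.
import math

def common_circular_beam(fwhms_arcsec):
    """Smallest circular beam that contains every per-channel circular beam."""
    return max(fwhms_arcsec)

def smoothing_kernel(beam_fwhm, target_fwhm):
    """FWHM of the Gaussian kernel that convolves beam_fwhm up to target_fwhm."""
    if target_fwhm < beam_fwhm:
        raise ValueError("cannot deconvolve to a smaller beam")
    # Gaussian FWHMs add in quadrature under convolution
    return math.sqrt(target_fwhm**2 - beam_fwhm**2)
```

Each channel then gets convolved with its own kernel so the whole cube ends up at the single common resolution.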
QA for latest continuum selection/imaging:
Contamination in SPWs 25 & 27 (cubes left, spw25_27 cont right)
SPW 25
SPW 27
New continuum images look marginally improved for low freq SPWs, but overall emission structure looks similar, particularly in high freq SPWs.
The two locations that I highlighted in the previous comment are noticeably fainter at least.
Left: old images Right: new images
Sgr_A_st_c_03_TM1 uid://A001/X15a0/Xac
[x] Observations completed?
[x] Delivered?
[x] Downloaded? (specify where)
[x] Weblog unpacked
[ ] Weblog Quality Assessment?
Extra Weblog Sgr_A_st_c_03_TM1_0 -> pipeline-20221007T100340, Extra Weblog Sgr_A_st_c_03_TM1_1 -> pipeline-20221110T050324
[x] Imaging: Continuum
[x] Imaging: Lines
Product Links:
Reprocessed Product Links: