Open keflavich opened 2 years ago
All SPWs binned x2 (#179). Otherwise, the data look good.
In the line mosaics, it looks like these data are in a broken state. All of the images look blank. @bazarsen could you have another look at these? I'm going to delete the bad ones.
I also need to delete the entire calibrated directory and re-run the pipeline
The pipeline failures included:
2022-07-10 04:39:15 INFO MSTransformManager::createOutputMSStructure Create output MS structure
2022-07-10 04:39:16 SEVERE mstransform::::casa Task mstransform raised an exception of class RuntimeError with the following message: Desired column (CORRECTED_DATA) not found in the input MS (/orange/adamginsburg/ACES/rawdata/2021.1.00172.L/science_goal.uid___A001_X1590_X30a8/group.uid___A001_X1590_X30a9/member.uid___A001_X15a0_Xc4/calibrated/working/uid___A002_Xf8f6a9_X9e67.ms).
2022-07-10 04:39:16 INFO mstransform::::casa Task mstransform complete. Start time: 2022-07-10 00:39:14.894255 End time: 2022-07-10 00:39:15.828004
2022-07-10 04:39:16 INFO mstransform::::casa ##### End Task: mstransform #####
2022-07-10 04:39:16 INFO mstransform::::casa ##########################################
2022-07-09 12:33:23 INFO flagmanager::::casa ##########################################
2022-07-09 12:33:23 INFO flagmanager::::casa ##### Begin Task: flagmanager #####
2022-07-09 12:33:23 INFO flagmanager::::casa flagmanager( vis='uid___A002_Xf8f6a9_X9e67.ms', mode='restore', versionname='Pipeline_Final', oldname='', comment='', merge='replace' )
2022-07-09 12:33:23 INFO flagmanager::AgentFlagger::open Table type is Measurement Set
2022-07-09 12:33:23 INFO flagmanager::::casa Restore flagversions Pipeline_Final
2022-07-09 12:33:23 SEVERE AgentFlagger::restoreFlagVersion (file src/code/flagging/Flagging/AgentFlagger.cc, line 1001) Could not restore Flag Version : ScalarColumn::putColumn(Vector&): Table conformance error (#rows mismatch)
2022-07-09 12:33:24 SEVERE agentflagger:: (file src/tools/agentflagger/agentflagger_cmpt.cc, line 35) Exception Reported: ScalarColumn::putColumn(Vector&): Table conformance error (#rows mismatch)
2022-07-09 12:33:24 SEVERE flagmanager::::casa Task flagmanager raised an exception of class RuntimeError with the following message: ScalarColumn::putColumn(Vector&): Table conformance error (#rows mismatch)
2022-07-09 12:33:24 INFO flagmanager::::casa Task flagmanager complete. Start time: 2022-07-09 08:33:22.556263 End time: 2022-07-09 08:33:23.744330
2022-07-09 12:33:24 INFO flagmanager::::casa ##### End Task: flagmanager #####
2022-07-09 12:33:24 INFO flagmanager::::casa ##########################################
which is an identical failure mode to #132
Nov 14 pipeline run:
2022-11-11 05:37:39 INFO mstransform::::casa ##########################################
2022-11-11 05:37:39 INFO mstransform::::casa ##### Begin Task: mstransform #####
2022-11-11 05:37:39 INFO mstransform::::casa mstransform( vis='uid___A002_Xf8f6a9_X9e67.ms', outputvis='uid___A002_Xf8f6a9_X9e67_target.ms', createmms=False, separationaxis='auto', numsubms='auto', tileshape=[0], field='3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143', spw='25,27,29,31,33,35', scan='', antenna='', correlation='', timerange='', intent='OBSERVE_TARGET#ON_SOURCE', array='', uvrange='', observation='', feed='', datacolumn='corrected', realmodelcol=False, keepflags=True, usewtspectrum=False, combinespws=False, chanaverage=False, chanbin=1, hanning=False, regridms=False, mode='channel', nchan=-1, start=0, width=1, nspw=1, interpolation='linear', phasecenter='', restfreq='', outframe='', veltype='radio', preaverage=False, timeaverage=False, timebin='0s', timespan='', maxuvwdistance=0.0, docallib=False, callib='', douvcontsub=False, fitspw='', fitorder=0, want_cont=False, denoising_lib=True, nthreads=1, niter=1, disableparallel=False, ddistart=-1, taql='', monolithic_processing=False, reindex=False )
2022-11-11 05:37:39 INFO MSTransformManager::parseMsSpecParams Input file name is uid___A002_Xf8f6a9_X9e67.ms
2022-11-11 05:37:39 INFO MSTransformManager::parseMsSpecParams Data column is CORRECTED
2022-11-11 05:37:39 INFO MSTransformManager::parseMsSpecParams Output file name is uid___A002_Xf8f6a9_X9e67_target.ms
2022-11-11 05:37:39 INFO MSTransformManager::parseMsSpecParams Re-index is disabled
2022-11-11 05:37:39 INFO MSTransformManager::parseMsSpecParams Tile shape is [0]
2022-11-11 05:37:39 INFO MSTransformManager::parseDataSelParams field selection is 3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143
2022-11-11 05:37:39 INFO MSTransformManager::parseDataSelParams spw selection is 25,27,29,31,33,35
2022-11-11 05:37:39 INFO MSTransformManager::parseDataSelParams scan intent selection is OBSERVE_TARGET#ON_SOURCE
2022-11-11 05:37:39 WARN MSTransformManager::checkDataColumnsToFill CORRECTED_DATA column requested but not available in input MS
2022-11-11 05:37:39 INFO MSTransformManager::initDataSelectionParams Selected SPWs Ids are Axis Lengths: [6, 4] (NB: Matrix in Row/Column order)
2022-11-11 05:37:39 INFO MSTransformManager::initDataSelectionParams+ [25, 0, 1919, 1
2022-11-11 05:37:39 INFO MSTransformManager::initDataSelectionParams+ 27, 0, 1919, 1
2022-11-11 05:37:39 INFO MSTransformManager::initDataSelectionParams+ 29, 0, 1919, 1
2022-11-11 05:37:39 INFO MSTransformManager::initDataSelectionParams+ 31, 0, 1919, 1
2022-11-11 05:37:39 INFO MSTransformManager::initDataSelectionParams+ 33, 0, 3839, 1
2022-11-11 05:37:39 INFO MSTransformManager::initDataSelectionParams+ 35, 0, 3839, 1]
2022-11-11 05:37:39 INFO MSTransformManager::open Select data
2022-11-11 05:37:39 INFO MSTransformManager::createOutputMSStructure Create output MS structure
2022-11-11 05:37:39 SEVERE mstransform::::casa Task mstransform raised an exception of class RuntimeError with the following message: Desired column (CORRECTED_DATA) not found in the input MS (/orange/adamginsburg/ACES/rawdata/2021.1.00172.L/science_goal.uid___A001_X1590_X30a8/group.uid___A001_X1590_X30a9/member.uid___A001_X15a0_Xc4/calibrated/working/uid___A002_Xf8f6a9_X9e67.ms).
2022-11-11 05:37:39 INFO mstransform::::casa Task mstransform complete. Start time: 2022-11-11 00:37:39.255101 End time: 2022-11-11 00:37:39.405362
2022-11-11 05:37:39 INFO mstransform::::casa ##### End Task: mstransform #####
2022-11-11 05:37:39 INFO mstransform::::casa ##########################################
so it looks like this entire MOUS is a failure. I'm going to delete everything and try to get it to go fresh.
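The missing-column error above could be caught before `mstransform` is called by checking the MS directly. A minimal sketch, assuming `casatools` is importable in the pipeline environment; `ms_colnames`, `require_column`, and the `get_colnames` argument are hypothetical names for illustration and are not part of the ALMA pipeline:

```python
def ms_colnames(vis):
    """Return the main-table column names of a MeasurementSet (needs casatools)."""
    # Imported lazily so require_column() can be exercised without CASA installed.
    from casatools import table
    tb = table()
    tb.open(vis)
    try:
        return tb.colnames()
    finally:
        tb.close()


def require_column(vis, column="CORRECTED_DATA", get_colnames=ms_colnames):
    """Raise early, with a clear message, if `column` is missing from `vis`."""
    cols = get_colnames(vis)
    if column not in cols:
        raise RuntimeError(f"{vis} has no {column} column; found: {sorted(cols)}")
    return True
```

Calling `require_column('uid___A002_Xf8f6a9_X9e67.ms')` right after the calibration-restore step would flag a failed restore at its source rather than at the later split.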
Nov 25 pipeline run ends the same way:
2022-11-21 05:34:31 SEVERE mstransform::::casa Task mstransform raised an exception of class RuntimeError with the following message: Desired column (CORRECTED_DATA) not found in the input MS (/orange/adamginsburg/ACES/rawdata/2021.1.00172.L/science_goal.uid___A001_X1590_X30a8/group.uid___A001_X1590_X30a9/member.uid___A001_X15a0_Xb2/calibrated/working/uid___A002_Xfe3986_X9083.ms).
Interestingly, this happens on line 1436, which doesn't crash the pipeline - it just keeps going. I think the pipeline should fail at this point.
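A possible stopgap until the pipeline itself fails on this error is to scan the CASA log after the run and abort on the first SEVERE entry. This is only a sketch: it assumes the log layout shown in this thread (date, time, level, origin, message), and `severe_lines` and `fail_on_severe` are hypothetical helper names:

```python
def severe_lines(log_lines):
    """Yield (line_number, text) for every CASA log line whose level is SEVERE."""
    for lineno, line in enumerate(log_lines, start=1):
        fields = line.split(None, 3)  # date, time, level, remainder
        if len(fields) >= 3 and fields[2] == "SEVERE":
            yield lineno, line.rstrip("\n")


def fail_on_severe(logfile):
    """Raise RuntimeError at the first SEVERE entry found in `logfile`."""
    with open(logfile, errors="replace") as fh:
        for lineno, line in severe_lines(fh):
            raise RuntimeError(f"{logfile}:{lineno}: {line}")
```

Because `severe_lines` reads lazily, this works on the >100 MB logs mentioned here without loading them into memory.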
I need some expert help here - @d-l-walker @piposona @pyhsiehATalma, this looks like a pipeline problem to me. Can any of you successfully restore the data? If so, what are you doing differently?
The first 10,000 lines of the log file are here: (I didn't post the full log because it's >100 MB) casa_log_mpi_pipeline_imaging_member.uid___A001_X15a0_Xb2_52087485_2022-11-21_00_02_47.first10000lines.log
Data check:
(python39) login4.ufhpc /orange/adamginsburg/ACES/data$ ls -lh *uid___A002_Xfed4ee_X1e3* *uid___A002_Xfee03e_X2787*
-rw-r--r-- 1 adamginsburg adamginsburg 60G Nov 15 22:40 2021.1.00172.L_uid___A002_Xfed4ee_X1e3.asdm.sdm.tar
-rw-r--r-- 1 adamginsburg adamginsburg 60G Nov 16 02:37 2021.1.00172.L_uid___A002_Xfee03e_X2787.asdm.sdm.tar
-rw-r--r-- 1 adamginsburg adamginsburg 118G Oct 26 08:19 corrupt_2021.1.00172.L_uid___A002_Xfed4ee_X1e3.asdm.sdm.tar
-rw-r--r-- 1 adamginsburg adamginsburg 65G Oct 26 11:06 corrupt_2021.1.00172.L_uid___A002_Xfee03e_X2787.asdm.sdm.tar
(python39) login4.ufhpc /orange/adamginsburg/ACES/data$ md5sum *uid___A002_Xfed4ee_X1e3* *uid___A002_Xfee03e_X2787*
d78bebe11bcde4d22ce3697ae9f5caf4 2021.1.00172.L_uid___A002_Xfed4ee_X1e3.asdm.sdm.tar
bef9fc5181ad37f94b96460b92ad9116 corrupt_2021.1.00172.L_uid___A002_Xfed4ee_X1e3.asdm.sdm.tar
1c7551e38e3378825187314b0438787f 2021.1.00172.L_uid___A002_Xfee03e_X2787.asdm.sdm.tar
Hi @keflavich -- I can try to take a look at this next week. There's a scheduled electrical shutdown at Manchester this weekend, so there's no point in me setting any jobs running now.
My first thought looking at the log file is that you're using CASA Version PIPELINE 6.4.3.2, whereas the delivered data were processed using CASA Version 6.2.1.7. Maybe try restoring the calibration using this pipeline version to rule out whether that could be an issue?
I can try that.
Could you (anyone) verify the MD5sums of the ASDM files by downloading a fresh copy?
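For anyone comparing checksums from a script rather than with `md5sum`, here is a minimal sketch that hashes the tarballs in chunks, so the ~60 GB files never need to fit in memory. The function names and the `expected` mapping are illustrative assumptions; the reference checksums would have to come from a trusted fresh download:

```python
import hashlib


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of `path`, reading `chunk_size` bytes at a time."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mismatches(expected):
    """Given {path: expected_md5_hex}, return the paths whose checksum differs."""
    return [path for path, want in expected.items() if md5sum(path) != want.lower()]
```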
Hi @keflavich, which ASDMs are you referring to? The MOUS (X15a0_Xb2) in the log file is Sgr_A_st_d_03_TM1, but this issue is for Sgr_A_st_g_03_TM1
(casa_log_mpi_pipeline_imaging_member.uid___A001_X15a0_Xb2_52087485_2022-11-21_00_02_47.first10000lines.log)
Sgr_A_st_g_03_TM1 globus ls $hpg:\~$hpg_pth"member.uid___A001_X15a0_Xc4/raw" uid___A002_Xf8f6a9_X113a4.asdm.sdm/ uid___A002_Xf8f6a9_X9e67.asdm.sdm/
Sgr_A_st_d_03_TM1 globus ls $hpg:\~$hpg_pth"member.uid___A001_X15a0_Xb2/raw" uid___A002_Xfe3986_X9083.asdm.sdm/ uid___A002_Xfe62c1_X1871.asdm.sdm/
Which SB do these two corrupted files belong to? 2021.1.00172.L_uid___A002_Xfed4ee_X1e3.asdm.sdm.tar 2021.1.00172.L_uid___A002_Xfee03e_X2787.asdm.sdm.tar
Removed calibrated/ directory.
All of these files were reimaged earlier in the month, but they still appear to be junk:
$ ls -lhrtd /orange/adamginsburg/ACES/data/2021.1.00172.L/science_goal.uid___A001_X1590_X30a8/group.uid___A001_X1590_X30a9/member.uid___A001_X15a0_Xc4/calibrated/working/uid___A001_X15a0_Xc4.s*_0.Sgr_A_star_sci.spw*.cube.I.iter1.image
drwxrwsr-x+ 4 adamginsburg adamginsburg 4.0K Jan 13 23:57 /orange/adamginsburg/ACES/data/2021.1.00172.L/science_goal.uid___A001_X1590_X30a8/group.uid___A001_X1590_X30a9/member.uid___A001_X15a0_Xc4/calibrated/working/uid___A001_X15a0_Xc4.s12_0.Sgr_A_star_sci.spw25.cube.I.iter1.image
drwxrwsr-x+ 4 adamginsburg adamginsburg 4.0K Jan 14 15:31 /orange/adamginsburg/ACES/data/2021.1.00172.L/science_goal.uid___A001_X1590_X30a8/group.uid___A001_X1590_X30a9/member.uid___A001_X15a0_Xc4/calibrated/working/uid___A001_X15a0_Xc4.s12_0.Sgr_A_star_sci.spw27.cube.I.iter1.image
drwxrwsr-x+ 4 adamginsburg adamginsburg 4.0K Jan 15 08:35 /orange/adamginsburg/ACES/data/2021.1.00172.L/science_goal.uid___A001_X1590_X30a8/group.uid___A001_X1590_X30a9/member.uid___A001_X15a0_Xc4/calibrated/working/uid___A001_X15a0_Xc4.s12_0.Sgr_A_star_sci.spw29.cube.I.iter1.image
drwxrwsr-x+ 4 adamginsburg adamginsburg 4.0K Jan 17 19:00 /orange/adamginsburg/ACES/data/2021.1.00172.L/science_goal.uid___A001_X1590_X30a8/group.uid___A001_X1590_X30a9/member.uid___A001_X15a0_Xc4/calibrated/working/uid___A001_X15a0_Xc4.s38_0.Sgr_A_star_sci.spw31.cube.I.iter1.image
drwxrwsr-x+ 4 adamginsburg adamginsburg 4.0K Jan 18 06:09 /orange/adamginsburg/ACES/data/2021.1.00172.L/science_goal.uid___A001_X1590_X30a8/group.uid___A001_X1590_X30a9/member.uid___A001_X15a0_Xc4/calibrated/working/uid___A001_X15a0_Xc4.s38_0.Sgr_A_star_sci.spw35.cube.I.iter1.image
drwxrwsr-x+ 4 adamginsburg adamginsburg 4.0K Jan 18 21:27 /orange/adamginsburg/ACES/data/2021.1.00172.L/science_goal.uid___A001_X1590_X30a8/group.uid___A001_X1590_X30a9/member.uid___A001_X15a0_Xc4/calibrated/working/uid___A001_X15a0_Xc4.s38_0.Sgr_A_star_sci.spw33.cube.I.iter1.image
EDIT: but these images were produced before the latest download of ASDMs:
$ ls -lhrtd *uid___A002_Xfed4ee_X1e3* *uid___A002_Xfee03e_X2787*
-rw-r--r-- 1 adamginsburg adamginsburg 118G Oct 26 08:19 corrupt_2021.1.00172.L_uid___A002_Xfed4ee_X1e3.asdm.sdm.tar
-rw-r--r-- 1 adamginsburg adamginsburg 65G Oct 26 11:06 corrupt_2021.1.00172.L_uid___A002_Xfee03e_X2787.asdm.sdm.tar
-rw-r--r-- 1 adamginsburg adamginsburg 60G Nov 15 22:40 corrupt_Jan2023_2021.1.00172.L_uid___A002_Xfed4ee_X1e3.asdm.sdm.tar
-rw-r--r-- 1 adamginsburg adamginsburg 60G Nov 16 02:37 corrupt_Jan2023_2021.1.00172.L_uid___A002_Xfee03e_X2787.asdm.sdm.tar
-rw-r--r-- 1 adamginsburg adamginsburg 62G Jan 18 14:07 2021.1.00172.L_uid___A002_Xfed4ee_X1e3.asdm.sdm.tar
-rw-r--r-- 1 adamginsburg adamginsburg 61G Jan 18 15:36 2021.1.00172.L_uid___A002_Xfee03e_X2787.asdm.sdm.tar
$ md5sum *uid___A002_Xfed4ee_X1e3* *uid___A002_Xfee03e_X2787*
0b7379e1373119e2000a4f8cf1c4819b 2021.1.00172.L_uid___A002_Xfed4ee_X1e3.asdm.sdm.tar
bef9fc5181ad37f94b96460b92ad9116 corrupt_2021.1.00172.L_uid___A002_Xfed4ee_X1e3.asdm.sdm.tar
d78bebe11bcde4d22ce3697ae9f5caf4 corrupt_Jan2023_2021.1.00172.L_uid___A002_Xfed4ee_X1e3.asdm.sdm.tar
a43ea10b31b9d132d17f0562207db93a 2021.1.00172.L_uid___A002_Xfee03e_X2787.asdm.sdm.tar
ac82e935b587a3cfb868483b84ba16ef corrupt_2021.1.00172.L_uid___A002_Xfee03e_X2787.asdm.sdm.tar
1c7551e38e3378825187314b0438787f corrupt_Jan2023_2021.1.00172.L_uid___A002_Xfee03e_X2787.asdm.sdm.tar
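The timestamp comparison above (images older than the newest ASDM download) can be automated. A minimal sketch, assuming modification times are a fair proxy for when each product was made; `stale_products` is a hypothetical helper, and for CASA image directories the directory mtime only reflects changes to its top-level entries:

```python
import os


def stale_products(products, inputs):
    """Return the products whose mtime is older than the newest input's mtime."""
    newest_input = max(os.path.getmtime(p) for p in inputs)
    return [p for p in products if os.path.getmtime(p) < newest_input]
```

Passing the `*.cube.I.iter1.image` paths as `products` and the ASDM tarballs as `inputs` would list the cubes that predate the latest download.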
I re-extracted the tarballs. The pipeline failed, however, with:
2023-02-13 09:59:48 INFO applycal::::casa ##########################################
2023-02-13 09:59:48 INFO applycal::::casa ##### Begin Task: applycal #####
2023-02-13 09:59:48 INFO applycal::::casa applycal( vis='uid___A002_Xf8f6a9_X9e67.ms', field='3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143', spw='25,27,29,31,33,35', intent='OBSERVE_TARGET#ON_SOURCE', selectdata=True, timerange='', uvrange='', antenna='*&*', scan='', observation='', msselect='', docallib=False, callib='', gaintable=['uid___A002_Xf8f6a9_X9e67.ms.hif_uvcontfit.s5_1.Sgr_A_star.uvcont.tbl'], gainfield=[], interp=[], spwmap=[], calwt=[False], parang=False, applymode='calflag', flagbackup=True )
2023-02-13 09:59:48 INFO applycal::calibrater::open ****Using NEW VI2-driven calibrater tool****
2023-02-13 09:59:48 INFO applycal::calibrater::open Opening MS: uid___A002_Xf8f6a9_X9e67.ms for calibration.
2023-02-13 09:59:48 INFO applycal::VisSetUtil::addScrCols Adding CORRECTED_DATA column(s).
2023-02-13 10:00:18 INFO applycal:::: Process 40487: waiting for write-lock on file /orange/adamginsburg/ACES/rawdata/2021.1.00172.L/science_goal.uid___A001_X1590_X30a8/group.uid___A001_X1590_X30a9/member.uid___A001_X15a0_Xc4/calibrated/working/uid___A002_Xf8f6a9_X9e67.ms/table.lock
i.e., it locked itself out of running the pipeline. I think the only thing to do with this is delete it and start over.
Pipeline rerun triggered. Let's see if it gets past that point this time.
I think there were multiple clones of the pipeline running that managed to conflict and create that write lock. Not sure how that's even possible since the pipeline shouldn't be able to start if the calibrated/ directory exists, but race conditions happen.
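One way to rule out concurrent runs is to take an explicit lock before the pipeline touches the MOUS directory, rather than relying on the existence of calibrated/ as the guard. The sketch below uses an atomically created lock file; it assumes the filesystem honours `O_EXCL` (worth verifying on a networked filesystem), and the function names and lock-file name are hypothetical:

```python
import os


def acquire_run_lock(workdir, name=".pipeline.lock"):
    """Atomically create a lock file; return its path, or None if it already exists."""
    lock_path = os.path.join(workdir, name)
    try:
        # O_CREAT | O_EXCL fails if the file exists, so only one process can win.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return None
    with os.fdopen(fd, "w") as fh:
        fh.write(f"{os.getpid()}\n")  # record the owner to help debug stale locks
    return lock_path


def release_run_lock(lock_path):
    """Remove the lock file once the run has finished."""
    os.remove(lock_path)
```

A job that gets `None` back should exit immediately instead of starting a second pipeline on the same MS.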
Pipeline failed again at
2023-02-15 17:14:23 INFO: Restoring calibration state for uid___A002_Xf8f6a9_X9e67.ms from ../rawdata/uid___A002_Xf8f6a9_X9e67.ms.calapply.txt
2023-02-15 17:14:23 INFO: Importing calibration state from /scratch/local/57260167/tmp1yasll_n
2023-02-15 17:14:23 WARNING: Could not access uid___A002_Xf8f6a9_X9e67.ms.hif_uvcontfit.s5_1.Sgr_A_star.uvcont.tbl. Using heuristics to determine caltable type
which suggests a problem with the auxiliary tarball. I'm removing and re-downloading everything:
$ ls -lhrt *uid___A002_Xfed4ee_X1e3* *uid___A002_Xfee03e_X2787* *Xc4*
-rw-r--r-- 1 adamginsburg adamginsburg 146G Jun 23 2022 2021.1.00172.L_uid___A001_X15a0_Xc4_001_of_001.tar
-rw-r--r-- 1 adamginsburg adamginsburg 749M Jun 23 2022 2021.1.00172.L_uid___A001_X15a0_Xc4_auxiliary.tar
-rw-r--r-- 1 adamginsburg adamginsburg 3.5K Aug 23 05:54 member.uid___A001_X15a0_Xc4.README.txt
-rw-r--r-- 1 adamginsburg adamginsburg 3.7G Dec 17 02:14 2021.1.00172.L_uid___A002_Xfe90b7_Xc423.asdm.sdm.tar
-rw-r--r-- 1 adamginsburg adamginsburg 62G Jan 18 14:07 2021.1.00172.L_uid___A002_Xfed4ee_X1e3.asdm.sdm.tar
-rw-r--r-- 1 adamginsburg adamginsburg 61G Jan 18 15:36 2021.1.00172.L_uid___A002_Xfee03e_X2787.asdm.sdm.tar
ok. so. I downloaded everything again. Extracted it all again. Total fresh start of everything. This again:
2023-02-16 09:34:28 INFO applycal:::: Process 57713: waiting for write-lock on file /orange/adamginsburg/ACES/rawdata/2021.1.00172.L/science_goal.uid___A001_X1590_X30a8/group.uid___A001_X1590_X30a9/member.uid___A001_X15a0_Xc4/calibrated/working/uid___A002_Xf8f6a9_X9e67.ms/table.lock
???
Just to check file integrity:
$ ls -lhrt *uid___A002_Xfed4ee_X1e3* *uid___A002_Xfee03e_X2787* *Xc4*
-rw-r--r-- 1 adamginsburg adamginsburg 60G Feb 16 01:48 2021.1.00172.L_uid___A002_Xfed4ee_X1e3.asdm.sdm.tar
-rw-r--r-- 1 adamginsburg adamginsburg 60G Feb 16 04:03 2021.1.00172.L_uid___A002_Xfee03e_X2787.asdm.sdm.tar
-rw-r--r-- 1 adamginsburg adamginsburg 3.7G Feb 16 04:13 2021.1.00172.L_uid___A002_Xfe90b7_Xc423.asdm.sdm.tar
-rw-r--r-- 1 adamginsburg adamginsburg 3.5K Feb 16 04:14 member.uid___A001_X15a0_Xc4.README.txt
-rw-r--r-- 1 adamginsburg adamginsburg 749M Feb 16 10:39 2021.1.00172.L_uid___A001_X15a0_Xc4_auxiliary.tar
-rw-r--r-- 1 adamginsburg adamginsburg 149G Feb 16 10:41 2021.1.00172.L_uid___A001_X15a0_Xc4_001_of_001.tar
$ md5sum *uid___A002_Xfed4ee_X1e3* *uid___A002_Xfee03e_X2787* *Xc4*
7f699bcd2e75c1ed5737e75a720f98d0 2021.1.00172.L_uid___A002_Xfed4ee_X1e3.asdm.sdm.tar
61f181a22f41300c2f24c2811db11e77 2021.1.00172.L_uid___A002_Xfee03e_X2787.asdm.sdm.tar
9723ccfc4279aa66d74a311a0dfb5286 2021.1.00172.L_uid___A001_X15a0_Xc4_001_of_001.tar
efd5d049e16d153a86a0efa4ed9a654c 2021.1.00172.L_uid___A001_X15a0_Xc4_auxiliary.tar
d438a1777c13584d97e94cfd192b4340 2021.1.00172.L_uid___A002_Xfe90b7_Xc423.asdm.sdm.tar
45243dd5a6a80b5da300bb9d2ff21537 member.uid___A001_X15a0_Xc4.README.txt
It looks like I got the ASDM names wrong somehow.
From the pipeline run, I see that there are ASDMs from:
2021.1.00172.L_uid___A002_Xf8f6a9_X9e67.asdm.sdm.tar
2021.1.00172.L_uid___A002_Xf8f6a9_X113a4.asdm.sdm.tar
which do not match the IDs of those I posted above. It was probably a copy-paste error.
$ ls -lh 2021.1.00172.L_uid___A002_Xf8f6a9_X113a4.asdm.sdm.tar 2021.1.00172.L_uid___A002_Xf8f6a9_X9e67.asdm.sdm.tar
-rw-r--r-- 1 adamginsburg adamginsburg 59G Jul 5 2022 2021.1.00172.L_uid___A002_Xf8f6a9_X9e67.asdm.sdm.tar
-rw-r--r-- 1 adamginsburg adamginsburg 59G Jul 5 2022 2021.1.00172.L_uid___A002_Xf8f6a9_X9e67.asdm.sdm.tar
$ md5sum 2021.1.00172.L_uid___A002_Xf8f6a9_X9e67.asdm.sdm.tar 2021.1.00172.L_uid___A002_Xf8f6a9_X9e67.asdm.sdm.tar
3855bd5564fa38e500b1718452bf00c4 2021.1.00172.L_uid___A002_Xf8f6a9_X9e67.asdm.sdm.tar
3855bd5564fa38e500b1718452bf00c4 2021.1.00172.L_uid___A002_Xf8f6a9_X9e67.asdm.sdm.tar
mv 2021.1.00172.L_uid___A002_Xf8f6a9_X9e67.asdm.sdm.tar 2021.1.00172.L_uid___A002_Xf8f6a9_X9e67.asdm.sdm.tar bad_tarballs/
Continuum imaging might be OK? There's virtually no signal; the brightest peak is ~0.3 mJy.
Despite the continuum looking OK, the g pipeline run failed with a timeout/write-lock failure. There may still be good images in failed_member.uid___A001_X15a0_Xc4_20230224, but the failed pipeline is a red flag that can't be ignored:
2023-02-20 09:00:42 INFO applycal::::casa ##########################################
2023-02-20 09:00:42 INFO applycal::::casa ##### Begin Task: applycal #####
2023-02-20 09:00:42 INFO applycal::::casa applycal( vis='/orange/adamginsburg/ACES/rawdata/2021.1.00172.L/science_goal.uid___A001_X1590_X30a8/group.uid___A001_X1590_X30a9/member.uid___A001_X15a0_Xc4/calibrated/working/uid___A002_Xf8f6a9_X9e67_target.ms', field='Sgr_A_star', spw='25,27,29,31,33,35', intent='OBSERVE_TARGET#ON_SOURCE', selectdata=True, timerange='', uvrange='', antenna='*&*', scan='', observation='', msselect='', docallib=False, callib='', gaintable=['uid___A002_Xf8f6a9_X9e67_target.ms.hif_uvcontfit.s5_1.Sgr_A_star.uvcont.tbl'], gainfield=[], interp=[], spwmap=[], calwt=[False], parang=False, applymode='calflag', flagbackup=True )
2023-02-20 09:00:42 INFO applycal::calibrater::open ****Using NEW VI2-driven calibrater tool****
2023-02-20 09:00:42 INFO applycal::calibrater::open Opening MS: /orange/adamginsburg/ACES/rawdata/2021.1.00172.L/science_goal.uid___A001_X1590_X30a8/group.uid___A001_X1590_X30a9/member.uid___A001_X15a0_Xc4/calibrated/working/uid___A002_Xf8f6a9_X9e67_target.ms for calibration.
2023-02-20 09:00:42 INFO applycal::VisSetUtil::addScrCols Adding CORRECTED_DATA column(s).
2023-02-20 09:01:11 INFO applycal:::: Process 128170: waiting for write-lock on file /orange/adamginsburg/ACES/rawdata/2021.1.00172.L/science_goal.uid___A001_X1590_X30a8/group.uid___A001_X1590_X30a9/member.uid___A001_X15a0_Xc4/calibrated/working/uid___A002_Xf8f6a9_X9e67_target.ms/table.lock
Rerun will start with:
ls -lh 2021.1.00172.L_uid___A002_Xf8f6a9_X113a4.asdm.sdm.tar 2021.1.00172.L_uid___A002_Xf8f6a9_X9e67.asdm.sdm.tar *Xc4*tar
-rw-r--r-- 1 adamginsburg adamginsburg 149G Feb 16 10:41 2021.1.00172.L_uid___A001_X15a0_Xc4_001_of_001.tar
-rw-r--r-- 1 adamginsburg adamginsburg 749M Feb 16 10:39 2021.1.00172.L_uid___A001_X15a0_Xc4_auxiliary.tar
-rw-r--r-- 1 adamginsburg adamginsburg 59G Feb 18 18:52 2021.1.00172.L_uid___A002_Xf8f6a9_X9e67.asdm.sdm.tar
-rw-r--r-- 1 adamginsburg adamginsburg 59G Feb 18 18:52 2021.1.00172.L_uid___A002_Xf8f6a9_X9e67.asdm.sdm.tar
-rw-r--r-- 1 adamginsburg adamginsburg 3.7G Feb 16 04:13 2021.1.00172.L_uid___A002_Xfe90b7_Xc423.asdm.sdm.tar
A new data point for tracing this down: the applycal task in the above message used the full path. At least one other example of applycal that was successful did not use the full path.
I typo'd above and did not successfully re-download the second ASDM, which caused layered problems.
:cry:
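A small check along these lines would catch this kind of typo: build the expected tarball names from the MS names the pipeline reports, then look for duplicated or missing entries. This is a sketch only; it assumes the `<project>_<uid>.asdm.sdm.tar` naming seen in this thread, and both function names are hypothetical:

```python
import os
from collections import Counter


def expected_tarballs(ms_names, project="2021.1.00172.L"):
    """Map MS names such as 'uid___A002_Xf8f6a9_X9e67.ms' to ASDM tarball names."""
    names = []
    for ms in ms_names:
        base = os.path.basename(ms)
        if base.endswith(".ms"):
            base = base[: -len(".ms")]
        names.append(f"{project}_{base}.asdm.sdm.tar")
    return names


def check_tarballs(names, directory="."):
    """Return (duplicated names, names not present in `directory`)."""
    dupes = sorted(n for n, count in Counter(names).items() if count > 1)
    missing = sorted(n for n in set(names)
                     if not os.path.exists(os.path.join(directory, n)))
    return dupes, missing
```

Running `check_tarballs` on the argument list of the earlier `ls`/`md5sum`/`mv` commands would have reported the repeated X9e67 entry, and comparing against `expected_tarballs` would have shown that X113a4 was never re-downloaded.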
2023-02-24 22:48:26 INFO applycal::::casa ##########################################
2023-02-24 22:48:26 INFO applycal::::casa ##### Begin Task: applycal #####
2023-02-24 22:48:26 INFO applycal::::casa applycal( vis='uid___A002_Xf8f6a9_X9e67.ms', field='3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143', spw='25,27,29,31,33,35', intent='OBSERVE_TARGET#ON_SOURCE', selectdata=True, timerange='', uvrange='', antenna='*&*', scan='', observation='', msselect='', docallib=False, callib='', gaintable=['uid___A002_Xf8f6a9_X9e67.ms.hif_uvcontfit.s5_1.Sgr_A_star.uvcont.tbl'], gainfield=[], interp=[], spwmap=[], calwt=[False], parang=False, applymode='calflag', flagbackup=True )
2023-02-24 22:48:26 INFO applycal::calibrater::open ****Using NEW VI2-driven calibrater tool****
2023-02-24 22:48:26 INFO applycal::calibrater::open Opening MS: uid___A002_Xf8f6a9_X9e67.ms for calibration.
2023-02-24 22:48:26 INFO applycal::VisSetUtil::addScrCols Adding CORRECTED_DATA column(s).
2023-02-24 22:48:59 INFO applycal:::: Process 77329: waiting for write-lock on file /orange/adamginsburg/ACES/rawdata/2021.1.00172.L/science_goal.uid___A001_X1590_X30a8/group.uid___A001_X1590_X30a9/member.uid___A001_X15a0_Xc4/calibrated/working/uid___A002_Xf8f6a9_X9e67.ms/table.lock
$ ls -lh 2021.1.00172.L_uid___A002_Xf8f6a9_X113a4.asdm.sdm.tar 2021.1.00172.L_uid___A002_Xf8f6a9_X9e67.asdm.sdm.tar *Xc4*tar
-rw-r--r-- 1 adamginsburg adamginsburg 149G Feb 16 10:41 2021.1.00172.L_uid___A001_X15a0_Xc4_001_of_001.tar
-rw-r--r-- 1 adamginsburg adamginsburg 749M Feb 16 10:39 2021.1.00172.L_uid___A001_X15a0_Xc4_auxiliary.tar
-rw-r--r-- 1 adamginsburg adamginsburg 59G Feb 25 01:24 2021.1.00172.L_uid___A002_Xf8f6a9_X113a4.asdm.sdm.tar
-rw-r--r-- 1 adamginsburg adamginsburg 59G Feb 18 18:52 2021.1.00172.L_uid___A002_Xf8f6a9_X9e67.asdm.sdm.tar
-rw-r--r-- 1 adamginsburg adamginsburg 3.7G Feb 16 04:13 2021.1.00172.L_uid___A002_Xfe90b7_Xc423.asdm.sdm.tar
$ md5sum 2021.1.00172.L_uid___A002_Xf8f6a9_X113a4.asdm.sdm.tar 2021.1.00172.L_uid___A002_Xf8f6a9_X9e67.asdm.sdm.tar *Xc4*tar
ef7be11430edc54ae8f5ec847f25cf78 2021.1.00172.L_uid___A002_Xf8f6a9_X113a4.asdm.sdm.tar
fe73d03b5f47fbb24caa69c33a84ccaf 2021.1.00172.L_uid___A002_Xf8f6a9_X9e67.asdm.sdm.tar
9723ccfc4279aa66d74a311a0dfb5286 2021.1.00172.L_uid___A001_X15a0_Xc4_001_of_001.tar
efd5d049e16d153a86a0efa4ed9a654c 2021.1.00172.L_uid___A001_X15a0_Xc4_auxiliary.tar
d438a1777c13584d97e94cfd192b4340 2021.1.00172.L_uid___A002_Xfe90b7_Xc423.asdm.sdm.tar
next approach:
This run appears to be successful:
-rw-r--r-- 1 adamginsburg adamginsburg 36K Feb 25 21:14 casa_log_mpi_pipeline_58429783_2023-02-25_21_14_17.log
-rw-r--r-- 1 adamginsburg adamginsburg 860K Feb 26 09:34 casa_log_mpi_pipeline_member.uid___A001_X15a0_Xc4_58429783_2023-02-25_21_14_43.log
-rw-r--r-- 1 adamginsburg adamginsburg 1.2M Feb 26 09:34 run_pipeline_mpi_58429783.log
At least, there are no errors. The corresponding interactive run failed b/c plotms doesn't work on hipergator and I didn't apply the hack before running it.
Next step is to try imaging.
These are the mses:
drwxrwsr-x+ 28 adamginsburg adamginsburg 4.0K Feb 25 22:24 /orange/adamginsburg/ACES/rawdata/2021.1.00172.L/science_goal.uid___A001_X1590_X30a8/group.uid___A001_X1590_X30a9/member.uid___A001_X15a0_Xc4/calibrated/working/uid___A002_Xf8f6a9_X9e67.ms
drwxrwsr-x+ 28 adamginsburg adamginsburg 4.0K Feb 25 23:00 /orange/adamginsburg/ACES/rawdata/2021.1.00172.L/science_goal.uid___A001_X1590_X30a8/group.uid___A001_X1590_X30a9/member.uid___A001_X15a0_Xc4/calibrated/working/uid___A002_Xf8f6a9_X113a4.ms
drwxrwsr-x+ 28 adamginsburg adamginsburg 4.0K Feb 26 00:27 /orange/adamginsburg/ACES/rawdata/2021.1.00172.L/science_goal.uid___A001_X1590_X30a8/group.uid___A001_X1590_X30a9/member.uid___A001_X15a0_Xc4/calibrated/working/uid___A002_Xf8f6a9_X9e67_target.ms
drwxrwsr-x+ 28 adamginsburg adamginsburg 4.0K Feb 26 00:30 /orange/adamginsburg/ACES/rawdata/2021.1.00172.L/science_goal.uid___A001_X1590_X30a8/group.uid___A001_X1590_X30a9/member.uid___A001_X15a0_Xc4/calibrated/working/uid___A002_Xf8f6a9_X113a4_target.ms
Continuum is good, lines are good. 25 and 27 are still going, though.
25 died with a weird failure, and now is dying on startup with a major and unacceptable error:
2023-03-31 19:55:25 INFO split::::casa+ RuntimeError: Desired column (CORRECTED_DATA) not found in the input MS (/orange/adamginsburg/ACES/rawdata/2021.1.00172.L/science_goal.uid___A001_X1590_X30a8/group.uid___A001_X1590_X30a9/member.uid___A001_X15a0_Xc4/calibrated/working/uid___A002_Xf8f6a9_X9e67_target.ms).
I don't know where this is coming from. There's no reason for these scripts to have changed.
This SB still needs to be updated to undo size mitigation (see #179).
QA - Line contamination in continuum images from high/low frequencies
Looks okay? Maybe some contamination in spw25_27. Several compact sources are visible in spw25_27, but not in spw33_35.
QA - Line contamination in continuum images from high/low frequencies (compared against v1.1)
Looks great.
Reminder that SPWs 25,27,29,31,35 all need to be un-size-mitigated and re-stat-cont-ed @keflavich
Moved files: mv *spw{25,27,29,31,35}* sizemitigated/. Rerun forthcoming.
Sgr_A_st_g_03_TM1 uid://A001/X15a0/Xc4
[x] Observations completed?
[x] Delivered?
[x] Downloaded? (specify where)
[ ] Weblog unpacked
[ ] Weblog Quality Assessment?
Extra Weblog Sgr_A_st_g_03_TM1_0 -> pipeline-20220710T040320, Extra Weblog Sgr_A_st_g_03_TM1_2 -> pipeline-20220607T145537, Extra Weblog Sgr_A_st_g_03_TM1_1 -> pipeline-20220607T145537
[ ] Imaging: Continuum
[ ] Imaging: Lines
Product Links:
Reprocessed Product Links: