ACES-CMZ / reduction_ACES

Reduction scripts and tools for ACES
https://worldwidetelescope.org/webclient/?wtml=https://data.rc.ufl.edu/pub/adamginsburg/ACES/mosaics/mosaics.wtml

Execution Block ID uid://A001/X15a0/X190 Sgr_A_st_ao_03_TM1 #41

Open keflavich opened 2 years ago

keflavich commented 2 years ago

Sgr_A_st_ao_03_TM1 uid://A001/X15a0/X190

Product Links:

Reprocessed Product Links:

ashleythomasbarnes commented 2 years ago

Compact bright spots in spw 25 - more SiO masers?


nbudaiev commented 2 years ago

findcont: spw33 has a LowBW warning (uid___A001_X15a0_X190.s31_0.Sgr_A_star_sci.spw33.mfs.I.findcont.residual). No visible issues with other SPWs.

Full continuum: no issues.

SPW cubes:

- spw31: some structure in the residuals (uid___A001_X15a0_X190.s38_0.Sgr_A_star_sci.spw31.cube.I.iter0.residual) and in the line-free mom8 (uid___A001_X15a0_X190.s38_0.Sgr_A_star_sci.spw31.cube.I.iter1.image.mom8_fc).
- spw35: tclean stopped to prevent divergence (stop code 5). Field: Sgr_A_star, SPW: 35 (uid___A001_X15a0_X190.s38_0.Sgr_A_star_sci.spw35.cube.I.iter1.image).
- No issues with other SPWs.

nbudaiev commented 2 years ago

spw35 cube divergence: Divergence happened only in channel 515. The rest of the cube looks good.
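A single runaway channel like this can be confirmed without paging through the cube by comparing per-channel peaks against the cube's typical peak. A minimal sketch on synthetic data (the 10x-median threshold is an arbitrary choice; a real check would load the product cube instead):

```python
import numpy as np

# Synthetic stand-in for a cube: noise plus one "divergent" channel,
# mimicking the spw35 case where only channel 515 blew up.
rng = np.random.default_rng(0)
cube = rng.normal(0, 0.001, size=(1916, 32, 32))   # (nchan, ny, nx)
cube[515] += 5.0                                    # runaway channel

# flag channels whose peak is far above the cube's typical channel peak
peaks = np.nanmax(np.abs(cube), axis=(1, 2))
median_peak = np.nanmedian(peaks)
suspect = np.where(peaks > 10 * median_peak)[0]
print("suspect channels:", suspect)                 # -> [515]
```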

d-l-walker commented 2 years ago

QA warnings

Nothing major here. The main thing is that SPW 35 diverged and will need to be recleaned.

Continuum

uid___A001_X15a0_X190.s36_0.Sgr_A_star_sci.spw25_27_29_31_33_35.cont.I.iter1.image.tt0

Cubes

Summary

- Data look really good.
- Reclean SPW 35.
- Re-run the pipeline without size mitigation.
- All cubes will benefit a lot from combination with the 7m & TP data.

keflavich commented 2 years ago

@piposona could you upload the completed hi-res reimaging products?

piposona commented 2 years ago

Uploaded to /upload/Repipelined_member.uid___A001_X15a0_X190/ Let me know if that is ok and I will delete it from our cluster.

keflavich commented 2 years ago

At a glance, it looks good! I'll move it over into the right directories soon

pyhsiehATalma commented 2 years ago

Hi, I recleaned spw 35 of 12 m EB uid://A001/X15a0/X190. The divergence in channel 515 was resolved. The results look fine. I will upload the reclean cube.

Left: pipeline product, Right: reclean


pyhsiehATalma commented 2 years ago

I used the pipeline-selected line-free channels to subtract the continuum. The tclean parameters are below (almost the same as the pipeline parameters, except for the automasking parameters).


ms = 'concat.spw35.contsub'
phasecenter = 'ICRS 17:46:03.1636 -028.39.24.691'
imagename = 'uid___A001_X15a0_X190.s38_0.Sgr_A_star_sci.spw35.cube.I.iter0'
restfreq = '100.5GHz'
imsize = [1024, 2592]
cell = '0.28arcsec'

tclean(vis=ms, imagename=imagename, field='', intent='OBSERVE_TARGET#ON_SOURCE',
       phasecenter=phasecenter, restfreq=restfreq, spw='0~1', threshold='10mJy',
       imsize=imsize, cell=cell, niter=10000000, cycleniter=250,
       start='99.5625613617GHz', width='0.9764731MHz', nchan=1916,
       outframe='LSRK', deconvolver='hogbom', weighting='briggsbwtaper',
       robust=0.5, specmode='cube', restoringbeam='common', gridder='mosaic',
       parallel=True, usemask='auto-multithresh', sidelobethreshold=2.0,
       noisethreshold=3.25, negativethreshold=10.0, lownoisethreshold=1.5,
       smoothfactor=1.0, minbeamfrac=0.3, cutthreshold=0.01,
       growiterations=75, interactive=False)

pyhsiehATalma commented 2 years ago

The reclean cubes were uploaded to,

/upload/uid_A001_X15a0_X190/

pyhsiehATalma commented 2 years ago

override_tclean_commands.json was updated for reclean of spw 35.
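For context, the idea of the override file is to layer per-SPW tclean keyword changes over the pipeline defaults. The JSON structure and key names below are purely illustrative assumptions, not the actual schema of override_tclean_commands.json in the repo:

```python
import json

# Hypothetical override entry: per-field, per-SPW tclean keyword overrides.
# Real key names in the ACES repo's JSON file may differ.
overrides = json.loads("""
{
  "Sgr_A_st_ao_03_TM1": {
    "spw35": {"cyclefactor": 2.5, "threshold": "10mJy"}
  }
}
""")

# pipeline defaults (illustrative), with overrides merged on top
defaults = {"cyclefactor": 1.0, "threshold": "5mJy", "niter": 10000000}
pars = {**defaults, **overrides["Sgr_A_st_ao_03_TM1"]["spw35"]}
print(pars)
```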

keflavich commented 2 years ago

the reclean of spw33 is disastrous - both my attempt and @pyhsiehATalma 's look like the data were totally uncalibrated. So maybe they were.

keflavich commented 2 years ago

Left is the "product", right is the reclean:

pyhsiehATalma commented 2 years ago

@keflavich this looks weird, I will look into spw 33.

pyhsiehATalma commented 2 years ago

@keflavich @d-l-walker

I re-cleaned spw 33. The results look consistent with the product. I am not sure of the status of this execution block, but shall I update the tclean script via a pull request? The name of the re-cleaned cube is "...iter1..".


keflavich commented 2 years ago

The calibrated/ data for this field were still completely screwed up, so I'm re-running everything here from scratch again.

keflavich commented 1 year ago

There is a recleaned product of spw35 on disk, but its parameters are wrong. https://github.com/ACES-CMZ/reduction_ACES/pull/274 is a proposed fix. @pyhsiehATalma could you verify that these are the correct parameters? Specifically, when you re-ran the clean, what was nchan?
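One way to answer the nchan question from the product itself is to read the spectral-axis length from the FITS header. A hedged sketch using an in-memory example header (for a real check, point fits.open at the .image.pbcor.fits product instead):

```python
import numpy as np
from astropy.io import fits

# Build a stand-in header; a real cube's header carries the same keywords.
hdu = fits.PrimaryHDU(np.zeros((1916, 4, 4), dtype=np.float32))
hdu.header['CTYPE3'] = 'FREQ'
hdr = hdu.header

# the spectral axis is whichever CTYPEn is the frequency axis
spec_axis = next(i for i in range(1, hdr['NAXIS'] + 1)
                 if hdr.get(f'CTYPE{i}', '').startswith('FREQ'))
nchan = hdr[f'NAXIS{spec_axis}']
print(nchan)   # -> 1916
```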

d-l-walker commented 1 year ago

The cube imaging parameters for SPWs 25, 27, 29, and 31 were still those for the size mitigated products. I've updated these in #300.

keflavich commented 1 year ago

Moved old cubes to cubes_pre20221116/ (including spw35 reclean, which may not be necessary). All reimaging jobs are running.

d-l-walker commented 1 year ago

@keflavich I was just about to mark this one as done, but I'm having trouble finding the cube for SPW 33. All other SPWs have been cleaned at the native spectral resolution, and can be found in the ~/member.uid___A001_X15a0_X190/calibrated/working/ folder, but I don't see SPW 33.

It's included in the override_tclean_commands file, so I'm not sure why it's not there. Is it still running after all this time? I see that the SPW 35 cube was only finished a few days ago, while the other SPWs finished back in November ...


keflavich commented 1 year ago

It's in day 10^6 of "still imaging".

I'm growing concerned that this might not be finishing a major cycle in <96h. Might need to look into this further.

d-l-walker commented 1 year ago

😬 I've been encountering this issue with the array combination. Some of the more complex regions can take many weeks, often failing in the end anyway 🙃

keflavich commented 1 year ago

Options to consider:

1. Maybe a higher cyclethreshold to trigger the major cycle sooner, so it spends more time on the major cycle and less on the minor cycle? Seems... unlikely to work?
2. More memory? Not clear this is the bottleneck.
3. MPI ⚡ominous thunderclap⚡
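For reference, option (1) amounts to tightening tclean's cycle-control parameters so the stopping threshold for the minor cycle is reached sooner. The values below are illustrative, not a tested recommendation:

```python
# Illustrative cycle-control settings for tclean: raising cyclefactor (or
# setting an explicit cyclethreshold) makes minor cycles stop earlier, and
# lowering cycleniter caps minor-cycle iterations per major cycle.
cycle_controls = dict(
    cyclefactor=3.0,
    cycleniter=100,
)
# tclean(vis=..., specmode='cube', ..., **cycle_controls)
print(cycle_controls)
```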

d-l-walker commented 1 year ago

Looks like SPW 33 still hasn't completed. Not sure if we just leave this and ... hope it finishes eventually?

keflavich commented 1 year ago

spw33:

[c0712a-s3:63904] *** Process received signal ***
[c0712a-s3:63904] Signal: Segmentation fault (11)
[c0712a-s3:63904] Signal code:  (128)
[c0712a-s3:63904] Failing at address: (nil)

😱

spw35 is plugging along happily. After 5 hours, it's 574 chunks in (is a chunk 1 channel or many? not sure, but Subcubes: 1791.)

$ fgrep "Run Major Cycle 1" *60575694* | wc
    574    5740  133742

spw33 is half as many chunks in, same number of subcubes:

$ fgrep "Run Major Cycle 1" *60575693* | wc
    286    2860   66638

and I suspect it will die.

d-l-walker commented 1 year ago

@keflavich I'm guessing still no luck with SPW 33 for this field? Did you set it to run again after the previous segfault? I think this is the only outstanding issue for this region, so hopefully we can get this sorted and get it marked as done. Let me know if there's any way I can help with it.

keflavich commented 1 year ago

Nope, it's running Major Cycle 1 again

d-l-walker commented 1 year ago

Looks like we finally have a SPW 33 cube. Downloading it now to check it out.

d-l-walker commented 1 year ago

Note that we have these two cubes for SPW 35:

uid___A001_X15a0_X190.s38_0.Sgr_A_star_sci.spw35.cube.I.iter1.image.pbcor.fits
uid___A001_X15a0_X190.s38_0.Sgr_A_star_sci.spw35.cube.I.iter1.reclean.image.pbcor.fits

I guess we want to move the old one and rename the recleaned one for consistency? @keflavich

d-l-walker commented 1 year ago

After taking an eternity to clean, SPW 33 diverged in a single channel (627) :(

This was done with the default cyclefactor value. I'll update this in a PR soon.


d-l-walker commented 1 year ago

@keflavich SPW 33 cyclefactor increase in #377. Please re-run SPW 33 cleaning once merged. [~/member.uid___A001_X15a0_X190/calibrated/working/]

keflavich commented 1 year ago

moved files, restarted

d-l-walker commented 1 year ago

SPW 33 looks good now, no divergence. Finally marking this one as done.

keflavich commented 9 months ago

There still seems to be an issue with SPW35 having a divergent channel. The divergence should have been fixed. https://github.com/ACES-CMZ/reduction_ACES/issues/41#issuecomment-1090738721

keflavich commented 9 months ago

This is in the .reclean.image file, dated March 15, 2023:

keflavich commented 9 months ago

As of today, the re-cleaned cubes on disk still had this divergence, so we likely need to update the clean parameters and try again. I'm giving one more shot at freshly recleaning this before making that modification, though. In the parallel cleaning, the cube starting at channel 1024 contains the divergent channel.

keflavich commented 9 months ago

I'm boosting the cyclefactor to 2.5 for spw35 and re-running; we still have that one divergent channel 1034:

pyhsiehATalma commented 7 months ago

Just posting the image of the low-level divergence for the record (channel 645).


keflavich commented 7 months ago

bumped cyclefactor 2.5->3.5 in https://github.com/ACES-CMZ/reduction_ACES/pull/408/commits/0051bbc07cdf131d9aab685b01f392447b644869#diff-f2ff6354caa768635d4300c5a481a90d1c95f24abf94f9a7724ba94d1cc770e4R2759

betacygni commented 6 months ago

QA - Line contamination in continuum images from high/low frequencies

Summary: both files look reasonably good (although large-scale emission is missing)

Files checked:

Results: Both spw33_35 and spw25_27 look reasonably good (no obvious contamination). There is a lot of missing flux generating negative lobes, but the structures are very consistent between the two frequency ranges.

Attached image: (zoom to the bottom area of the Brick)

keflavich commented 5 months ago

Apparently tclean worked when run with the previous parameter set, but now tclean consistently segfaults any time I try to run the full aggregate or the low frequency imaging.

keflavich commented 5 months ago

Maybe this has something to do with it: there appears to be an entirely missing spectral window.

keflavich commented 5 months ago

The missing window was 27. Maybe remaking it fixes this? (The yellow highlight is the old spw selection.)
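A quick sanity check after remaking the MS is to compare the SPW IDs actually present against the expected science set. A minimal sketch; in a real run, `present` would be filled from listobs or msmetadata rather than hard-coded:

```python
# Expected science SPWs for this EB vs. what the broken MS contained;
# in practice, fill `present` from e.g. msmd.spwsforfield('Sgr_A_star').
expected = {25, 27, 29, 31, 33, 35}
present = {25, 29, 31, 33, 35}
missing = sorted(expected - present)
print("missing SPWs:", missing)   # -> [27]
```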

keflavich commented 5 months ago

The reclean looks good.

keflavich commented 5 months ago

Going to try to split the measurement set by its spws:


# Split each EB's measurement set down to the low-frequency continuum
# SPWs (25 & 27); both EBs use the same line-free frequency selection.
datadir = ('/orange/adamginsburg/ACES/data/2021.1.00172.L/'
           'science_goal.uid___A001_X1590_X30a8/group.uid___A001_X1590_X30a9/'
           'member.uid___A001_X15a0_X190/calibrated/working/')

spw_low = (
    "25:85.96741525086293~85.96839172755016GHz;85.9693682042374~86.04358043246751GHz;"
    "86.04724222004465~86.05065988844999GHz;86.05163636513723~86.08190714244161GHz;"
    "86.08605716836236~86.15172522557914GHz;86.15270170226638~86.15636348984351GHz;"
    "86.15733996653076~86.17052240180847GHz;86.17149887849571~86.2020137749719GHz;"
    "86.20884911178257~86.21080206515704GHz;86.21299913770333~86.21373149521875GHz;"
    "86.21617268693686~86.21885799782676GHz;86.22544921546562~86.2325286714481GHz;"
    "86.23643457819705~86.2378992932279GHz;86.25108172850561~86.26328768709611GHz;"
    "86.26426416378334~86.29062903433876GHz;86.291605511026~86.29722025197762GHz;"
    "86.30332323127287~86.3042997079601GHz;86.3091820913963~86.31259975980163GHz;"
    "86.35507649569648~86.36020299830449GHz;86.36117947499172~86.37680310198755GHz;"
    "86.37802369784659~86.38046488956469GHz;86.38144136625192~86.38534727300087GHz;"
    "86.38632374968812~86.39730911241955GHz;86.39828558910678~86.42513869800582GHz;"
    "86.42611517469307~86.43392698819098GHz,"
    "27:86.66736581487878~86.71374845752838GHz;86.71472493421574~86.71765436427782GHz;"
    "86.71863084096519~86.72717501197958GHz;86.72815148866695~86.73401034879109GHz;"
    "86.75427224005382~86.7547604783975GHz;86.75573695508486~86.75622519342853GHz;"
    "86.75720167011589~86.75768990845958GHz;86.75988698100613~86.76135169603718GHz;"
    "86.76232817272454~86.77013998622341GHz;86.77111646291078~86.80431667028103GHz;"
    "86.82409032320007~86.82457856154375GHz;86.86534646324104~86.86900825081864GHz;"
    "86.869984727506~86.87804066017671GHz;86.87901713686408~86.88023773272327GHz;"
    "86.88145832858247~86.90611436493832GHz;86.90709084162567~86.90879967582856GHz;"
    "86.90977615251592~86.98618545330186GHz;86.98716192998921~87.053074106386GHz;"
    "87.05405058307336~87.06894185255561GHz;87.06991832924297~87.07040656758664GHz;"
    "87.09457436559882~87.0950626039425GHz;87.0970155573172~87.11874216361097GHz;"
    "87.11971864029833~87.13094812220298GHz;87.13192459889034~87.13387755226506GHz"
)

split(vis=datadir + 'uid___A002_Xf53eeb_X323e.ms',
      outputvis='uid___A002_Xf53eeb_X323e_target_low.ms',
      field='Sgr_A_star', spw=spw_low)

split(vis=datadir + 'uid___A002_Xf531c1_X16b2.ms',
      outputvis='uid___A002_Xf531c1_X16b2_target_low.ms',
      field='Sgr_A_star', spw=spw_low)

keflavich commented 5 months ago

The split version worked for field ak, and I'm now running this imaging for ao; it's working. https://github.com/ACES-CMZ/reduction_ACES/pull/418 now incorporates this fix generally.

keflavich commented 4 weeks ago

@mpound noted a problem in the continuum selection - spw25+27 was infected by an SiO maser.

The maser in question is this:


It is 2.8 mJy in the continuum and 2.5 Jy in the line data, which means it's diluted by ~1000x, or roughly half the total channels in the cube (which is probably roughly the number of non-flagged channels?).
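The dilution arithmetic checks out:

```python
# A 2.5 Jy line peak appearing as 2.8 mJy in the aggregate continuum
# implies it was averaged over ~900 channels, i.e. on the order of the
# number of unflagged channels in the cube.
line_peak_jy = 2.5
cont_contribution_jy = 2.8e-3
dilution = line_peak_jy / cont_contribution_jy
print(round(dilution))   # -> 893
```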

There are two EBs that contribute to this image. They have wildly different continuum selections, which is not good. These are the relevant excerpts from the continuum selection:

Xf531c1_X16b2: 86.23643457819705~86.2378992932279GHz;86.25108172850561~86.26328768709611GHz
Xf53eeb_X323e: 86.2303796649618~86.237459525754GHz;86.24136565584625~86.24283045463083GHz;86.25601364369217~86.26822030023045GHz;

The first one, 16b2, excludes 86.23789 - 86.25108 - i.e., the entire line pictured above. X323e includes most of that. That's a huge problem! Why? (investigation continues...)
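The mismatch can be quantified by parsing the two selection strings and checking which ranges one EB includes inside the other's exclusion gap. A minimal sketch using only the excerpts above:

```python
# Parse "lo~hiGHz;..." range strings into (lo, hi) float pairs.
def parse_ranges(s):
    out = []
    for chunk in s.strip(';').split(';'):
        lo, hi = chunk.replace('GHz', '').split('~')
        out.append((float(lo), float(hi)))
    return out

x16b2 = parse_ranges("86.23643457819705~86.2378992932279GHz;"
                     "86.25108172850561~86.26328768709611GHz")
x323e = parse_ranges("86.2303796649618~86.237459525754GHz;"
                     "86.24136565584625~86.24283045463083GHz;"
                     "86.25601364369217~86.26822030023045GHz;")

# the gap in the 16b2 selection between its two ranges (the maser line)
gap = (x16b2[0][1], x16b2[1][0])
print(f"16b2 excludes {gap[0]:.5f}-{gap[1]:.5f} GHz")

# ranges from X323e that fall (partly) inside that excluded gap
overlapping = [r for r in x323e if r[1] > gap[0] and r[0] < gap[1]]
print(overlapping)
```

This confirms that X323e keeps one channel range inside the interval 16b2 flags as line emission, so the two EBs contribute inconsistent data to the continuum there.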