mhardcastle / ddf-pipeline

LOFAR pipeline using killms/ddfacet
GNU General Public License v2.0

Slow running of P070+29 #344

Open vmahat opened 5 months ago

vmahat commented 5 months ago

Hello,

I've been running the ddf-pipeline with standard LoTSS settings on P070+29.

It seems to hang, however, in one of the DI steps during deconvolution (image_ampphase1_di). The longest I've left it running is about 4 days, during which nothing is written to disk -- I could wait a bit longer, but this is probably an issue.

I've been running the pipeline on the Herts cluster on one of the large nodes.

I've attached the latest log file and my config file for reference. I noticed that the nearby P067+29 has been processed before; the only difference between the cfg file used for that pointing and mine for P070+29 is the addition of NVSS for flux bootstrapping. There was a warning of diverging flux at the end of some of the major cycles, so I may re-run with NVSS information included, as for P067+29. Note that both of these fields contain the extremely bright 3C 123. I'll post an update on this, but any tips on the deconvolution hanging without progressing would be appreciated.

image_ampphase1_di.log tier1-jul2018_vijay.log
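For reference, flux bootstrapping is switched on in the ddf-pipeline cfg file. The fragment below is only a sketch of what adding NVSS might look like -- the key names and catalogue paths are assumptions from memory, not copied from the actual tier1 config, so compare against the cfg used for P067+29 before relying on them:

```ini
; Hypothetical fragment -- key names and paths are illustrative
; assumptions, not verified against the real ddf-pipeline parset.
[bootstrap]
use_bootstrap=True
catalogues=['VLSS.fits','NVSS.fits','WENSS.fits']
```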

vmahat commented 5 months ago

Here are the corresponding bootstrap images, from my P070+29 run without NVSS (left) and from the P067+29 run with NVSS in /beegfs/car/mjh/P067+29/ (right), centred on 3C 123.

[image: bootstrap comparison centred on 3C 123]

mhardcastle commented 5 months ago

Hi Vijay, well I think it's clear why the deconvolution doesn't play nicely... 3C123's calibration is sufficiently screwed up that the mask is most likely filled with garbage.

The question is whether there's anything you can do to rescue this. Does the initial image (image_dirin_SSD_m.app.restored.fits) look comparable in quality to the one from the adjacent field?
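One quick way to put a number on "comparable in quality" is a robust noise estimate over the same region of both restored maps. A minimal sketch -- the FITS-loading step via astropy is left as a comment, and the statistic itself only needs numpy; the filenames follow the pipeline products mentioned above:

```python
import numpy as np

def robust_rms(image):
    """Noise estimate from the median absolute deviation (MAD),
    largely insensitive to bright sources and calibration artefacts."""
    med = np.median(image)
    return 1.4826 * np.median(np.abs(image - med))

# To compare the two restored maps, one might do something like:
#   from astropy.io import fits
#   img = fits.getdata('image_dirin_SSD_m.app.restored.fits').squeeze()
#   print(robust_rms(img))

# Sanity check on synthetic unit-variance noise:
rng = np.random.default_rng(42)
print(round(robust_rms(rng.normal(0.0, 1.0, (512, 512))), 1))
```

If the field around 3C 123 drives the MAD up by a large factor relative to the adjacent pointing, that is a reasonable proxy for the calibration artefacts filling the mask.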

vmahat commented 5 months ago

It looks like this (mine on the left), so I guess the error doesn't originate from the flux corrections.

[image: initial image comparison]
mhardcastle commented 5 months ago

Yes, looks like the error is there in initial calibration. Could be that the sky model used for 3C123 in init cal is just not good enough...

vmahat commented 5 months ago

Is this an external model, or the one from image_dirin_SSD.app.model.fits? The latter looks sensible to me.

mhardcastle commented 5 months ago

I meant the sky model used in initial calibration with LINC (TGSS model probably). We redo the DI calibration but there's only so much you can fix...

vmahat commented 5 months ago

There are a few things I want to try, so I'll keep this open until they're resolved.

twshimwell commented 5 months ago

I think this LINC change that is coming soon (https://git.astron.nl/RD/LINC/-/merge_requests/175) might be pretty good for this field. Alternatively, you could supply your own model to LINC.

vmahat commented 5 months ago

Ah yes, this came about due to some improvements in the long-baseline imaging with that fix. I'll have a go with it when it's merged.

vmahat commented 4 months ago

Small update on this: having heard some rumours of problems with demixing TauA, I processed the data through LINC without demixing TauA, and the resulting image_dirin_SSD_m.app.restored.fits looks reasonable (left) vs with demixing (right).

[image: restored image without demixing TauA (left) vs with demixing (right)]

There are still artefacts/ripples across the field that are probably due to the presence of TauA, so this isn't the optimal solution. Given that I'm interested in the long baselines, I think this is okay for now.
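For anyone hitting the same problem: demixing in LINC is controlled from the parset/inputs you feed it, so the fix above amounts to dropping TauA from the demix source list (or disabling demixing). The option names below are from memory and may not match your LINC version -- treat them as hypothetical and check the LINC documentation and your Default.parset:

```yaml
# Hypothetical LINC fragment -- option names are assumptions,
# verify against your LINC version's documentation.
demix: true                    # set to false to skip demixing entirely
demix_sources: [CasA, CygA]    # TauA removed from the list here
```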

cyriltasse commented 4 months ago

@vmahat There's a simple option now in kMS to subtract all the ATeam (+Sun) sources in one go; would you be willing to give it a try and compare?

vmahat commented 4 months ago

Sure, how does one specify this in the config?