e-merlin / eMERLIN_CASA_pipeline

This is the CASA eMERLIN pipeline to calibrate data from the e-MERLIN array. Please fork the repository before making any changes and read the Coding Practices page in the wiki. Please report problems with the pipeline in the issues tab.
GNU General Public License v3.0

Source check vs inputs before any calibration/flagging #95

Closed jradcliffe5 closed 5 years ago

jradcliffe5 commented 6 years ago

Just been running the pipeline and I made a mistake with inputting a source name into the inputs file.

It would be good, once the measurement set exists, to check the source names in the inputs file against the FIELD table of the measurement set. I guess this should run after importfitsidi, or at startup if the pipeline is run from an intermediate step. That would avoid tasks like applycal silently failing to apply the solutions to the data without crashing.

jmoldon commented 6 years ago

Yes, that already happens. Every time an MS is used (the original MS, the MMS if one is produced, and the averaged MS/MMS if one is produced), the pipeline compares all the sources in the MS with those in the inputs file. It tells the user when a source is specified but does not exist, showing a WARNING in the log and also a comment in the weblog/Observation summary/sources. The same weblog also shows when a source is in the MS but was not specified in the inputs file, by listing it with an empty intent (that is a feature, because you may decide to process only some phasecal/target pairs rather than all the sources in the MS).
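For illustration, the two-way comparison described above can be sketched roughly as follows. This is not the pipeline's actual code: the function name `compare_sources`, the logger name, and the layout of the `requested` dictionary (intent label mapped to a comma-separated string of names) are assumptions made for the example. The field names are passed in as a plain list; in practice they would come from the NAME column of the FIELD table of the MS.

```python
import logging

logger = logging.getLogger("source_check")  # hypothetical logger name


def compare_sources(requested, ms_fields):
    """Compare source names from the inputs file with the fields in the MS.

    requested: dict mapping an intent label (e.g. 'targets') to a
               comma-separated string of source names (assumed layout).
    ms_fields: list of field names found in the MS.
    Returns (missing, unassigned): names requested but absent from the MS,
    and MS fields that have no intent in the inputs file.
    """
    wanted = {}
    for intent, names in requested.items():
        for name in names.split(","):
            name = name.strip()
            if name:
                wanted.setdefault(name, []).append(intent)

    missing = sorted(n for n in wanted if n not in ms_fields)
    unassigned = sorted(f for f in ms_fields if f not in wanted)

    # Only warn here; deciding whether to stop is left to the caller.
    for name in missing:
        logger.warning("Source %s (%s) is not in the MS",
                       name, ", ".join(wanted[name]))
    return missing, unassigned
```

With made-up names, a mistyped flux calibrator shows up in `missing`, while fields that were simply not selected end up in `unassigned` (the "empty intent" case):

```python
inputs = {"targets": "1234+5678", "phscals": "1230+5600", "fluxcal": "1331+305"}
fields = ["1234+5678", "1230+5600", "1331+3030", "0319+4130"]
missing, unassigned = compare_sources(inputs, fields)
```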

So we can discuss what to do when a source name is specified wrongly. We can stop the pipeline as soon as the name mismatch appears, that is, when that WARNING is issued. Or we can do as now: run all the other steps (plots, calibration, etc.) normally, and then crash when the missing source is needed: in applycal if the target is wrong, or in any calibration step if the phasecal is wrong. That last option is what should be happening now.

jradcliffe5 commented 6 years ago

Ok sure. That’s good. I agree, I think we could make it stop if any of the sources specified are not in the FIELD table, but not necessarily require that all of the sources in the MS are specified in the inputs file. As you said, that means the user could still specify subsets of sources too.

jmoldon commented 6 years ago

I was tempted to stop the pipeline immediately after it sees a wrong source name, which means just after importfitsidi, the minimum needed to know which sources are there.

The reason to keep it working is that it could be annoying if someone runs the pipeline overnight just to discover the next day that it only imported the data and then stopped, when there are many time-consuming steps that do not care about the sources (like fixvis, hanning, aoflagger, plot_data) and could have been ready regardless of the source names.

I think the best solution is to keep the current warning and run everything up to flag_apriori, which is the first task that tries to use source names (to flag antenna slewing). If any name is wrong, kill the pipeline at that step. And if that step is not selected, keep working until calibration starts.
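A rough sketch of that policy, to make the intended behaviour concrete. It is not the code that went into the pipeline: `run_steps`, `MissingSourceError`, and the `SOURCE_INDEPENDENT` set are hypothetical, and the set only lists the steps named in this thread as not caring about sources. Anything outside that set is treated as needing valid source names.

```python
class MissingSourceError(RuntimeError):
    """Raised when a step needs a source that is not in the MS (hypothetical)."""


# Assumption: steps that never look at source names, taken from this thread.
SOURCE_INDEPENDENT = {"importfitsidi", "fixvis", "hanning", "aoflagger", "plot_data"}


def run_steps(selected_steps, missing_sources, run_step, executed=None):
    """Run steps in order, stopping at the first one that needs source names.

    selected_steps:  step names in execution order.
    missing_sources: names given in the inputs file but absent from the MS.
    run_step:        callable that executes one step, given its name.
    executed:        optional list that collects the steps that did run.
    """
    if executed is None:
        executed = []
    for step in selected_steps:
        if missing_sources and step not in SOURCE_INDEPENDENT:
            raise MissingSourceError(
                "Stopping at %s: sources not in the MS: %s"
                % (step, ", ".join(missing_sources)))
        run_step(step)
        executed.append(step)
    return executed
```

With a wrong name in the inputs file, the time-consuming steps still complete and the run stops at flag_apriori, or at the first later source-dependent step if flag_apriori was not selected:

```python
done = []
try:
    run_steps(["importfitsidi", "hanning", "aoflagger", "flag_apriori"],
              ["1331+305"], run_step=lambda step: None, executed=done)
except MissingSourceError as err:
    print(err)
```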

jradcliffe5 commented 6 years ago

Ok great, that sounds good to me.

jmoldon commented 5 years ago

Resolved by #100