Indexing issue - Githubissues

aida-ahmadi commented 10 months ago

Hi John (and Patrick),

Thanks a lot for sharing this segment of the pipeline as a stand-alone code. I managed to successfully run it in CASA 6.5.4 after modifying a few lines of code in selfcal_helpers.py. I think the issue I encountered has to do with the fact that spectral windows are not always numbered starting at zero. While I know that I could renumber the spectral windows to start at zero, my cont.dat files assume the original numbering of the spectral windows so I think it's best to keep the original numbering. In case this is useful information for you, I modified the following lines in selfcal_helpers.py to index the measurement set differently:

lines 1244 and 1318 changed from spwname=msmd.namesforspws(spw)[0] to spwname=msmd.namesforspws(j)[0]

line 1399 changed from for key in contdotdat.keys(): to for k, key in enumerate(contdotdat.keys()):

and line 1401 changed from spwname=msmd.namesforspws(key)[0] to spwname=msmd.namesforspws(k)[0]

One more thing is that CASA sometimes complains at the import of msmetadata via from casatools import msmetadata as msmdtool because msmdtool is already a CASA tool that is loaded at startup. I've checked this in CASA 6.4, 6.5.4, and 6.6.0.

Best wishes, Aida

jjtobin commented 10 months ago

Hi Aida,

Thanks for the report on your experiences and the potential fix! Are you able to share the MS you were running and the cont.dat so that we could also have a look at what was going on?

John

aida-ahmadi commented 10 months ago

Hi John,

Unfortunately the data I tested it on is in the proprietary period but I will try to upload it on a drive and send you something by next week via email.

Thanks, Aida

jjtobin commented 9 months ago

Hi Aida,

After looking at your change, I think what you have come up with is not necessarily a general solution, but it worked for your data. We make no assumption about the starting index of spectral windows, we normally test with data directly from the pipeline, which never has spectral windows starting at zero and they are also non-sequential. A fundamental assumption that we rely on is that the spw numbers in the cont.dat will match up with one of the EBs from the observations. Data that come directly from the pipeline will adhere to this assumption, so if there are data you're trying to selfcal that violate this assumption, it's often easier to adjust the data than adjust the code for all possible variations of the metadata.

The issue you encountered was because ESO seems to be concatenating all the individual MSes, which reindexes the spectral window numbers and makes them no longer align with the cont.dat produced by the pipeline. So the change you made only works when the spectral windows start at zero (since you're using the array index to get the spwnames using the msmd tool). If the spwnumbers started at a different number, your change would encounter an error.

Fundamentally, this issue arose because of the manipulation ESO did to the data. So in this case I think the more straightforward approach would have been to edit the cont.dat to match the spws in the MSes because then your MSes and cont.dat would be consistent with each other and not require a bespoke code change, and one that will only work in this instance with spws starting at zero.

Something else we looked at is making it possible for the selfcal routine to run findcont, to not rely on the pipeline cont.dat match the data someone has. That is probably straightforward so long as the selfcal is being run with a pipeline-version of CASA; Patrick and I discussed this and it's probably easier to call the pipeline routines within our program than to duplicate the machinery to run findcont (and to keep Todd's findContinuum.py as part of our repo).

jjtobin / auto_selfcal

Indexing issue #28