RadioAstronomySoftwareGroup / pyuvdata

A pythonic interface for radio astronomy interferometry data (uvfits, miriad, others)
https://pyuvdata.readthedocs.io/en/latest/index.html
BSD 2-Clause "Simplified" License
83 stars 26 forks source link

Potential aberration-induced error in calculation of uvw coordinates #929

Closed kartographer closed 3 years ago

kartographer commented 4 years ago

TLDR: The current methodology in pyuvdata for calculating uvw-coordinates appears to be only accurate to roughly a part in 10^4 (of order ~10 arcsec error on-sky in the worst case, ~1 arcsec error in the best case). The issue can be corrected with modifications to the phase method, which may require the addition of the python-novas module.

In the course of trying to develop multi-source handling for pyuvdata, I stumbled upon some inconsistencies with the values for the uvw-coordinates that were recorded by SMA versus that which pyuvdata calculated. These inconsistencies had been seen before when we were first working to import SMA/MIR data sets into pyuvdata (with the ever-present "The uvw_array does not match the expected values" warning making an appearance), although I suspected that some part of it could be due to implementation issues with the Mir class. After correcting a few bugs (e.g., I discovered a discrepancy in the use of TT vs UTC time for SMA), I was still seeing errors to the tune several millimeters in uvw-coordinates when considering an array whose spatial extent is of order 100 meters.

To check the baseline positions that pyuvdata (specifically the phase method within the UVData class) was producing, I used the python-novas package, which is a python implementation of the US Naval Observatory's NOVAS software library. SMA (along with many other observatories) use the NOVAS library for a number of astrometrical calculations, which has microarcsecond-level agreement with that from the SOFA astrometry library (which is what astropy is built upon). Given that the observatory is sensitive to positional errors of order 0.1 mm, I trust that SMA's implementation of the library is accurate to at least the 10s of milliarcseconds level. After a quick verification that python-novas produced results that agreed with that coming from SMA, I set about doing some testing.

To produce NOVAS-based estimates for the uvw-coordinates, I used the topo_star method to calculate the apparent (i.e., topocentric) RA and Dec of a source given a set of ICRS coordinates. From there, I used the local apparent sidereal time to compute an hour angle, which can be used with the phase_uvw method (along with the antenna positions) to calculate the baseline coordinates. This is a slightly different method than what's used in phase, which projects the antenna positions from ITRS into the requested frame (either GCRS or ICRS), and then calculates the the baseline coordinates using the RA and Dec of the source (again, using the phase_uvw method). To reduce the potential for LST-based errors affecting the results, I've used the astropy-calculated value for both methods. I've also disabled some of the more advanced features of NOVAS which allow for the correction of the position of the celestial pole due to polar wobble (a third-order term, after precession and nutation, that typically affects positions of sources at the 0.1 arcsecond level), as astropy does not seem to have a method for implementing this correction in anything but the AzAlt frame (although the underlying SOFA library does). I note here though that turning this on or off doesn't seem to have a significant impact on the test results below.

In a limited set of tests considering coordinates in the GCRS frame, the results between the two methodologies agree to better than a part in 10^6, which is a smaller that the discrepancy noted above, and of order the diurnal aberration term (~0.3 arcseconds). However, when I consider a set of coordinates in the ICRS frame, I get much larger errors, of order one part in 10^4. Below I've plotted a set of results using the sma_test.mir data set, which is only a single integration in length, allowing the RA of the phase center to vary but keeping the Dec fixed at 15 degrees.

baseline_diff

The above shows cm-scale differences between the two methods, which does match the level of discrepancy I was seeing earlier. In further testing, I found that the magnitude and direction of the error did have dependencies on baseline orientation, JD/LST of the observations, and source position, although generally speaking these errors were minimized when the RA of the source was around either 0h or 12h. I was curious to see this could be seen in other test data sets (including 1133866760.uvfits and atca_miriad), and I do indeed see the same kind of behavior that I found with the SMA data set. This includes using the observational parameters (MJD, source and antenna positions) from the MWA data set that was used for validation, as written in @bhazelton's phasing memo. If I've interpreted the memo correctly, the data in the memo were phased to the position RA=12h06m00s.0, Dec=-42d18m00.0s (close to the minimized RA noted above), which in simulation I found produced an apparent error of order 10 mm. That seems to match with the findings in the phasing memo that the "results show that the uvws in each comparison (drift and rephased) agree to better than 2 cm''.

Near as I can tell, the way that astropy is projecting the telescope positions into the ICRS frame isn't quite right, or at the very least isn't producing what one would need for the determination of baseline vectors. But more fundamentally, I'm not sure that moving that calculating the baseline positions in the ICRS frame is quite what one wants to do. The w-component of the baseline vector contains the geometric delay versus the projected co-planar array -- these delays are nominally corrected for in the observatory rest frame. I know that both MIRIAD (i.e., uvcal) and MIR make this assumption, and it does seem to line up with how w-corrections are made in phase and unphase. Additionally, while the fringe pattern measured on sky is affected by aberration, that "deformation" isn't quite correctly captured by the Lorentz contraction of the baseline in the ICRS frame, but instead is a bit more complicated (relating to the fact that the fringe is measured at a given frequency within the rest frame of the observatory). The distortion of the uv-position in the ICRS frame seems to only be an issue when dealing with baselines that are 10s or 100s of km long (depending on the size of the primary aperture), or longer if the velocity vector of the observatory in the ICRS frame isn't changing much, so it's a secondary issue relative to the w-projection issue noted above.

Assuming that I am correct (which might warrant further discussion), there is then the question of what to do to correct the error. Here is what I would propose:

  1. Change the phase method to calculate baselines using the apparent RA and Dec in the topocentric frame of the observatory. I think this is what's need for correct phasing of the data, and is also generally consistent with how the antenna positions are stored (in ECEF vs ENU). Astropy does not at this time seem to have clean way for calculating this, so this would tentatively require adding python-novas to the list of dependencies for pyuvdata.

  2. For the sake of consistency, similarly change the method used for LST calculations from astropy to python-novas.

  3. Preserve the current astropy-based phase method under a new name (old_phase?), and keep it as a utility function so that users can correctly unphase from the old phasing implementation.

david-macmahon commented 3 years ago

@kartographer that is an amazingly detailed write up! It just so happens I've recently been working on UVW calculations (using ERFA, which is essentially SOFA with a BSD license and some minor bug fixes) so this is especially timely for me. I've also been using the MWA_tools example from the UVH5 "phasing.pdf" memo. I end up with results that are several millimeters off from the UVW values from MWA_tools. Not sure if this is same difference you are seeing. One thing that's not clear to me is whether the MJD given is UT1 or UTC. That changes the results by a millimeter or two in each component, with treatment as UTC giving the result closer to the MWA_tools version, but still off in U and V by ~2 and ~5 mm, resp. (W is off by less than 0.5 mm).

The phasing memo states that the MWA uvfits files are phased to GCRS rather than ICRS. I'm certainly no expert in this, but looking at figure 2 of this (excellent, IMHO) SOFA document, it seems like aberration is one of the differences between ICRS and GCRS. What are the implications of phasing to one vs the other, especially wrt to UVW calculations?

It would be great to have a canonical "test set" of array locations, telescope positions, source positions, times, and resultant UVW baselines. The example in the phasing memo is a great start, but it seems to have some ambiguities about it so it would be great to have additional test cases from different observatories (probably with different ambiguities :)). Are you aware of any such collection? I guess one could make such a set with, for example, MIRIAD's uvgen utility, but using real-world tested/proven values would be most compelling.

I had never considered the effect of differential aberration across the field of view. I could be wrong, but I assume it would be dwarfed by the W projection error in the flat-sky approximation as one moves further away from the phase center.

P.S. Isn't LST so 20th century 🙄, having been replaced by Earth rotation angle? 😜

bhazelton commented 3 years ago

@david-macmahon We are planning an update to the phasing memo, which I wrote when I got the current version of phasing implemented in pyuvdata. It does slightly the wrong thing by taking the antenna positions to ICRS and then calculating the UVWs there, as Karto makes clear in his awesome write-up. I'm hoping to make it clear in the memo what the right thing is and why the old way is wrong.

My conviction that the MWA_tools way went to GCRS and not ICRS is based on 2 things: one is cryptic comments in the code and as careful a read of the code as I was able to do that strongly suggest that aberration is not included; the other is that when I used my phasing code and went to GCRS I got much closer to the values from MWA_tools than when I went to ICRS. I will redo those calculations with Karto's new method soon and update the memo with that information.

I don't think that the set of MWA files I used for the memo were what we really need for this testing because they use slightly sketchy code and because neither is "unphased", so we have to do two steps in the comparison rather than one. They were just the best things I had access to at the time. I think Karto has some datasets that are much better for this test because they have been phased with code that is much more trustworthy in large part because it's been rigorously tested at higher frequencies where these effects are more important. But the more test files we have the better!

david-macmahon commented 3 years ago

Thanks, @bhazelton, that all sounds great. There certainly are a lot of subtle details to get right (as I'm still learning). Getting any of them wrong doesn't always result in obviously incorrect results (though for the record, passing arcseconds instead of radians for polar motion did! 😜) so it is definitely tricky to verify whether things are spot on. I'd be happy to contribute to the phasing memo update, especially wrt phasing examples and UVW calculations using external code (i.e. outside of pyuvdata/astropy/python).

aelanman commented 3 years ago

Hi @kartographer . Thank you for this great write-up! Sorry to be so late to the game.

You mention that you've "disabled some of the more advanced features of NOVAS which allow for the correction of the position of the celestial pole due to polar wobble" Is this referring to the polar motion parameters x_p and y_p giving the position of the CIP in the ITRS? Because those are included in astropy's CIRS to/from ITRS transformations.

Some recent work I've been doing may be of interest here. I've been comparing geometric delays computed in different ways with astropy, and then comparing them to CALC11, a VLBI delay modeling tool. With astropy, I can compute a geometric delay either by transforming the antenna positions to the ICRS frame or by transforming the source position to ITRS (in either frame, difference the position vectors and dot with the source direction). There is a difference in delays computed in each frame of order a microsecond.

I haven't been able to find much documentation beyond the code comments, but by my reading CALC11 transforms antenna positions into GCRS (or whatever is meant by "J2000 geocentric") to compute delays and UVW coordinates. It does this with the "Consensus" relativistic model, which accounts for gravitational deflections of all the planets and the Sun and Moon, and gravitational time dilation of the Earth. It also accounts for atmospheric propagation effects and tidal shifts in the Earth's crust. If I disable atmospheric corrections in CALC11, it differs from the ITRS astropy calculation by around 10 ns.

kartographer commented 3 years ago

Hey @aelanman! Apologies for the slow reply -- still working on finalizing to the PR for this, so I think it's safe to say you're not late ;)

I realize that reading through my original post that the comment about x_p and y_p was not well explained, but the short version of the story was that in testing, I found that there was some inconsistencies in when the correction of the polar wobble was being applied, at least based on my understanding of the underlying SOFA library (I just went through a recent bout of this w/ TEME and TETE frames). There may have been an implementation issue with the way that astropy was applying the corrections, or it's entirely possible that there was a conceptual error on my side in comparing the SOFA/astropy and NOVAS results. Nevertheless, the magnitude of the correction is small (typically tenths of an arcsec), much too small to explain the apparent discrepancy. So I figured I would leave it uncorrected on the NOVAS side, and make a note of it.

I don't know well enough the details of how CALC11 works to speak intelligibly about it (but when has that stopped me before?). I think that makes sense, based on my limited understanding of how fringe tracking works in VLBI. But VLBI has a lot of weird and strange caveats to it (some of which is discussed in TMS V3 Section 9.3.2), so it's more difficult to draw comparisons with given all of the higher-order effects that VLBI brings to the table. For what it's worth, I think both NOVAS and SOFA have a similar accounting of gravitational deflection by solar system objects, although I'm not sure if there is any accounting for tidal displacements or continental drifts (since for most observers, these effects are trivial/common-mode). I don't know if that explains the 10 ns difference in delay (which is pretty sizable, at 3 meters of path length), although I'd be willing to bet that the differential difference between the CALC11 and astropy for a "normal" array extent is probably a bit smaller. The difference of 1 us that you found between comparing the two different projection schemes is a bit concerning (since that's 300 meters of path length), although I'd need to understand a bit more the details of the calculation in order to know if there's an actual issue or not.

Back to the original manner at hand, you're not wrong in saying/implying that one can use either scheme -- in principle, the two are simply different reference frames, neither of which are invalid, but simply carry their own set of assumptions. But there are two reasons why I have concerns about they way it is currently implemented. The first is relatively straightforward: I'm not 100% certain that astropy is handling the 3D/vector geometry of the baselines correctly (or at least, in the way that one needs for an interferometer). My lack of certainty mostly stems from my own inability to get it to work "correctly" based on my own by-hand calculations. It's entirely possible (probable?) that I've done something silly in these tests, but I haven't yet seen evidence that astropy is doing this to picosecond/millimeter-scale accuracy (which is what SMA needs), which is disconcerting. The second, and arguably more fundamental issue, is that the attribute freq_array appears to be tied to the sky frequencies for the individual channels, which are tied to the topocentric/observer reference frame. That doesn't play well with having the baseline vectors be in a non-topocentric rest frame. You could fix this issue without changing the baseline projection methods by, for example, adding velocity/reference frame information (e.g., the AIPS FO table in the UVFITS file format), but pyuvdata doesn't have a place for storing such information... yet (see #901). One could also redefine freq_array to be in the appropriate celestial frame, although that would currently be incompatible with interferometers which don't do any sort of Doppler frequency tracking (which I believe include HERA, PAPER, and MWA).

aelanman commented 3 years ago

Hi @kartographer . I've done a bit more more review of CALC11 since I last commented. The "Consensus" formula includes the full Lorentz transformation of the baseline into a solar system barycenter (SSB) frame, so although the antenna positions are given to it in ITRS, the formula is operating in SSB.

I think the μs-scale deviation I'm seeing is due to the motion of the baseline during the geometric delay time. For instance, it takes a few ms for light to traverse a 9000 km baseline, in which time the Earth moves about 900 m, potentially causing a difference of around 3 μs in total delay. This baseline motion is only an issue when computing in the barycentric frame. In contrast, doing the calculation in the GCRS frame keeps the baseline stationary but implicitly assumes that the source doesn't move over the few millisecond (more precisely -- we can no longer assume the planar wavefront stays planar when transformed to geocentric coords). It may just be that the error of assuming a stationary source in a geocentric frame is smaller than the error of assuming a stationary baseline in a barycentric frame, even though neither is technically correct.

Here's a script that computes the geometric delay with three different methods in astropy on a long baseline.

For what it's worth, I think both NOVAS and SOFA have a similar accounting of gravitational deflection by solar system objects, although I'm not sure if there is any accounting for tidal displacements or continental drifts (since for most observers, these effects are trivial/common-mode)

Probably not. CALC needs to have access to additional data specific to each site in order to apply tidal displacements, and I don't see evidence of astropy using such data. From some experimenting, I think it only amounts to a few picoseconds difference on very long baselines. This makes sense since CALC is designed for picosecond accuracy in VLBI.

The second, and arguably more fundamental issue, is that the attribute freq_array appears to be tied to the sky frequencies for the individual channels, which are tied to the topocentric/observer reference frame. [...] One could also redefine freq_array to be in the appropriate celestial frame, although that would currently be incompatible with interferometers which don't do any sort of Doppler frequency tracking.

Since pyuvdata began as a way of translating between file formats, the convention was set based on what MIRIAD and uvfits were doing. I don't know anything about Doppler tracking beyond what I just read on this page, though. It sounds like SMA data are stored in MIRIAD with the celestial frequencies? Maybe reference frame info can be added as another attribute on the UVParameter for freq_array?

There has apparently been some recent work on improving astropy's agreement with SOFA. Astropy is also making steps toward an InterferometricVisibilityFrame for handling uvw calculation, tested against CALC.

kartographer commented 3 years ago

Thanks for the added info, @aelanman! And apologies (again) for the semi-belated reply -- was pushing to put together the PR that addresses this issue (#979), and so I've been trying to get that done before doing anything else.

I think the μs-scale deviation I'm seeing is due to the motion of the baseline during the geometric delay time. For instance, it takes a few ms for light to traverse a 9000 km baseline, in which time the Earth moves about 900 m, potentially causing a difference of around 3 μs in total delay.

I think you're correct that the ~1 µs delay is indeed arising from so-called "time of flight" effects (the subject of which is actually discussed in that TMS section that I linked to) -- I wasn't certain since I had thought that maybe you were pulling from one of the test data sets from PAPER, MWA, etc., which typically has baselines that are too short to see such effects. Of course, GBT to CHIME is certainly long enough ;)

This baseline motion is only an issue when computing in the barycentric frame. In contrast, doing the calculation in the GCRS frame keeps the baseline stationary but implicitly assumes that the source doesn't move over the few millisecond (more precisely -- we can no longer assume the planar wavefront stays planar when transformed to geocentric coords).

I don't think that's exactly right. In neither barycentric nor GCRS are the antennas considered stationary (although the magnitude of their motion is different by about two orders of magnitude). Of course, this might just be us talking past one another -- "the source moves" and "the telescopes move" can both be true, modulo a change of reference frame. But the plane wave should remain a plane wave, even if the telescopes are "moving" (unless you are in the near-field limit, which is a whole separate thing).

This is actually part of the motivation for looking at topocentric coordinates -- the telescopes are treated as stationary, and so you only have one set of calculations to run, for the source position itself. And because the antennas + correlator "live"/output data in the topocentric (i..e, observatory) rest frame, describing the coordinates in that frame best matches the actual description of what the interferometer is measuring. That's not to say that you can't accurately describe the data in another frame, but I think it is the most succinct way to describe the data without a whole lot of caveats/extra metadata.

Since pyuvdata began as a way of translating between file formats, the convention was set based on what MIRIAD and uvfits were doing. I don't know anything about Doppler tracking beyond what I just read on this page, though. It sounds like SMA data are stored in MIRIAD with the celestial frequencies? Maybe reference frame info can be added as another attribute on the UVParameter for freq_array?

We could add it, and indeed at some point I do want to add it (it's stored separately from the freq_array equivalent in MIRIAD, UVFITS, and SMA's MIR format). But I think in adding it you would want to add handling for Doppler tracking information more generally (again, akin to the AIPS FO table in UVFITS), and doing that correctly would be a relatively sizable job, not something that I'd want to add to an already large PR.

There has apparently been some recent work on improving astropy's agreement with SOFA. Astropy is also making steps toward an InterferometricVisibilityFrame for handling uvw calculation, tested against CALC.

I wasn't aware of that before (although I definitely recognize more than a few names on the PR). It does look interesting, although I think I'd have to understand it a bit more "under the hood" to see what it's doing before I'd jump into using it. This admittedly might be in part due to my own battles with astropy, although it's also probably in part because of my unfamiliarity with the depths of how the CALC11 and DiFX correlators work in practice.

Of course, I say this as just a random contributor of code, and so the above just reflects my opinion. I put into PR #979 what I think is the most sensible method for calculating uvw-coordinates, although in principle, it'd be pretty straightforward to swap in an astropy-based replacement for the function uvw_calc in utils.py.

kartographer commented 3 years ago

And @david-macmahon -- now a very overdue response to your questions!

I've also been using the MWA_tools example from the UVH5 "phasing.pdf" memo. I end up with results that are several millimeters off from the UVW values from MWA_tools. Not sure if this is same difference you are seeing. One thing that's not clear to me is whether the MJD given is UT1 or UTC. That changes the results by a millimeter or two in each component, with treatment as UTC giving the result closer to the MWA_tools version, but still off in U and V by ~2 and ~5 mm, resp. (W is off by less than 0.5 mm).

So I don't know what MWA_tools is using -- @bhazelton can (and obviously above, has) speak to that a lot better than I can. My note about the MWA was mostly that the magnitude of the difference appeared similar, although it's entirely possible this was coincidental or even unrelated to what I saw w/ SMA. But I did test UT1 vs UTC, as part of a larger debugging that had to be done with SMA (which is how I learned that we record time in the dreaded TT time... blech). The nice thing about a time error is that, to first order, it shows up as a z-axis rotation, which very easy to see if you set the Dec of your source to 0 (but not what I seemed to find during my tests).

The phasing memo states that the MWA uvfits files are phased to GCRS rather than ICRS. I'm certainly no expert in this, but looking at figure 2 of this (excellent, IMHO) SOFA document, it seems like aberration is one of the differences between ICRS and GCRS. What are the implications of phasing to one vs the other, especially wrt to UVW calculations?

Aberration -- specifically, annual aberration -- is the primary difference between those two frames, which of course results in a Lorentz shift along the great circle that's defined by the source position, the direction of motion, and the "anti-direction". Over the course of a normal observation, this shift is relatively constant, and so from the phasing perspective this should look just like a static shift of about half an arcmin in the sky.

It would be great to have a canonical "test set" of array locations, telescope positions, source positions, times, and resultant UVW baselines. The example in the phasing memo is a great start, but it seems to have some ambiguities about it so it would be great to have additional test cases from different observatories (probably with different ambiguities :)). Are you aware of any such collection? I guess one could make such a set with, for example, MIRIAD's uvgen utility, but using real-world tested/proven values would be most compelling.

That would definitely be nice -- there is something of an informal cache of this data in the pyuvdata/data directory of the repository that we could (should) test against. From these, I do know that the above methods work well both SMA data, and they similarly seem to agree well with the baseline vectors calculated for one of the HERA-related files (zen.2458661.23480.HH.uvh5), as well as one other MWA file that I can't immediately find. I don't think there's a uvgen equivalent in pyuvdata, although I think (?) there might be one in pyradiosky. Someone who actually knows about this should probably speak more to this (@bhazelton and @mkolopanis are on the last commit I see in Github), since it'd be a stretch to call this an educated guess on my part.

I had never considered the effect of differential aberration across the field of view. I could be wrong, but I assume it would be dwarfed by the W projection error in the flat-sky approximation as one moves further away from the phase center.

It is indeed pretty small. To be fair though, w-projection effects are coupled in part to the size of the field of view being imaged, whereas the stretching from differential aberration has no such dependence, so they have domains that don't totally overlap. This stretching is equivalent to one part in 10^4, so it'd only really start making a difference on baselines that were tens of kilometers apart (assuming an antenna diameter of a few meters). There is similarly a potential "twisting" that's induced, since the celestial poles will shift due to aberration -- again, this would only really matter if you were within a degree or so of the pole. Both in that "probably-but-just-maybe-not" domain of being negligible.

P.S. Isn't LST so 20th century 🙄, having been replaced by Earth rotation angle? 😜

If I had my way, I'd just stop this whole "Earth rotation" business -- I like for my baselines to remain stationary in the uv-plane whenever possible ;)

kartographer commented 3 years ago

One additional thing I stumbled across last night that's perhaps worth noting here -- in test_uvdata, there is a test called test_phase_unphase_hera_antpos which calculates baseline vectors in two separate ways: the first rotates the baseline vectors themselves from ENU, and the second rotates the antenna positions and then calculates baseline vectors from those new antenna positions. In the old/existing version of the code, these two methods only agree to about 1mm for a HERA data set (zen.2458661.23480.HH.uvh5, max baseline length of 40 meters). And in the code (line 810 in test_uvdata.py) itself is this comment:

# The uvw's only agree to ~1mm. should they be better? assert np.allclose(uv_phase2.uvw_array, uv_phase.uvw_array, atol=1e-3)

Under the new method in PR #979 , the agreement between the two methods is at the double precision limit (i.e., setting atol=1e-13 still allows the test to pass, with a maximum absolute difference of 4.263256414560601e-14). If nothing else, it gives me some added reassurance that something wasn't quite right in the way that the old, astropy-based transform worked (that seems to be functioning correctly in the new code).

kartographer commented 3 years ago

Just as a brief follow-up to the original post, I decided to go back and evaluate the data set that was used in the original phasing memo. Long story short -- while still not perfect, the agreement between the pyuvdata-derived UVWs (using the branch from #979) appears to be much improved. I think I can explain some of the discrepancy, but there's still some minor inconsistency (at the ~0.2 arcsec level) between the two that I can't yet explain. Given that I have no idea how that file (1133866760.uvfits) was created, I figured I'd at least post a note here and let people poke at it if they desire.

new phasing.pdf

bhazelton commented 3 years ago

This is great, thank you @kartographer!! I have seen a comment in the phasing code that Cotter uses to the effect that it neglects aberration (thus the comparison in the old memo to GCRS). So I think that conclusion has merit.