schlafly opened 2 years ago
If we do this, I'm inclined to add it as a step in desi_assemble_fibermap, which is already combining information from fiberassign and coordinates files and dealing with things like missing columns and NaN.
i.e. calculate it on-the-fly when running a prod, using the latest-and-greatest turbulence code and focal plane model, but not "patching" raw data files nor maintaining a separate independent product of "what we wish the raw data files had in the first place".
In discussions on the survey-ops call on 2022/8/15, we decided we'd rather not compute on the fly and instead produce an alternative set of coordinates files checked into the 'tiles' repository. The pipeline would first look for a patched coordinates file in that product, and then fall back on the ordinary coordinates file.
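The fallback logic described above could be sketched roughly as follows. This is only an illustration: the function name, the flat layout of patched files inside the tiles product, and the eight-digit EXPID zero-padding are all assumptions, not the actual assemble_fibermap implementation.

```python
import os

def find_coordinates_file(expid, night, tiles_dir, rawdata_dir):
    """Return a patched coordinates file if one exists, else the raw one.

    The patched-file location inside the tiles svn product is hypothetical
    here; this issue is still deciding where those files should live.
    """
    patched = os.path.join(tiles_dir, f"coordinates-{expid:08d}.fits")
    if os.path.exists(patched):
        return patched
    # Fall back on the ordinary coordinates file in the raw data tree
    return os.path.join(rawdata_dir, night, f"{expid:08d}",
                        f"coordinates-{expid:08d}.fits")
```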
Sorting out some specifics to know what to implement in desispec assemble_fibermap.
Currently the tiles directory structure is grouped by TILEID//1000, but these coordinates files are now per-exposure not per-tile. @schlafly how do you suggest we organize these? e.g. a new sub-directory structure grouped by NIGHT/EXPID mirroring the raw data? Or perhaps if these override coordinates files are rare enough, they could be kept in the same directory as the tile to which they apply.
In the raw data, the coordinates files are named `coordinates-EXPID.fits`. When putting them into the tiles svn product, it may be convenient to add the TILEID so that we can keep track of which file goes with which tile without having to read headers, e.g. `coordinates-TILEID-EXPID.fits`. Objections or refinements?
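To make the proposal concrete, here is a sketch of the candidate path under the existing TILEID//1000 grouping with the proposed TILEID-in-the-filename convention. Both choices are exactly what is up for discussion here, so treat this as one possible layout, not a decision.

```python
import os

def patched_coordinates_path(tiles_dir, tileid, expid):
    """Candidate location for a patched coordinates file in the tiles product.

    Assumes the current TILEID//1000 directory grouping and the proposed
    coordinates-TILEID-EXPID.fits naming, with 8-digit zero-padded EXPID.
    """
    group = str(tileid // 1000)  # existing tiles-repo grouping convention
    filename = f"coordinates-{tileid}-{expid:08d}.fits"
    return os.path.join(tiles_dir, group, str(tileid), filename)
```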
@schlafly do you expect these override coordinates files to be fairly rare, e.g. for correcting the metrology-induced problems of 20220507-20220513, or might this grow to commonly (always?) having an override file with the latest and greatest turbulence corrections?
Please itemize which columns should never change because they were used by actual operations, vs. which columns are eligible for updates to be used by the downstream pipeline. For the equivalent override process with fiberassign files, desispec assemble_fibermap includes a check on this to make sure we're not accidentally overriding something that we shouldn't.
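The protected-column check could look something like the sketch below, with the tables standing in as mappings of column name to numpy array (as read from the FITS files). The column names in `PROTECTED_COLUMNS` are a placeholder: the actual list is precisely what this comment asks to itemize.

```python
import numpy as np

# Placeholder list: columns assumed untouchable because actual operations
# used them. The real list is what this issue needs to pin down.
PROTECTED_COLUMNS = ["DX_0", "DY_0"]

def check_override(original, override, protected=PROTECTED_COLUMNS):
    """Raise if an override coordinates table modifies a protected column.

    `original` and `override` map column name -> numpy array, standing in
    for the FITS tables that assemble_fibermap reads.
    """
    for col in protected:
        if col in original and col in override:
            # equal_nan=True so matching NaNs don't trigger a false alarm
            if not np.array_equal(original[col], override[col],
                                  equal_nan=True):
                raise ValueError(f"override modifies protected column {col}")
```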
My two cents, with the proviso that this is not a near-term high-priority item for me at present (i.e., I don't see this as need for Himalayas or Iron):
The columns I would consider touching are the following:

- D{X/Y}_0
- D{X/Y}_1
- FIBER_DX, FIBER_DY, FIBER_OFFSET
All other columns would be untouched. It's not completely clear to me which of these you would be okay with my touching. D{X/Y}_0 are what is used for the correction move. D{X/Y}_1 would be used for a 'second correction move' that never occurs. If I can't touch anything that ever gets sent to a positioner, I would not touch any of the _0 quantities but would touch all of the _1 quantities. The FIBER_ quantities are, I think, just derived from D{X/Y}_{last move}, and I would touch them in any case. My understanding is that the pipeline only uses the FIBER_{...} quantities, though I don't know that for certain. In that case I could imagine changing only the FIBER{DX, DY, OFFSET} quantities and leaving all the others alone. That would feel inconsistent, but that's kind of fine in this case.
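The most conservative option above (touch only the derived FIBER_ quantities, leave everything that was ever sent to a positioner) could be sketched like this. The function and the `PATCHABLE` list are hypothetical; they just encode that one option.

```python
import numpy as np

# One possible choice from the discussion: patch only the derived
# FIBER_* quantities and leave every operationally-used column alone.
PATCHABLE = ["FIBER_DX", "FIBER_DY", "FIBER_OFFSET"]

def apply_patch(original, patched, columns=PATCHABLE):
    """Return a copy of `original` with only `columns` taken from `patched`.

    Tables are mappings of column name -> numpy array.
    """
    out = {name: arr.copy() for name, arr in original.items()}
    for col in columns:
        if col in patched:
            out[col] = patched[col].copy()
    return out
```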
If we pursue the approach where we make patched files, I would patch the files only in cases where we know the turbulence corrections are substantially compromised. e.g., I would only patch 20220507-20220513 and no other coordinates files. I'm sure we can imagine going back and getting better estimates for all files, but the improvements are going to be much smaller for all other cases, and I don't think it makes sense to bother patching those.
The fixed locations of the stuck positioners were incorrect for the nights of 20220507-20220513. We could repeat the turbulence corrections for these nights using the now-known positions. We would need to think about how we want to incorporate that information into the pipeline to propagate it into the final data. The original source is currently the coordinates files in data/YEARMMDD/EXPID.
A similar mechanism could be applied to all pre-turbulence correction data.
Only a few fibers per exposure move by more than 30 microns, though there are a few tens with corrections of more than 20 microns.
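A quick diagnostic along these lines, for judging whether a night's corrections are large enough to be worth patching, might look like the sketch below. The function name and thresholds are illustrative, not part of any existing tool.

```python
import numpy as np

def correction_counts(dx, dy, thresholds_um=(20.0, 30.0)):
    """Count fibers whose total correction exceeds each threshold.

    dx, dy: per-fiber corrections in microns. Returns {threshold: count},
    a rough gauge of how many fibers a patch would meaningfully change.
    """
    offset = np.hypot(dx, dy)  # total correction size per fiber
    return {t: int(np.sum(offset > t)) for t in thresholds_um}
```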