desihub / fiberassign

Fiber assignment code for DESI
BSD 3-Clause "New" or "Revised" License

review, simplify, and finalize fiberassign output format #271

Open sbailey opened 4 years ago

sbailey commented 4 years ago

Related to dangling PR #254: review the fiberassign output formats, simplify/trim them, and settle on something we are happy to use for years to come. A non-exhaustive list of items to check:

Finalize this before restarting observations in Fall 2020.

tskisner commented 3 years ago

Just had an ad hoc conversation with @forero and @geordie666 on a cancelled Zoom call... Testing at KPNO on a couple hundred tiles showed:

  1. Out of a 1.5-hour run time, one hour was spent on the merging step.
  2. Merging all the input columns ran out of memory.

Here is a proposition: can fiberassign just write out the minimal file needed for ICS and then we can do the merging later at NERSC as a convenience step? We could add some extra header keys with checksums of the input target files, to ensure that post-facto merging is using the same target files. I think the minimal file format would include:

The merged files would still be needed for fancy plots or any QA that requires additional properties of available targets, but running the assignment itself in operations would be fast and light.
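The checksum idea above could be sketched roughly as follows. This is only an illustration using the Python standard library, not what fiberassign actually does: the `TGTSHA` keyword prefix, the function names, and the choice of SHA-256 are all assumptions, and a plain dict stands in for the FITS header.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def target_checksum_keys(target_files, prefix="TGTSHA"):
    """Build header key/value pairs recording one checksum per input target file.

    The key names (TGTSHA00, TGTSHA01, ...) are hypothetical. A 6-character
    prefix plus two digits stays within the 8-character FITS keyword limit,
    which is why more than 100 files are rejected here.
    """
    if len(target_files) > 100:
        raise ValueError("two-digit key suffix supports at most 100 files")
    return {f"{prefix}{i:02d}": file_sha256(p) for i, p in enumerate(target_files)}


def check_targets_match(header, target_files, prefix="TGTSHA"):
    """Return True only if the files on disk match the checksums recorded in header.

    `header` is a plain dict here, standing in for a FITS header.
    """
    recorded = {k: v for k, v in header.items() if k.startswith(prefix)}
    return recorded == target_checksum_keys(target_files, prefix)
```

A post-facto merging step could call `check_targets_match` first and refuse to merge if it returns False, so that the convenience files are never built from target files that differ from the ones used for the assignment.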

forero commented 3 years ago

From [desi-data 5128]:

Suggested fiberassign file columns (minimally needed for ops, plus a few more for future developments, keeping more columns for the assigned targets than the potential targets):

In the FIBERASSIGN HDU (5000 targets that were actually assigned):

FIBER, TARGETID, LOCATION, FIBERSTATUS, LAMBDA_REF, PETAL_LOC, TARGET_RA, TARGET_DEC, FA_TARGET, FA_TYPE, FIBERASSIGN_X, FIBERASSIGN_Y, DEVICE_LOC, OBJTYPE, CMX_TARGET, DESI_TARGET, FLUX_G, FLUX_R, FLUX_Z, PHOTSYS, BGS_TARGET, MWS_TARGET, SCND_TARGET, FIBERTOTFLUX_G, FIBERTOTFLUX_R, FIBERTOTFLUX_Z, MORPHTYPE, SERSIC, SHAPE_R, SHAPE_E1, SHAPE_E2, PARALLAX, PMRA, PMDEC, REF_EPOCH, EBV, NUMTARGET, PRIORITY, SUBPRIORITY, OBSCONDITIONS, NUMOBS_MORE, PRIORITY_INIT, NUMOBS_INIT, FLUX_IVAR_G, FLUX_IVAR_R, FLUX_IVAR_Z, FIBERFLUX_G, FIBERFLUX_R, FIBERFLUX_Z, FLUX_W1, FLUX_W2, REF_ID, REF_CAT, GAIA_PHOT_G_MEAN_MAG, GAIA_PHOT_BP_MEAN_MAG, GAIA_PHOT_RP_MEAN_MAG
TIMESTAMP, VERSION, TARGET_STATE (from ledger)

In the TARGETS HDU (anything that was covered by a fiber, even if it wasn't assigned) — a much smaller set of columns because there are so many rows:

TARGETID, DESI_TARGET, CMX_TARGET, SV1_TARGET, RA, DEC, FA_TARGET, FA_TYPE, PRIORITY, SUBPRIORITY, OBSCONDITIONS

Note that the *_TARGET bit columns are the only ones not currently in the minimalist fba_run output, and once we reach the main survey we could drop CMX_TARGET and SV1_TARGET.
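As a rough illustration of the TARGETS HDU selection described above, here is a small sketch. It assumes the table is held as a plain dict mapping column names to lists, which is a stand-in rather than fiberassign's actual data structure, and `targets_hdu_columns` and `prune_columns` are hypothetical helper names.

```python
# Columns suggested above for the TARGETS HDU.
TARGETS_COLUMNS = [
    "TARGETID", "DESI_TARGET", "CMX_TARGET", "SV1_TARGET", "RA", "DEC",
    "FA_TARGET", "FA_TYPE", "PRIORITY", "SUBPRIORITY", "OBSCONDITIONS",
]

# Bit columns that could be dropped once the main survey starts.
PRE_MAIN_SURVEY_COLUMNS = {"CMX_TARGET", "SV1_TARGET"}


def targets_hdu_columns(main_survey=False):
    """Return the list of column names to keep in the TARGETS HDU."""
    if main_survey:
        return [c for c in TARGETS_COLUMNS if c not in PRE_MAIN_SURVEY_COLUMNS]
    return list(TARGETS_COLUMNS)


def prune_columns(table, keep):
    """Return a new table with only the columns named in keep, in that order.

    Columns listed in keep but absent from the table are skipped silently,
    since not every input catalog carries every *_TARGET bit column.
    """
    return {name: table[name] for name in keep if name in table}
```

For example, `prune_columns(table, targets_hdu_columns(main_survey=True))` would drop every photometry column as well as CMX_TARGET and SV1_TARGET, leaving only the columns needed to identify and prioritize each covered target.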

tskisner commented 3 years ago

I think this has been resolved now that the target columns were pruned in the output? If so we should close this.