desihub / desitarget

DESI Targeting
BSD 3-Clause "New" or "Revised" License
18 stars 23 forks source link

WIP toward subpriority overrides #736

Closed sbailey closed 3 years ago

sbailey commented 3 years ago

This PR is work in progress towards providing optional SUBPRIORITY overrides to account for fiberassign accidentally resetting SUBPRIORITY for the first lunation of the main survey. This doesn't yet do everything we need, but it provides something tangible to discuss to get there.

What works

desitarget.subpriority.get_fiberassign_subpriority() takes an input list of fiberassign files and writes an output FITS table with TARGETID, SUBPRIORITY columns for all targets covered by those tiles (not just those assigned), while filtering on SURVEY (main) and FAPRGRM (dark, bright). The output files would be used for overriding subpriority when rerunning select_targets.

Additionally, py/desitarget/subpriority.py has a if __name__ == "__main__" that can be used to call get_fiberassign_subpriority from the command line. I didn't put this into bin/ because I didn't want to pollute desitarget/bin with a one-off script, but it is handy to call it from the command line with a glob and I wanted to record the code for making those override lists.

desitarget.subpriority.override_subpriority(targets, override) updates the SUBPRIORITY column of TARGETS in-place by matching on TARGETID in the override table. Both targets and override can have entries that aren't in the other table. This is put into a separate function so that bin/select_targets and bin/select_skies can use the same code.

desitarget.io.write_* functions updated to only set SUBPRIORITY for entries that aren't already set (i.e. >0). This is implemented such that SUBPRIORITY row i would get the same value regardless of whether the other rows are pre-set or not. To be tested.

TODO: when I ran on /global/cfs/cdirs/desi/spectro/data/202105//fiberassign*.fits.gz I found more main+dark targets than when running on /global/cfs/cdirs/desi/target/fiberassign/tiles/trunk/00[12] . Figure out why.

What doesn't work (yet)

I implemented a single select_targets --override-subpriority FILENAME option, but then post-facto realized that select_targets does both bright and dark in a single run and thus needs two separate override lists. I think this would move the call to override_subpriority to immediately before io.write_targets, using obscon to select which override list to use, since unfortunately the same TARGETID appears in already observed dark and bright tiles with different SUBPRIORITY (though I haven't checked if those are science targets or just skies).

select_skies --override-subpriority FILENAME is implemented, but I probably need to update get_fiberassign_subpriority() to split out science targets from skies to get the union of dark+bright skies covered by already observed tiles.

Question: were any secondary targets included in the initial target/mtl files? I wasn't sure what the SUBPRIORITY situation was for secondaries, so I didn't update bin/select_secondary (yet?). I did put a hook in io.write_secondary for not-resetting already set SUBPRIORITY; it's just not used yet.

Also: for the SUBPRIORITY that are set by io.write_*, I didn't change the random seed generation to be per-healpix, which I think we want to do but I wanted to keep that concept separate from the concept of providing external SUBPRIORITY override lists.

@geordie666 please take a look and let's chat.

coveralls commented 3 years ago

Coverage Status

Coverage decreased (-0.4%) to 59.087% when pulling d5f833c5e64a88adef4694d29a3c20542d8b6e50 on subpriority into 9f9d2a7ea2617e295c7b4b8453f442b39bcbaafd on master.

geordie666 commented 3 years ago

@sbailey: Certainly the secondary targets were available to be fiber-assigned and were available as MTL ledgers. I'm not sure whether @araichoor included them on the first tiles (although I don't see why he wouldn't have).

I guess the question is whether we care about the "standalone" secondary targets having consistent SUBPRIORITY values, though. We might just document what happened instead. Any secondary targets that match a primary will already be correctly updated if the primary targets are correctly updated.

araichoor commented 3 years ago

hi, reading this now. I confirm that stand-alone secondary targets also have their SUBPRIORITY overwritten. they flow through the same process as the primary targets: https://github.com/desihub/fiberassign/blob/31a89d10e4ff308be457d07dabcbb1189a42c228/bin/fba_launch#L194-L209

about that point: "TODO: when I ran on /global/cfs/cdirs/desi/spectro/data/202105//fiberassign.fits.gz I found more main+dark targets than when running on /global/cfs/cdirs/desi/target/fiberassign/tiles/trunk/00[12] . Figure out why." I think* it s because all the observed files have not been svn-committed yet (though they are in the svn folder at kpno). I noticed that this morning while looking at elg data, I found 58/144 tiles observed-but-not-committed. I pinged Eddie on this this morning, he should do the committing today.

sbailey commented 3 years ago

Getting closer, still evaluating, but more eyes are welcome on example outputs in /global/cscratch1/sd/sjbailey/desi/targets.

Subpriority override files subpriorities-*.fits are now split by dark, bright, and sky (shared by both dark and bright). These are generated by scanning all observed main survey fiberassign files in May 2021 with

python $DESITARGET/py/desitarget/subpriority.py -i /global/cfs/cdirs/desi/spectro/data/202105??/*/fiberassign*.fits.gz -o .

The subpriority override files are then used by select_targets and select_skies with

select_targets /global/cfs/cdirs/cosmo/data/legacysurvey/dr9/north/sweep/9.0 $CSCRATCH/desi/targets/override -s2 /global/cfs/cdirs/cosmo/data/legacysurvey/dr9/south/sweep/9.0 --nside 8 --healpixels 142 --numproc 4 --nosecondary --gaiasub --dark-subpriorities subpriorities-dark.fits --bright-subpriorities subpriorities-bright.fits &> targets-142.log

select_skies /global/cfs/cdirs/cosmo/data/legacysurvey/dr9/south $CSCRATCH/desi/targets/override -s2 /global/cfs/cdirs/cosmo/data/legacysurvey/dr9/north --nside 8 --healpixels 142 --numproc 32 --nskiespersqdeg 18000.0 --bands g,r,z --apertures 0.75 --sky-subpriorities subpriorities-sky.fits &> skies-142.log

I believe select_secondary also works, but I don't have a 1.0.0 reference example to run and haven't tried deriving it yet.

Example outputs without overrides are under /global/cscratch1/sd/sjbailey/desi/targets/.

Examples with overrides are under /global/cscratch1/sd/sjbailey/desi/override/.

At first glance all the same targets are there and SUBPRIORITY is the only thing that changed (and only for targets covered by tiles), but for some reason the order of the targets in the files is different.

araichoor commented 3 years ago

not sure if that s relevant here, but a side comment, in case: some of the TILEID you select may not have been observed.

if I reproduced correctly the lines of get_fiberassign_subpriorities() to select the fiberassign-TILEID.fits.gz files from /global/cfs/cdirs/desi/spectro/data/202105??//fiberassign.fits.gz, I obtain 169 tiles. in tiles.csv (updated May 24, 11:02am PST), there are 155 main tiles.

the 14 "extra" tiles are:

array([ 1398,  1399,  1402,  1403,  1411,  1414,  1646,  1730,  1736,
        1774,  1923, 20413, 20636, 20831])

I guess that will be sorted out when the observed fiberassign-TILEID.fits.gz will have been committed to svn.

araichoor commented 3 years ago

I m looking at the code: I m wondering if these two lines: https://github.com/desihub/desitarget/blob/d5f833c5e64a88adef4694d29a3c20542d8b6e50/py/desitarget/io.py#L616 https://github.com/desihub/desitarget/blob/d5f833c5e64a88adef4694d29a3c20542d8b6e50/py/desitarget/io.py#L1167 shouldn t instead be: ii = data["SUBPRIORITY"] == 0.0 as we want to update SUBPRIORITY for targets where it has not been overridden (if I read https://github.com/desihub/desitarget/blob/d5f833c5e64a88adef4694d29a3c20542d8b6e50/bin/select_targets#L231-L241 correctly, before running write_targets, only overrideen targets should have SUBPRIORITY > 0.0).

as a check I tried:

from desitarget.geomask import match
sjbdir = "/global/cscratch1/sd/sjbailey/desi/targets"
program = "dark"
d = fits.open(os.path.join(sjbdir, "subpriorities-{}.fits".format(program)))[1].data
for case in ["default", "override"]:
    fns = glob(os.path.join(sjbdir, case, "dr9/1.0.1.dev5024/targets/main/resolve", program, "*"))
    print("# {}".format(case))
    for fn in fns:
        t = fits.open(fn)[1].data
        iid, iit = match(d["TARGETID"], t["TARGETID"])
        n_common = len(iid)
        n_diff_tid = (d["TARGETID"][iid] != t["TARGETID"][iit]).sum()
        n_diff_subp = (d["SUBPRIORITY"][iid] != t["SUBPRIORITY"][iit]).sum()
        print("{}: n_common={} , n_diff_tid={} , n_diff_subp={}".format(os.path.basename(fn), n_common, n_diff_tid, n_diff_subp))

which prints:

# default
targets-dark-hp-143.fits: n_common=99153 , n_diff_tid=0 , n_diff_subp=99153
targets-dark-hp-142.fits: n_common=25001 , n_diff_tid=0 , n_diff_subp=25001
targets-dark-hp-164.fits: n_common=65177 , n_diff_tid=0 , n_diff_subp=65177
# override
targets-dark-hp-143.fits: n_common=99153 , n_diff_tid=0 , n_diff_subp=99152
targets-dark-hp-142.fits: n_common=25001 , n_diff_tid=0 , n_diff_subp=25001
targets-dark-hp-164.fits: n_common=65177 , n_diff_tid=0 , n_diff_subp=65175
targets-dark-hp-8154.fits: n_common=0 , n_diff_tid=0 , n_diff_subp=0

I d expect that in the override case, n_diff_subp=0 for each file.

araichoor commented 3 years ago

I forgot to also mention the line in write_secondary: https://github.com/desihub/desitarget/blob/d5f833c5e64a88adef4694d29a3c20542d8b6e50/py/desitarget/io.py#L973

araichoor commented 3 years ago

and a suggestion in case override_subpriority takes too long when processing the whole sky. if I m correct, targets and override should each contain no duplicate of TARGETID, right? if so, replacing it with:

from desitarget.geomask import match
def ar_override_subpriority(targets, override):
    ii_targ, ii_over = match(targets['TARGETID'], override['TARGETID'])
    if len(ii_targ) > 0:
        targets['SUBPRIORITY'][ii_targ] = override['TARGETID'][ii_over]
    return np.sort(ii_targ)

can speed up things.

for instance for one sky file on NERSC logging node:

from astropy.io import fits
import numpy as np
from time import time

sjbdir = "/global/cscratch1/sd/sjbailey/desi/targets"
override = fits.open(os.path.join(sjbdir, "subpriorities-sky.fits"))[1].data
targets = fits.open(os.path.join(sjbdir, "default", "dr9/1.0.1.dev5024/skies", "skies-hp-164.fits"))[1].data

start=time(); ii_sjb = override_subpriority(targets, override); print("{:.1f}s".format(time()-start))
start=time(); ii_ar = ar_override_subpriority(targets, override); print("{:.1f}s".format(time()-start))
print(len(ii_sjb), len(ii_ar), (ii_sjb != ii_ar).sum())

returns:

30.8s
2.5s
378257 378257 0

but of course, if doing that, one has to triple-check the changed indices are correct! as this approach is less explicit.

sbailey commented 3 years ago

[embarrassed face emoji] Thanks @araichoor for catching my sign error on subpriority overrides. I'll get that fixed up...

araichoor commented 3 years ago

great; if you re-generate the catalogs in the override/ folder, I could re-run my test.

sbailey commented 3 years ago

Closing; superseded by PR #740