desihub / fiberassign

Fiber assignment code for DESI
BSD 3-Clause "New" or "Revised" License
7 stars 8 forks source link

Count ETC fibers as stuck fibers. #425

Closed schlafly closed 2 years ago

schlafly commented 2 years ago

This PR additionally counts ETC fibers as stuck fibers in the stuck sky loop. I guess due to everyone else's foresight and good design, these then get updated FA_TYPE and FIBERSTATUS reflecting whether they are skies.

Running a single test tile with fba-main-onthefly.sh, I see that the resulting SKY_MONITOR extension has fibers marked with FA_TYPE = 4 (fiberassign.target.TARGET_TYPE_SKY) and FIBERSTATUS & 2**0 == 0 (FIBER_STATE_UNASSIGNED is not set).

I think I'm okay with this being the signal to the ETC to use or ignore these fibers for sky purposes. We could imagine making the interface more convenient (i.e., 0/1 good bad field, or a string, or something), but not having to change the fiberassign file format seems important enough that I don't think I would do anything else.

@dstndstn , @araichoor , can you tell me that I'm not crazy and this is plausibly all that is necessary on the fiberassign side? The only test I have run so far is that the code runs and that it outputs that all 20 ETC fibers are good sky fibers on tile 40000. I also tried to get some bad sky fibers. Designing tile 40512, which is directly on the Galactic center, ends up marking 6/20 ETC fibers as bad sky fibers. That's less than I might have expected, but okay?

schlafly commented 2 years ago

I did break reproducibility here---do we think we care that a new tile would have filled-in-SKY MONITOR fields, while a new tile wouldn't?

schlafly commented 2 years ago

Pinging @sbailey to weigh in on the importance of reproducibility in the SKY_MONITOR extension.

araichoor commented 2 years ago

I guess that as long as the science targets have the same assignment+potential assignment, it should be fine.

In particular, I want to make sure that the etc "stuck" sky don t count as assigned skies, because that would definitely break reproducibility, I think.. From reading the code, it looks like those "additional" stuck skies go into the assignment: https://github.com/desihub/fiberassign/blob/cb589265cdcf6c372bc2b2858e6f84b6df01dd3e/py/fiberassign/scripts/assign.py#L407-L416 I m not familiar with the C-code part, it can take me a bit of time..

I ve made a test, simply running:

fba_launch --outdir $CSCRATCH/tmpdir/fiberassign-branch --tileid 999999 --tilera 0 --tiledec 0 --survey main --program DARK --dtver 1.1.1

I confirm that only the FA_TYPE and FIBERSTATUS columns of the SKY_MONITOR extension change. For that run, the science targets assignment+potential assignment do not change. But I d like to be convinced that is always the case. e.g. maybe that those ETC sky fibers don't have any slitblock assigned, so the 1-fiber-per-slitblock requirement is blind to that.

araichoor commented 2 years ago

And a rather cosmetic comment: would it be possible to adapt the log report with splitting the ETC fibers?

e.g. with my test run above:

I guess something more explicit like INFO: 458 of 709 stuck (and 11 of 20 ETC) positioners land on good sky locations.

schlafly commented 2 years ago

Thanks Anand. I hadn't thought about accidentally having these cut into the true spectroscopic sky budget. I'll take a look at that too.

Dustin or Anand, do either of you know if there's an easy way for me to check if the positions are "right"? I compared the OBS_X from the DB dump with the FIBERASSIGN_X, and they agreed at the few micron level, so maybe that's enough? I don't really know the provenance of the OBS_X field and was just looking for a match.

araichoor commented 2 years ago

Sorry I m not familiar with the "micrometer-space", so I cannot help here...

schlafly commented 2 years ago

You're right that these all have slitblock == -1 and are ignored here: https://github.com/desihub/fiberassign/blob/babf129d3e1612d67036e38ab5f2a5d25204b709/src/assign.cpp#L86 But broadly it's a good point that I could be missing some subtleties of this kind. Is it easy for you to do a "rerun" verifying that in fact we never break the science extensions? I'm also happy to be a guinea pig if you want me to run the code that you usually run to do that.

araichoor commented 2 years ago

Thanks for checking the C-code!

I ve actually replayed the fiberassign for the 1862 dark tiles designed with DTVER=1.1.1, using this stuck-sky-etc branch, and I ve assessed the reproducibility of the TARGETID-FIBER values for the FIBERASSIGN and POTENTIAL_ASSIGNMENTS extensions.

All tiles are correctly reproduced, except one, TILEID=1723, where there is only one fiber difference:

raichoor@cori07:~> cat /global/cfs/cdirs/desi/users/raichoor/fiberassign-rerun-main/20220322-master-stuck-sky-etc/001/fiberassign-001723.diff
# TILEID    EXTENSION   STATUS  TARGETID    FIBER   TARGET_RA   TARGET_DEC  FA_TYPE PRIORITY_INIT   PRIORITY    DESI_TARGET MWS_TARGET  BGS_TARGET  SCND_TARGET
001723  FIBERASSIGN miss    616088670469882501  433 40.30952086298794   5.511171157928472   4   -1  -1  4294967296  0   0   0
001723  FIBERASSIGN add 39627918166462315   433 40.31191798275266   5.508592998840867   1   3200    2   65537   0   0   0

I m investigating this case, that looks a bit weird to me...

araichoor commented 2 years ago

About this TILEID=1723: it looks like this one-fiber difference is real (I did a manual rerun with this stuck-sky-etc branch and with the master branch, with both runs having the same environment otherwise).

In the stuck-sky-etc branch run, the (stuck) FIBER=142 is assigned to SKY, which enables FIBER=433 to observe a TGT target; in the master branch run, FIBER=433 is assigned to SKY.

Apparently this difference happens at the very first round of assignment:

From the original run:

INFO: Loaded focalplane for time stamp 2021-12-27 17:00:40+00:00
INFO: Focalplane has 741 fibers that are stuck or broken
...
INFO: 500 of 700 stuck positioners land on good sky locations
...
INFO: Tile 1723: Start: SCIENCE: 0, STANDARD: 0, SKY: 500

From the rerun from this branch:

INFO: Loaded focalplane for time stamp 2021-12-27 17:00:40+00:00
INFO: Focalplane has 741 fibers that are stuck or broken
...
INFO: 517 of 720 stuck positioners land on good sky locations
...
INFO: Tile 1723: Start: SCIENCE: 0, STANDARD: 0, SKY: 501

I still need to understand how the stuck-sky-etc branch changes could do that...

I could perform tomorrow a rerun of the bright tiles, to see if there are other discrepancies, which could help to understand.

araichoor commented 2 years ago

And I confirm that running with the stuck-sky-etc branch, but with commenting the line 52 here (ie the addition of the ETC fibers to the stuck skies): https://github.com/desihub/fiberassign/blob/817adc2751373de8b6ab3d852bfdbfb5ac1968e8/py/fiberassign/stucksky.py#L50-L56 removes the difference with the original run.

araichoor commented 2 years ago

I ve narrowed the issue: it s in the call to xy2radec in stucksky.py, here: https://github.com/desihub/fiberassign/blob/817adc2751373de8b6ab3d852bfdbfb5ac1968e8/py/fiberassign/stucksky.py#L96-L99 For this TILEID=1723, with adding a print for the FIBER=152, LOCATION=363, I get with the stuck-sky-etc branch:

iloc=29 , stuck_loc=363 , stuck_x=3.6730653177297925 , stuck_y=-331.8065028942378 , loc_ra=40.43114211898239 , loc_dec=4.854429821079556

loc=363 : good_sky=False

and if I comment in the stuck-sky-etc branch the appending of ETC fibers to stuck_loc (see previous message):

iloc=29 , stuck_loc=363 , stuck_x=3.6730653177297925 , stuck_y=-331.8065028942378 , loc_ra=40.4311421870552 , loc_dec=4.854429820430742

loc=363 : good_sky=True

The (stuck_x, stuck_y) are the same in both cases, but not the (loc_ra, loc_dec), changing the good_sky status. I let @schlafly or @dstndstn investigate further, as it then deals with desimeter...

dstndstn commented 2 years ago

Ooh, that's quite the edge case. The xy/radec transformations have a "correction" step that comes from zeroing out a residual, so if you transform a different set of RA,Decs, you will get a different transformation function. It looks like here, by adding an extra RA,Dec, you have very slightly tweaked the transformation function, and thereby changed whether a stuck fiber lands on sky or not. Wow.

ashleyjross commented 2 years ago

Do I understand correctly that this only affected one assignment out of >1800 tiles (so, something like a 1 out of 8 million assignment)?

Obviously, it was great to pinpoint the issue and if there is an easy fix that would be good. But, if we have to live with it and those numbers are correct, I think this appears to be infrequent enough that we could.

sbailey commented 2 years ago

I don't think reproducibility of the ETC sky fibers is critical, but in the long run it might be less work to put that if/then time check into the code (don't do the calculation prior to X) to avoid the churn of "why isn't this fiberassign file reproducible? oh yeah, it's that issue..."

Tests: have you tried running this on old tile locations that we know had an ETC sky fiber contaminated by a star that didn't get outlier-rejected by the ETC? i.e. confirming that this code solves the problem that we have?

schlafly commented 2 years ago

Do you know where the measured ETC sky fiber brightnesses live? I have never dug into that data.

I did generate a tile and verify that the good/bad markings of the ETC sky fibers made sense. https://www.legacysurvey.org/viewer/?ra=280.0194&dec=36.3328&layer=ls-dr9&zoom=12&desifoot=280.0168,36.3306&catalog=tmpr4a5q20p (blue = good, red = bad)

Retrospective work will also be a bit annoying since if I use the usual rerun mechanism I'll get the old, bad positions of the ETC fibers.

araichoor commented 2 years ago

From the sv1-era, I had this piece of code to query the ETC database: https://github.com/desihub/fiberassign/blob/c9ef922dae336860959ae8cf1d2c07b1b8eb99a3/bin/sv1-summary.py#L134-L147 Maybe you can work out something from that.

araichoor commented 2 years ago

@dstndstn : great you found the explanation, thanks! @ashleyjross : yes, 1 fiber out of ~1800 dark tiles... for completeness, I ve launched the ~1600 bright tiles, we ll see. @sbailey @schlafly : if including a cutoff date, there exists a file exactly for that purpose: https://github.com/desihub/fiberassign/blob/master/py/fiberassign/data/cutoff-dates.yaml

Also, following Dustin s answer, I wondered if it could happen that with the stuck-sky-etc branch the (ra, dec) of the stuck fibers could slightly move but without changing their good_sky status (not sure that we would care, but I thought I d check anyway) I just checked: the (ra, dec) of the stuck fibers are exactly the same in all cases, except for that one fiber.

dkirkby commented 2 years ago

If you are looking for some of the (rare) cases when the ETC fibers probably had significant stellar contamination in the past, see the exposures listed in https://github.com/desihub/desietc/issues/7

schlafly commented 2 years ago

@anand, I've updated this PR to only do the stuck sky stuff for the ETC fibers after 2022-03-30, which would give us one week to merge. Does that look right? Mind rerunning a few tiles to make sure I did this right?

@dkirkby , I can verify, for example, that tile 3072, the first one listed in https://github.com/desihub/desietc/issues/7 , when redesigned with a rundate after 2022-03-30 gets most of its ETC sky fibers marked as problematic (7/20 are "good"). The assignments look reasonable to me as well; see, e.g., here, where blue circles are good sky fibers and red are bad. https://www.legacysurvey.org/viewer-desi/?ra=294.2550&dec=64.8414&catalog=tmp4a7vn0gi# I do not however see in the etc-00106015.json file any information on the individual sky fibers, only an overall average, so I'm not immediately able to say whether in fact the good sky fibers have lower sky values than the bad sky fibers, as seen by the ETC. My impression browsing the database schema is that this information is not available there either. Is there a sharper test I could be doing?

araichoor commented 2 years ago

Thanks @schlafly, I ll give a look later today. For what is worth: I finished re-running with the stuck-sky-etc branch the 1628 main bright tiles designed with DTVER=1.1.1. I get two tiles (21411, 23842) where there is a one-fiber difference with running with the master branch. I ve not investigated much, but it looks like the same situation as for TILEID=1723, where one stuck fiber becomes good_sky in stuck-sky-etc, and then frees up one fiber for a science target.

araichoor commented 2 years ago

@schlafly: The last commit with the cutoff date looks fine to me. Just, could you also add the modification to this line: https://github.com/desihub/fiberassign/blob/d9ca0618325f5cb3a946046bcb4d4e133729cd82/py/fiberassign/scripts/assign.py#L496 (that s not used by fba_launch, but for the sake of consistency in the code)

Tying that to rundate sounds right to me (we always have to decide what date value to use in fiberassign..).

One remark: for fba-on-the-fly, we choose as the rundate the latest change of the focal plane; so it will actually be used the first time the focal plane is changed after 2022-03-30 (could be few days later); nothing to worry about, but I just wanted to remind that.

Besides, I made few simple fba_launch test calls with varying the rundate, and it behaves as expected.

Last remark: my github alias is @araichoor :)

schlafly commented 2 years ago

I've made sure that all stuck_on_sky(...) invocations now use the rundate argument if available. Sorry for calling the wrong Anand, @araichoor!

I don't know how to do better tests of the effectiveness of the sky identification unless someone knows how to get sky information for individual ETC fibers, rather than the average of all of them, which seems to be what the ETC json files and database telemetry report.

dkirkby commented 2 years ago

There is a code snippet to reconstruct the sky levels for individual fibers here.

schlafly commented 2 years ago

@dkirkby, can you also tell me how to tell which fiber is which? I know the sky good/bad status corresponding to a set of DEVICE_LOC/PETAL_LOC, and the corresponding ra/dec. I don't know how to interpret the SKY attributes in terms of fiber locations. If I just assume SKYCAM0 = the first 10 ETC fibers, sorted by location, and SKYCAM1 = the second 10 ETC fibers, sorted by location, this does mark the two clear outliers among the ETC fibers---but that's not very impressive given that it marks 13/20 fibers as problematic, and only two are obviously bad.

image

schlafly commented 2 years ago

@dkirkby, do you know how to map SKY.flux to DEVICE_LOC and PETAL_LOC?

dkirkby commented 2 years ago

After some digging, I have established the following map for SKYCAM0 flux values:

flux index FP loc
0 9461
1 ??
2 6461
3 8461 or 8501
4 8501 or 8461
5 461
6 3461
7 7461
8 2461
9 1501

The corresponding map for SKYCAM1 will need to be determined from on-sky data, hopefully using this PR to flag when a bright star lands on an ETC fiber.

For a CSV file containing this map, as well as other useful quantities (DEV_ID, X_FP, Y_FP, ...) see here.

schlafly commented 2 years ago

We plan a fiberassign tag today or tomorrow in order to get tiles tagged for observation of tertiary program 1. I'd propose that we merge this before then, given that it has passed reproducibility checks. I've just pushed a minor commit to move the start date for inclusion of sky fiber updates to April 21.

It's possible that the study of which sky fiber is which reveals some major issue, but honestly, this is a pretty simple change, and I worry that any such issue would affect all stuck sky fibers we have used since survey start.

araichoor commented 2 years ago

@schlafly, just one thing: could you change the etc_stuck date to e.g. next Monday? because setting it to Apr. 21 would mean to use the new tag in fiberassign-on-the-fly as soon as tomorrow. but before that, I d like to make sure that the various recently merged PRs do not break the main survey tiles reproducibility (I expect it should be ok, or with very marginal non-reproducibility as with this PR, but I really prefer to be on the safe side here). and I won t have time for that tomorrow.

schlafly commented 2 years ago

Thanks. I've pushed this to 4/28.

araichoor commented 2 years ago

perfect, merging now. thanks!

araichoor commented 2 years ago

perfect, merging now. thanks!