Closed weaverba137 closed 1 year ago
thanks for raising that point.
actually, one "issue" with the fiberassign files is that some columns were - or still are - bugged. those are not "critical" columns from an operations point of view, but columns with photometry information, which are propagated downstream in the spectro. products (coadd, redrock, etc) fibermap extension.
we did a first round of patching that fixed some columns (on Oct. 5, 2021), so before fuji / guadalupe were generated. but some columns still remain bugged, and we are likely to do a second round of patching soon (hopefully before iron).
could it make sense to do the following:
[...] fuji_date);
[...] edr folder in the tags/ folder, where we would copy, for the edr / fuji tiles only, the fiberassign files from an svn-checkout version at fuji_date.
that way, the column information in the fiberassign and in the spectro. products fibermap should be consistent (even if bugged).
and we could proceed similarly in the future for dr1 and later releases.
for sure, it would add some redundancy (as there would be a set of fiberassign files associated with each release), but it should be fine in terms of disk space, as a typical fiberassign file is 5MB.
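The per-release copy proposed above could be sketched as follows. This is only an illustration under stated assumptions: the function names `fiberassign_relpath` and `copy_release_tiles` are hypothetical, the on-disk layout (a three-digit sub-folder followed by `fiberassign-TILEID.fits.gz`) is taken from the path construction used elsewhere in this thread, and a plain file copy stands in for whatever svn operation would actually be used.

```python
import os
import shutil


def fiberassign_relpath(tileid):
    """Relative path of a fiberassign file, e.g. 080/fiberassign-080715.fits.gz."""
    tileidpad = "{:06d}".format(tileid)
    return os.path.join(tileidpad[:3], "fiberassign-{}.fits.gz".format(tileidpad))


def copy_release_tiles(src_root, tag_root, tileids):
    """Copy the fiberassign files of `tileids` from src_root into tag_root.

    Returns the total number of bytes copied, which gives a direct
    measure of the redundancy added by the per-release folder.
    """
    total = 0
    for tileid in tileids:
        rel = fiberassign_relpath(int(tileid))
        src = os.path.join(src_root, rel)
        dst = os.path.join(tag_root, rel)
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves the modification time
        total += os.path.getsize(dst)
    return total


# Hypothetical usage (paths are placeholders, not real locations):
# nbytes = copy_release_tiles("/path/to/svn_checkout", "/path/to/tags/edr", tileids)
# print("copied {:.1f} MB".format(nbytes / 1e6))
```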
This sounds reasonable. I think you should go ahead with a test.
thanks for the answer. I'll work on writing a script doing that.
for sanity/curiosity, I just checked: the most recently modified fuji fiberassign file currently in the svn directory dates from Oct. 5, 2021 (i.e. from the patching), so before the fuji launch (Jan. 24, 2022).
so there is no need for this edr / fuji round to revert the svn to an older version, right?
import os
import subprocess
import numpy as np
from astropy.table import Table

svn_dir = "/global/cfs/cdirs/desi/target/fiberassign/tiles/trunk"
d = Table.read("/global/cfs/cdirs/desi/spectro/redux/fuji/tiles-fuji.fits")
tileids = np.unique(d["TILEID"])
timestamps = np.zeros(len(tileids), dtype=object)
for i, tileid in enumerate(tileids):
    tileidpad = "{:06d}".format(tileid)
    fn = os.path.join(svn_dir, tileidpad[:3], "fiberassign-{}.fits.gz".format(tileidpad))
    # the 6th field of `ls -l --full-time` is the modification date (YYYY-MM-DD)
    tmpstr = subprocess.Popen("ls -l --full-time {}".format(fn), stdout=subprocess.PIPE, shell=True).communicate()[0].strip().decode("utf-8")
    timestamps[i] = tmpstr.split()[5]
np.unique(timestamps)
returns:
array(['2020-12-18', '2020-12-19', '2021-01-01', '2021-01-02',
'2021-01-05', '2021-01-11', '2021-01-29', '2021-02-02',
'2021-02-04', '2021-02-23', '2021-03-04', '2021-03-16',
'2021-04-30', '2021-05-05', '2021-05-11', '2021-05-12',
'2021-05-13', '2021-10-05'], dtype=object)
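The same check can be done without shelling out to `ls`. The sketch below is a variant, not the script that was actually run: `latest_mtime_date` and `unchanged_since` are hypothetical helper names, and the file modification time on disk is used only as a proxy for the last change (it reflects when the checkout was updated, so `svn log` remains the authoritative source).

```python
import datetime
import os


def latest_mtime_date(filenames):
    """Most recent modification date (local time) among the given files."""
    return max(
        datetime.date.fromtimestamp(os.path.getmtime(fn)) for fn in filenames
    )


def unchanged_since(filenames, launch_date):
    """True if every file was last modified strictly before launch_date."""
    return latest_mtime_date(filenames) < launch_date


# Hypothetical usage, with `fns` the list of fiberassign file paths built as above:
# unchanged_since(fns, datetime.date(2022, 1, 24))
```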
and a comment (based on the on-going "Bug in lsdr9-photometry files" email thread discussion):
in the current edr release plan, we have two versions of the fiberassign files:
https://desidatamodel.readthedocs.io/en/latest/DESI_SPECTRO_DATA/NIGHT/EXPID/fiberassign-TILEID.html
https://desidatamodel.readthedocs.io/en/latest/DESI_TARGET/fiberassign/tiles/TILES_VERSION/TILEXX/fiberassign-TILEID.html
I confirm that the fuji ztiles*fits files are based on the svn fiberassign files, i.e. the patched ones.
@akremin confirmed that from the fuji code/logs point-of-view, and I did check the files.
(there remain a few inconsistencies, but that's another issue, not related to the patching).
those two fiberassign file versions have some column differences (due to the patching of some photometric columns). I'm not sure if it's possible to not release the ones in the raw data directory, is it? if we have to release the two sets, then we may want to mention this discrepancy somewhere (I don't know what the best place for that is).
I think the datamodel for the raw data version of the fiberassign files is the place to mention that those were the files actually used for the observations. It should then reference the other set as what is used for spectro pipeline production, and note that those files include patches that correct values needed for analysis but don't impact the original observations.
thanks for that suggestion.
another question, @sbailey: should the script I'll write to create such a fiberassign "tag" folder go in desispec? or elsewhere?
@araichoor if this is a one-off script used just for making this tag, but not for general usage in making future tags, then let's put it in git fiberassign/etc/ for the record.
actually I was thinking to make it general, for future releases. the two release-dependent arguments simply being:
In that case I think it should go into git fiberassign/bin, or maybe still git fiberassign/etc where we sometimes put "scripts to use occasionally but not as a standard part of using this package"; desispec/etc has several of those. Either way, in the fiberassign repo not desispec.
bringing back to this thread the discussion with @sbailey and @weaverba137 for better book-keeping:
in short:
for both fuji and guadalupe, I suggest to use for the tag the revision 1120 from Jan. 23, 2022 (https://desi.lbl.gov/trac/changeset/1120/data/tiles/trunk).
any comments are welcome!
in detail, summarizing offline discussions:
[...] fuji. Similarly for guadalupe. They could even be the same tag. It does not matter if it contains additional tiles.
so I wrote a small script to recover all the revisions for the tiles of a given production; I'll submit a PR in fiberassign for that.
for fuji and guadalupe: [...]
so we could pick for tagging any revision between Oct. 2021 and Oct. 2022, as the fuji and guadalupe fiberassign files were not changed during that time.
I suggest using revision 1120 from Jan. 23, 2022, i.e. just before the processing was launched.
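The reasoning above (any revision works as long as no tile of the production changed between it and the processing) can be made explicit. The sketch below is not the script mentioned in this thread: `tag_revision_window` is a hypothetical helper, and it assumes the per-tile lists of revisions at which each fiberassign file changed have already been recovered (for instance from `svn log`).

```python
import bisect


def tag_revision_window(changes_by_tile, ref_rev):
    """Range of revisions that give the same files as ref_rev for all tiles.

    changes_by_tile: dict mapping TILEID -> revisions at which that tile's
                     fiberassign file was committed or changed.
    ref_rev:         revision known to match what the processing used.

    Returns (lo, hi): any revision in [lo, hi] yields identical files for
    these tiles; hi is None if no tile changed after ref_rev.
    """
    lo, hi = 0, None
    for tileid, revs in changes_by_tile.items():
        revs = sorted(revs)
        k = bisect.bisect_right(revs, ref_rev)  # number of changes <= ref_rev
        if k == 0:
            raise ValueError("tile {} does not exist at r{}".format(tileid, ref_rev))
        lo = max(lo, revs[k - 1])  # last change at or before ref_rev
        if k < len(revs):
            before_next = revs[k] - 1  # last revision before the next change
            hi = before_next if hi is None else min(hi, before_next)
    return lo, hi


# Illustrative, made-up revision numbers (not the real svn history):
# tag_revision_window({1: [100, 900], 2: [150, 900, 1500]}, ref_rev=1120)
```

With the made-up history in the comment, the window would be revisions 900 to 1499, so a reference revision of 1120 sits safely inside it.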
@araichoor, this sounds fine to me. It's useful for database loading that the same tag will work for fuji and guadalupe.
Have we decided on a name for the tag?
@araichoor sounds good. Thanks for double checking all of this.
Let's use tag 0.5 (or any 0.N tag if you have a favorite number). Iron used tag 1.1 (1.0 also exists, but was superseded by 1.1 which covers all of the tiles used by final Iron).
Can I let you both decide the tag number and create it?
(btw, for correctness, I've added a minor correction to my previous message, as TILEID=80715 fiberassign files were actually re-committed on Feb. 3, 2023 -- this is meaningless for that discussion).
For the record, I will take on the task of actually creating the tag. It probably won't happen today, but hopefully by Monday.
The final tag command was:
svn copy --revision 1120 -m "Tagging tiles/0.5 (fuji/guadalupe)." ${SVN_URL}/data/tiles/trunk ${SVN_URL}/data/tiles/tags/0.5
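For future releases, the same command could be assembled programmatically. This is a minimal sketch: `svn_tag_command` is a hypothetical helper, the repository URL is a placeholder standing in for `${SVN_URL}`, and the function only builds the argument list, so nothing is committed unless the result is explicitly passed to `subprocess.run`.

```python
import shlex


def svn_tag_command(svn_url, revision, tag, note):
    """Build the `svn copy` argument list that tags data/tiles/trunk at a revision."""
    src = "{}/data/tiles/trunk".format(svn_url)
    dst = "{}/data/tiles/tags/{}".format(svn_url, tag)
    msg = "Tagging tiles/{} ({}).".format(tag, note)
    return ["svn", "copy", "--revision", str(revision), "-m", msg, src, dst]


# Placeholder URL; substitute the real repository root.
cmd = svn_tag_command("https://example.org/svn", 1120, "0.5", "fuji/guadalupe")
print(shlex.join(cmd))  # inspect the command before running it
# To actually create the tag:
# import subprocess; subprocess.run(cmd, check=True)
```

Passing a list rather than a shell string avoids quoting problems with the commit message.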
I think all the needs of this issue are satisfied. Please reopen if we've forgotten something.
When loading the redshift database (desispec.database.redshift), the fiberassign files are currently loaded from trunk. If trunk corresponds exactly to any tag that would have been made for edr/fuji or dr1, then that's fine. If not, we should identify which tag(s) to use, creating them if necessary. This issue was split out from #1819.
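Whether trunk "corresponds exactly" to a tag for a given set of tiles can be checked by comparing file contents. The sketch below is one possible approach, not part of desispec: `file_digest` and `differing_files` are hypothetical helpers, and both trees are assumed to be local checkouts sharing the same relative layout.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 hex digest of a file, read in chunks to limit memory use."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.hexdigest()


def differing_files(root_a, root_b, relpaths):
    """Relative paths that are missing from either tree or differ in content."""
    diffs = []
    for rel in relpaths:
        a = os.path.join(root_a, rel)
        b = os.path.join(root_b, rel)
        if not (os.path.isfile(a) and os.path.isfile(b)):
            diffs.append(rel)
        elif file_digest(a) != file_digest(b):
            diffs.append(rel)
    return diffs


# Hypothetical usage (placeholder paths):
# differing_files("/path/to/tiles/trunk", "/path/to/tiles/tags/0.5", relpaths)
```

An empty result would mean that loading from trunk and loading from the tag are equivalent for those tiles.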