LSSTDESC / ComputingInfrastructure

Gathering place for CI - Computing and Infrastructure - issues

Set up a gen3 repo at NERSC with DC2 data for analysis WGs #62

Open jchiang87 opened 2 years ago

jchiang87 commented 2 years ago

Several working groups, e.g., TD, LSS, etc., would like to start using the Gen3 software on DC2 for various image processing work. We should set up a new Gen3 repo, using the same calibration data products that the project is using for DP0.2 and ingest the DC2 data that these groups will need.

Most groups will want to have the visit level processing outputs already available, so we should run the Gen3 pipeline to that point for the data they need.

Requests for specific DC2 datasets, e.g., requests for particular tracts covering some time period, can be made as comments to this issue.

ixkael commented 2 years ago

At the moment in LSS we mostly want to set up the infrastructure for using SSI (for example, to model the selection function, test the impact of catalog and sky cuts, and mitigate spurious clustering), so any subset of tracts would work for testing purposes. Subsequently we would want to run SSI over a significant fraction of the DC2 footprint (in coadds). Ultimately we want to develop methods to model how much SSI we need, and where, in order to have the systematics under control for various science goals.

BrunoSanchez commented 2 years ago

For DIA processing of Run2.2i I have used tracts 4430, 4431, 4432, 4638, 4639, and 4640. Any of those would be useful for Gen3 DIA re-processing of DC2 using the y1-y5 images. Even a single tract, using y1 for templates and at least y2 data (or y3, y4, or y5; any one of them would do), could yield great comparisons with the Gen2 processing results.

jchiang87 commented 2 years ago

A short progress report:

I've set up repos using DP0.2 calibs and refcats, for Run2.2i here:

/global/cfs/cdirs/lsst/production/gen3/DC2/Run2.2i/repo

and for Run3.1i here:

/global/cfs/cdirs/lsst/production/gen3/DC2/Run3.1i/repo

These repos have all of the DC2 Y1-Y5 raw data ingested.

The visit-level processing for Y1 Run2.2i tract 4430 is running now, and I expect to have processing through coadds available this week for people to have a look at and perhaps start SSI studies. Next, I'll move to DIA processing for Y2 for 4430 using Y1 templates, to enable comparisons to the Gen2 results.

jchiang87 commented 2 years ago

The coadd processing for Run2.2i Y1 tract 4430 is finished and available at /global/cfs/cdirs/lsst/production/gen3/DC2/Run2.2i/repo. Here are the summed exposure time maps produced by healsparse:

[image: Run2 2i_Y1_4430_exptime_maps]

jchiang87 commented 2 years ago

The coadds for the Run2.2i Y1 data for tracts 4430-4432 and 4638-4640 are available:

[image: Run2 2i_Y1_DIA_exptime_maps]

jchiang87 commented 1 year ago

In the first half of 2022, I ran the visit-level processing of all of the Run3.1i data. The outputs can be accessed from the Run3.1i repo,

/global/cfs/cdirs/lsst/production/gen3/DC2/Run3.1i/repo

and the chained collection names can be found using the butler:

[cori04] butler query-collections /global/cfs/cdirs/lsst/production/gen3/DC2/Run3.1i/repo \*sfp_ddf_visits\* | grep CHAINED
u/descdm/sfp_ddf_visits_part_00                                   CHAINED    
u/descdm/sfp_ddf_visits_part_00_visit_tables                      CHAINED    
u/descdm/sfp_ddf_visits_part_01                                   CHAINED    
u/descdm/sfp_ddf_visits_part_01_visit_tables                      CHAINED    
u/descdm/sfp_ddf_visits_part_02                                   CHAINED    
u/descdm/sfp_ddf_visits_part_02_visit_tables                      CHAINED    
u/jchiang8/sfp_ddf_visits_part_03                                 CHAINED    
u/jchiang8/sfp_ddf_visits_part_03_visit_tables                    CHAINED    
u/jchiang8/sfp_ddf_visits_part_04                                 CHAINED    
u/jchiang8/sfp_ddf_visits_part_04_visit_tables                    CHAINED    
u/jchiang8/sfp_ddf_visits_part_05                                 CHAINED    
u/jchiang8/sfp_ddf_visits_part_05_visit_tables                    CHAINED    
u/jchiang8/sfp_ddf_visits_part_06                                 CHAINED    
u/jchiang8/sfp_ddf_visits_part_06_visit_tables                    CHAINED    
u/jchiang8/sfp_ddf_visits_part_07                                 CHAINED    
u/jchiang8/sfp_ddf_visits_part_07_visit_tables                    CHAINED    
u/jchiang8/sfp_ddf_visits_part_08                                 CHAINED    
u/jchiang8/sfp_ddf_visits_part_08_visit_tables                    CHAINED    
u/jchiang8/sfp_ddf_visits_part_09                                 CHAINED    
u/jchiang8/sfp_ddf_visits_part_09_visit_tables                    CHAINED    
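
Each processing part appears twice in the listing above: once under its own name and once with a `_visit_tables` suffix. If it helps, here is a minimal Python sketch that pairs them up from the text output, assuming the two-column `name  CHAINED` format shown above. The function name `pair_collections` and the abbreviated sample listing are illustrative only, not part of the butler tooling.

```python
def pair_collections(listing, suffix="_visit_tables"):
    """Map each CHAINED collection to its companion <name><suffix>, or None if absent."""
    names = []
    for line in listing.splitlines():
        fields = line.split()
        # Keep only rows of the form "<collection name> CHAINED".
        if len(fields) == 2 and fields[1] == "CHAINED":
            names.append(fields[0])
    listed = set(names)
    return {
        name: (name + suffix if name + suffix in listed else None)
        for name in names
        if not name.endswith(suffix)
    }


# Abbreviated copy of the listing above, used here only as sample input.
sample = """\
u/descdm/sfp_ddf_visits_part_00                                   CHAINED
u/descdm/sfp_ddf_visits_part_00_visit_tables                      CHAINED
u/jchiang8/sfp_ddf_visits_part_03                                 CHAINED
u/jchiang8/sfp_ddf_visits_part_03_visit_tables                    CHAINED
"""

pairs = pair_collections(sample)
for base, tables in pairs.items():
    print(base, "->", tables)
```

In practice the `sample` string would be replaced by the captured output of the `butler query-collections` command.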

I also generated coadds at 2-year depth for patches 35, 36, 42, and 43 in tract 4848 (these are the four full patches in the SW corner of the DDF). The warps and coadds for these data can be obtained from the collections found with

[cori04] butler query-collections /global/cfs/cdirs/lsst/production/gen3/DC2/Run3.1i/repo \*coadds_ddf_y1-y2_4848\* | grep CHAINED
u/jchiang8/coadds_ddf_y1-y2_4848                                  CHAINED    
u/jchiang8/coadds_ddf_y1-y2_4848_assembleCoadd                    CHAINED
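
For anyone who prefers the Python interface, the following is a rough sketch of how the coadds for those four patches might be looked up. It is written under several assumptions that are not stated in this thread: that the coadd dataset type is named `deepCoadd`, that the skymap is registered as `DC2`, and that the LSST stack (`lsst.daf.butler`) is set up in the environment. The helper names `coadd_where` and `find_coadds` are hypothetical.

```python
def coadd_where(tract, patches, skymap="DC2"):
    """Build a butler query expression selecting the given patches of one tract.

    The default skymap name "DC2" is an assumption; check the repo before relying on it.
    """
    patch_list = ", ".join(str(p) for p in sorted(patches))
    return f"skymap = '{skymap}' AND tract = {tract} AND patch IN ({patch_list})"


def find_coadds(repo, collection, where, dataset_type="deepCoadd"):
    """Return dataset references matching `where` in `collection`.

    Needs the LSST stack; the dataset type name "deepCoadd" is an assumption.
    """
    from lsst.daf.butler import Butler  # imported lazily so the helper above works without the stack

    butler = Butler(repo, collections=[collection])
    refs = butler.registry.queryDatasets(
        dataset_type, collections=[collection], where=where, findFirst=True
    )
    return list(refs)


where = coadd_where(4848, [35, 36, 42, 43])
print(where)
```

With the stack available, a call such as `find_coadds("/global/cfs/cdirs/lsst/production/gen3/DC2/Run3.1i/repo", "u/jchiang8/coadds_ddf_y1-y2_4848", where)` would then return the matching references, provided the assumed names are correct for this repo.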