LSSTDESC / ImageProcessingPipelines

Alert Production and Data Release image processing pipelines using the LSST Stack
BSD 3-Clause "New" or "Revised" License

Preparation for 2.1i processing at CC #81

Closed johannct closed 5 years ago

johannct commented 5 years ago

Following a discussion with @airnandez and @boutigny, we need to move forward with several improvements:

  1. Short term: we need to transfer the bias, dark, and BF kernel files to CC (and we need to learn how to use these), along with the full visits needed for skyflats. We need to confirm whether domeflats will be processed.
  2. ingestDriver should run locally on batch disk, via TMP_DIR, and move the registry db to OUT_DIR at the end of the job. The setup script needs to be modified accordingly.
  3. ingestDriver parallelism issues would be alleviated by the postgres output option; we need to check its status.
  4. calexps for each visit can be transferred to NERSC as soon as singleFrameDriver completes the visit. We need to study whether this is practical and worth it.
  5. multiBandDriver needs a revamp: createSubstream based on tract, and then, within each such substream, dispatch more streams based on patches. This will allow postprocessing at the tract level and trigger the transfer of a tract to NERSC once it is completed (it is assumed that transferring individual patches as they complete is not worth it). Furthermore, tract-level postprocessing will allow part, if not all, of the DPDD catalog generation process to be executed immediately.
  6. We probably need to plan for some time to check SkyBackground execution.
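
The tract-then-patch dispatch in item 5 could be sketched as follows, in plain Python rather than the actual workflow-engine API. The names `group_by_tract` and `TractTracker` and the tract and patch identifiers are hypothetical; the only point is that a tract-level step (postprocessing, transfer to NERSC) fires once every patch of that tract has finished.

```python
from collections import defaultdict


def group_by_tract(tract_patch_pairs):
    """Group (tract, patch) pairs into {tract: sorted list of patches}."""
    grouped = defaultdict(set)
    for tract, patch in tract_patch_pairs:
        grouped[tract].add(patch)
    return {tract: sorted(patches) for tract, patches in grouped.items()}


class TractTracker:
    """Track patch completion and report when a whole tract is done."""

    def __init__(self, patches_by_tract):
        # Patches that have not completed yet, per tract.
        self._pending = {t: set(p) for t, p in patches_by_tract.items()}

    def mark_done(self, tract, patch):
        """Record a finished patch; return True if the tract is now complete."""
        pending = self._pending[tract]
        pending.discard(patch)
        return not pending


# Hypothetical tract and patch identifiers, for illustration only.
pairs = [(1, "0,0"), (1, "0,1"), (2, "3,3")]
by_tract = group_by_tract(pairs)
tracker = TractTracker(by_tract)

completed_tracts = []
for tract, patches in by_tract.items():   # one "substream" per tract
    for patch in patches:                  # one stream per patch
        if tracker.mark_done(tract, patch):
            # Tract-level postprocessing or transfer would be triggered here.
            completed_tracts.append(tract)
```

In a real workflow the patch jobs would finish asynchronously, so `mark_done` would be called from whatever callback reports job completion rather than from a nested loop.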

boutigny commented 5 years ago

I am reactivating this issue as we are getting now very late to start Run2.1 processing at CC-IN2P3. So I will try to summarize where we are and what needs to be implemented. Please feel free to correct me if something is wrong or inaccurate:

wmwv commented 5 years ago

getting now very late to start Run1.2 processing

I presume you mean processing of Run 2.1i

boutigny commented 5 years ago

Yes. Thanks for pointing this out. I corrected it.

boutigny commented 5 years ago

I find 490 complete visits (189 CCDs) in /global/cscratch1/sd/desc/DC2/Run2.0i/Run2.1i/run201812/workpath/run/outputs, ignoring 1 which is tagged as "corrupted", so I guess that we can transfer those. The list is in /global/homes/b/boutigny/completeVisits.list
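
For reference, a count like this can be reproduced along the following lines. This is only a sketch: it assumes the outputs have already been reduced to (visit, detector) pairs, which is a hypothetical input format, and `complete_visits` is an illustrative name rather than an existing tool.

```python
from collections import defaultdict

N_CCDS_PER_VISIT = 189  # number of CCDs in a complete visit, as quoted above


def complete_visits(visit_detector_pairs, n_expected=N_CCDS_PER_VISIT):
    """Return the sorted visit ids that have n_expected distinct detectors."""
    detectors = defaultdict(set)
    for visit, detector in visit_detector_pairs:
        detectors[visit].add(detector)
    return sorted(v for v, d in detectors.items() if len(d) == n_expected)


# Toy input: visit 1 has all 189 detectors, visit 2 is missing one.
pairs = [(1, d) for d in range(189)] + [(2, d) for d in range(188)]
print(complete_visits(pairs))
```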

villarrealas commented 5 years ago

You should find a more complete list of files in /global/cscratch1/sd/desc/DC2/Run2.1i/run201812/run/outputs with a fixed directory structure. The "corrupted" subdirectories in some of these contain .fits files and checkpoint files that had mixed instance catalogs due to a workflow error, and can be safely ignored.

boutigny commented 5 years ago

Do you mean that all these visits are complete in the imSim production sense even if they don't have 189 CCDs?

heather999 commented 5 years ago

@villarrealas just want to make sure I understand - there are also corrupted subdirectories in the newer '/global/cscratch1/sd/desc/DC2/Run2.1i/run201812/run/outputs' area. These should definitely be ignored for the purposes of DM processing, I suspect.

villarrealas commented 5 years ago

@boutigny That is where it gets a little trickier. We are currently complete in production for this version of imSim. Some visits may be missing a handful of detectors due to issues with the current imSim which would require a not insignificant server overhaul. I like to think of it as a big, inconvenient cloud covering a single CCD at the moment. However, as all of these are going to have the incorrect PSF handling, it may be that we'll come back to these straggling detectors if we have some extra compute time down the road. For now, we have zero intent to do further processing on these visits with the current version of imSim.

@heather999 The corrupted subdirectories should be ignored for purposes of DM processing.

katrinheitmann commented 5 years ago

Hi Dominique,

A few answers to your questions. First, @villarrealas will point you to what has been finished for now with the imSim version we started with in December. That data can be copied over. In the meantime, he is waiting for a new imSim version to continue Run 2.1i. Now more details about the items above:

So something is happening on everything, but I agree that we all have to coordinate better. Thanks for the list!

boutigny commented 5 years ago

OK, thanks @villarrealas, so I will prepare a list of files to transfer for Fabio based on the directories in /global/cscratch1/sd/desc/DC2/Run2.1i/run201812/run/outputs and I will assume that the mechanism to trigger the automatic file transfer tool will be in place for the next set of runs using the new imSim version.

boutigny commented 5 years ago

I find 377007 files in 2665 clean (not corrupted) visits. The file list is in /global/homes/b/boutigny/toTransfer_20190220.list. Is that suitable for you, @airnandez? We will have to recreate the same directory structure at CC-IN2P3, starting after /global/cscratch1/sd/desc/DC2/Run2.1i/run201812/run/outputs
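
One way to derive the destination paths from such a file list is sketched below. The destination root is a made-up placeholder, `plan_transfer` is an illustrative name, and the sketch assumes that corrupted data sit under a path component named exactly `corrupted`; none of this describes the actual transfer tool.

```python
from pathlib import PurePosixPath

SRC_ROOT = PurePosixPath(
    "/global/cscratch1/sd/desc/DC2/Run2.1i/run201812/run/outputs"
)
# Placeholder destination; the real CC-IN2P3 location is not given here.
DEST_ROOT = PurePosixPath("/hypothetical/ccin2p3/Run2.1i/outputs")


def plan_transfer(file_list, src_root=SRC_ROOT, dest_root=DEST_ROOT):
    """Map each clean source file to a destination with the same sub-path."""
    plan = []
    for line in file_list:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the list
        src = PurePosixPath(line)
        # Path below the outputs directory; raises ValueError if outside it.
        rel = src.relative_to(src_root)
        if "corrupted" in rel.parts:
            continue  # assumption: corrupted data live in a "corrupted" dir
        plan.append((str(src), str(dest_root / rel)))
    return plan


# Hypothetical file names, for illustration only.
example = [
    f"{SRC_ROOT}/v1/a.fits\n",
    f"{SRC_ROOT}/v1/corrupted/b.fits\n",
    "\n",
]
for src, dest in plan_transfer(example):
    print(src, "->", dest)
```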

airnandez commented 5 years ago

Transfer done. See details in this issue.

heather999 commented 5 years ago

@boutigny @johannct @airnandez Both calibration data and ingested calibration products (bias and darks) are now available for Run2.1i at NERSC. The calibration data were generated using lsst_sims w_2019_08 + imSim v0.4.1 + desc_sim_utils master. The calibration products were produced using lsst_distrib w_2019_09 and were ingested into the typical CALIB directory structure, similar to what we used for Run1.2. The configurations used obs_lsst w_2019_09. You can transfer these tarballs over to IN2P3:

/global/projecta/projectdirs/lsst/production/DC2_ImSim/Run2.1i/calibration_data/calibration_data_run2.1i_bias_darks.tar.gz
/global/projecta/projectdirs/lsst/production/DC2_ImSim/Run2.1i/calibration_products/CALIB_run2.1i_bias_darks.tar.gz

Additional details about generation are available here: https://confluence.slac.stanford.edu/display/LSSTDESC/Run2.1i+calibration+data

johannct commented 5 years ago

@heather999 thanks! Did the ingestion include fixing the dates?

heather999 commented 5 years ago

@johannct No, I'm not aware of what needs to be done to fix the dates for ingestion. The date for the biases and darks is the default, 2022-01-01, and is likely set by Jim's script make_dark_frames.py, which is used for imSim. This is a little different from generating the calibration data for phoSim. @jchiang87 did I miss something in the wiki, or is there another bit that needs to be added to the directions?

jchiang87 commented 5 years ago

There's no need to adjust the dates in the calibRegistry.sqlite3 file for these data since the default mjd=59580 was used. It was only an issue for earlier phosim data when the instructions at the wiki (and phosim repo) didn't provide the mjd of the observation in the phosim instance catalog.
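
As a quick consistency check, the default mjd=59580 corresponds to the 2022-01-01 date mentioned earlier. A small sketch using only the Python standard library (the helper names are illustrative, not from any of the scripts discussed here):

```python
from datetime import date, timedelta

MJD_EPOCH = date(1858, 11, 17)  # MJD 0 falls on 1858-11-17


def mjd_to_date(mjd):
    """Convert a Modified Julian Date to a calendar date (day part only)."""
    return MJD_EPOCH + timedelta(days=int(mjd))


def date_to_mjd(d):
    """Convert a calendar date to the integer MJD at the start of that day."""
    return (d - MJD_EPOCH).days


print(mjd_to_date(59580))  # 2022-01-01
```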

airnandez commented 5 years ago

@boutigny @johannct @airnandez You can transfer these tarballs over to IN2P3:

/global/projecta/projectdirs/lsst/production/DC2_ImSim/Run2.1i/calibration_data/calibration_data_run2.1i_bias_darks.tar.gz
/global/projecta/projectdirs/lsst/production/DC2_ImSim/Run2.1i/calibration_products/CALIB_run2.1i_bias_darks.tar.gz

Data is transferred to CC-IN2P3. See details here.

johannct commented 5 years ago

closing this issue as it is superseded by #88