LSSTDESC / DC2-production

Configuration, production, validation specifications and tools for the DC2 Data Set.
BSD 3-Clause "New" or "Revised" License

DC2 Run2.2i DR2 processing at CC #402

Closed johannct closed 3 years ago

johannct commented 3 years ago

This production used the same list of visits as the DR2 production at NERSC; in particular, it uses the opsim flag to populate the DDF region with only the visits of the WFD cadence.

The workdir at CC is /sps/lssttest/dataproducts/desc/DC2/Run2.2i/v19.0.0-v1/ and the data repository is rerun/run2.2i-coadd-wfd-dr2-v1 under this workdir, plus the upstream data repositories chained as usual by the pipeline.

The parquet files are under dpdd/run2.2i-wfd-dr2. Many patches are missing in the object and metacal catalogs, due to the shallowness of DR2; several jobs actually fail in metacal for this reason (as far as I could check):

object catalog:

johannct commented 3 years ago

CPU usage and plots:

johannct commented 3 years ago

Memory usage

heather999 commented 3 years ago

@johannct It sounds like I should request transfer of this processing to NERSC. I'll add a comment to the ongoing thread on the IPP repo with Fabio.

jchiang87 commented 3 years ago

@johannct Do you have the data that went into those plots available somewhere? I expect that having used the pipe drivers means that we can't get info at the pipe task level, but it still might be useful to look at the pipe driver numbers, especially if each entry on those histograms can be associated with specific filter-tract-patch combinations.

johannct commented 3 years ago

@jchiang87 What I used is extremely coarse: data_from_log.bash.txt just runs over the log files and extracts basic information. I need to clean the resulting outputs, as some logs are anomalous for several reasons; ideally all this should be run after the driver execution and after some sanity checks. Then I just load the output and make some plots. I will send the notebook by email. There are monitoring outputs in each log directory, put in place by Bastien Gounon, but they seem to show only one process, so they are not very useful. Finer-grained analyses would have to partition a given log per pid... doable but unwieldy. On the other hand, my output does provide tract/filter/patch info and the path to the log file. I'll provide the cleaned outputs by email as well.
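For reference, partitioning a log per pid could look roughly like the sketch below. It assumes a hypothetical line format in which each log line starts with the pid of the process that wrote it (e.g. `12345 INFO ...`); the actual driver logs may be laid out differently, and `partition_by_pid` is an illustrative name, not part of any existing tool.

```python
import re
from collections import defaultdict

# ASSUMPTION: every log line begins with the writing process's pid followed
# by whitespace, e.g. "12345 INFO ...". This format is hypothetical.
PID_PREFIX = re.compile(r"^(\d+)\s+(.*)$")


def partition_by_pid(lines):
    """Group log lines by their leading pid.

    Lines without a pid prefix (e.g. traceback continuations) are attached
    to the most recently seen pid. Lines appearing before any pid is seen
    are dropped.
    """
    by_pid = defaultdict(list)
    current = None
    for line in lines:
        line = line.rstrip("\n")
        match = PID_PREFIX.match(line)
        if match:
            current = int(match.group(1))
            by_pid[current].append(match.group(2))
        elif current is not None:
            by_pid[current].append(line)
    return dict(by_pid)
```

Each per-pid list could then be scanned separately for timing or memory lines, at the cost of one more pass over every log.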

heather999 commented 3 years ago

@johannct As noted on Slack, there appear to be 4 tracts missing: 3641, 4035, 4226, 4852. I don't see them in run2.2i-coadd-wfd-dr2-v1/deepCoadd-results/merged or in the associated u and grizy areas either. Any idea what happened to those tracts?
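A check of this kind can be sketched as follows. It assumes that the merged directory contains one subdirectory per tract, named by the tract id; `missing_tracts` is a hypothetical helper, and the list of expected tracts has to come from elsewhere (for example, the NERSC DR2 production).

```python
from pathlib import Path


def missing_tracts(merged_dir, expected):
    """Return the expected tract ids that have no subdirectory in merged_dir.

    ASSUMPTION: merged_dir holds one subdirectory per tract, named by the
    integer tract id. Non-numeric entries and plain files are ignored.
    """
    root = Path(merged_dir)
    present = {
        int(p.name) for p in root.iterdir() if p.is_dir() and p.name.isdigit()
    }
    return sorted(set(expected) - present)
```

Running it against deepCoadd-results/merged and then against the per-filter areas would show whether a tract is absent everywhere or only from the merged outputs.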

johannct commented 3 years ago

This is now done.