LSSTDESC / ImageProcessingPipelines

Alert Production and Data Release image processing pipelines using the LSST Stack

Perform visit-level forced photometry based on coadd object positions #87

Closed wmwv closed 5 years ago

wmwv commented 5 years ago

I suggest that we run one more step at the end of the coadd pipeline:

Right now we run the forced photometry on the per-filter coadds -- this provides the measurements for the Object table.

The next step is to run the forced photometry on each individual visit image, based on the objects defined from the coadds -- these measurements are what will fill the DPDD ForcedSource table.

I suggest this is worth doing on the current Run 1.2p and Run 1.2i. It could be done at either IN2P3 or NERSC.

boutigny commented 5 years ago

I agree that we need to do this at some point but I would like to have DM's opinion (maybe @TallJimbo) on the status of the code. When I tried it on CFHT ~1 year ago it was not yet really usable.

johannct commented 5 years ago

Trying forcedPhotCcd.py . --rerun coadd-v4:forced-test --id visit=179996 filter=u^g^r^i^z^y based on the 1.2p setup (w39 etc.), I get crashes with the following message:

forcedPhotCcd FATAL: Failed on dataId=DataId(initialdata={'visit': 179996, 'filter': 'u', 'raftName': 'R01', 'detectorName': 'S10', 'detector': 3, 'tract': 5065}, tag=set()): FatalAlgorithmError: CModel forced measurement currently requires the measurement image to have the same Wcs as the reference catalog (this is a temporary limitation).
Traceback (most recent call last):
  File "/cvmfs/sw.lsst.eu/linux-x86_64/lsst_distrib/w_2018_39/stack/miniconda3-4.5.4-fcd27eb/Linux64/pipe_base/16.0-12-g726f8f3+6/python/lsst/pipe/base/cmdLineTask.py", line 388, in __call__
    result = self.runTask(task, dataRef, kwargs)
  File "/cvmfs/sw.lsst.eu/linux-x86_64/lsst_distrib/w_2018_39/stack/miniconda3-4.5.4-fcd27eb/Linux64/pipe_base/16.0-12-g726f8f3+6/python/lsst/pipe/base/cmdLineTask.py", line 447, in runTask
    return task.runDataRef(dataRef, **kwargs)
  File "/cvmfs/sw.lsst.eu/linux-x86_64/lsst_distrib/w_2018_39/stack/miniconda3-4.5.4-fcd27eb/Linux64/meas_base/16.0-13-gd9b1b71+8/python/lsst/meas/base/forcedPhotImage.py", line 151, in runDataRef
    forcedPhotResult = self.run(measCat, exposure, refCat, refWcs, exposureId=exposureId)
  File "/cvmfs/sw.lsst.eu/linux-x86_64/lsst_distrib/w_2018_39/stack/miniconda3-4.5.4-fcd27eb/Linux64/meas_base/16.0-13-gd9b1b71+8/python/lsst/meas/base/forcedPhotImage.py", line 168, in run
    self.measurement.run(measCat, exposure, refCat, refWcs, exposureId=exposureId)
  File "/cvmfs/sw.lsst.eu/linux-x86_64/lsst_distrib/w_2018_39/stack/miniconda3-4.5.4-fcd27eb/Linux64/meas_base/16.0-13-gd9b1b71+8/python/lsst/meas/base/forcedMeasurement.py", line 355, in run
    beginOrder=beginOrder, endOrder=endOrder)
  File "/cvmfs/sw.lsst.eu/linux-x86_64/lsst_distrib/w_2018_39/stack/miniconda3-4.5.4-fcd27eb/Linux64/meas_base/16.0-13-gd9b1b71+8/python/lsst/meas/base/baseMeasurement.py", line 283, in callMeasure
    self.doMeasurement(plugin, measRecord, *args, **kwds)
  File "/cvmfs/sw.lsst.eu/linux-x86_64/lsst_distrib/w_2018_39/stack/miniconda3-4.5.4-fcd27eb/Linux64/meas_base/16.0-13-gd9b1b71+8/python/lsst/meas/base/baseMeasurement.py", line 302, in doMeasurement
    plugin.measure(measRecord, *args, **kwds)
  File "/cvmfs/sw.lsst.eu/linux-x86_64/lsst_distrib/w_2018_39/stack/miniconda3-4.5.4-fcd27eb/Linux64/meas_modelfit/16.0-13-g4c33ca5+8/python/lsst/meas/modelfit/cmodel/cmodelContinued.py", line 105, in measure
    "CModel forced measurement currently requires the measurement image to have the same"
lsst.pex.exceptions.wrappers.FatalAlgorithmError: CModel forced measurement currently requires the measurement image to have the same Wcs as the reference catalog (this is a temporary limitation).
forcedPhotCcd WARN: Could not persist metadata for dataId=DataId(initialdata={'visit': 179996, 'filter': 'u', 'raftName': 'R01', 'detectorName': 'S10', 'detector': 3, 'tract': 5065}, tag=set()): Template is not defined for the forcedPhotCcd_metadata dataset type, it must be set before it can be used.
johannct commented 5 years ago

For the final WARN message, I can see

forcedPhotCcd_metadata:
    persistable: PropertySet
    storage: YamlStorage
    python: lsst.daf.base.PropertySet
    tables: raw
    template: ''

in /cvmfs/sw.lsst.eu/linux-x86_64/lsst_distrib/w_2018_39/stack/miniconda3-4.5.4-fcd27eb/Linux64/obs_base/16.0-14-g71e547a+4/policy/datasets.yaml

boutigny commented 5 years ago

This is what I was worried about in my previous comment. Apparently we cannot run CModel in forcedPhotCcd yet. Here is a comment extracted from cmodelContinued.py:

The CModel algorithm currently cannot be run in forced mode when the measurement WCS is different
    from the reference WCS (as is the case in CCD forced photometry).  This is a temporary limitation
    that will be addressed on DM-5405.
johannct commented 5 years ago

@jchiang87 Jim, this is also something you wanted to inquire about, I reckon.

wmwv commented 5 years ago

We can skip running CModel for the forced photometry. The DPDD only calls for:

objectId
ccdVisitId
psFlux
psFluxErr
psDiffFlux
psDiffFluxErr
flags
wmwv commented 5 years ago

As a minor side point, forcedPhotCcd.py . --rerun coadd-v4:forced-test --id visit=179996 filter=u^g^r^i^z^y

can just be

forcedPhotCcd.py . --rerun coadd-v4:forced-test --id visit=179996 filter=u

because visit 179996 is a u-band visit. In fact, you don't have to specify the filter at all; --id visit=179996 is sufficient.

That said, I do find it useful to specify both visit and filter in test scripts, to remind myself which filter I'm looking at.
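
For reference, a quick way to double-check which filter a given visit corresponds to is to ask the butler directly. This is only a minimal sketch under assumed Gen2 conventions (the repository path is a placeholder), not a required step:

# Minimal sketch: query the data registry for the filter of visit 179996.
# The repo path "rerun/coadd-v4" is a placeholder; point it at your rerun.
from lsst.daf.persistence import Butler

butler = Butler("rerun/coadd-v4")
filters = butler.queryMetadata("raw", ["filter"], dataId={"visit": 179996})
print(filters)  # expected to print something like ['u'] for a u-band visit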

johannct commented 5 years ago

Indeed, filter can be left out entirely. On the other hand, before the crash I see a lot of:

forcedPhotCcd WARN: Skipping reference 22276114168708587 (child of 22276114168678429) with bad Footprint
heather999 commented 5 years ago

Is it time to post this to the #dm-lsstCam channel to get some additional information from DM? Jim Bosch is out on paternity leave, so we need to depend on others to offer details.

wmwv commented 5 years ago

forcedPhotCcd WARN: Skipping reference 22276114168708587 (child of 22276114168678429) with bad Footprint

That's likely telling you that the coadd had an object that is on the edge of the detector at the per-visit level. The footprint is bad because it overlaps an edge (or otherwise has too many masked pixels).

wmwv commented 5 years ago

The obs_base setting of:

forcedPhotCcd_metadata:
    persistable: PropertySet
    storage: YamlStorage
    python: lsst.daf.base.PropertySet
    tables: raw
    template: ''

means the template value needs to be defined in an obs_* package before it can be used. E.g., in obs_subaru, for HSC it's defined here:

https://github.com/lsst/obs_subaru/blob/master/policy/HscMapper.yaml#L458

  forcedPhotCcd_metadata:
    template: '%(pointing)05d/%(filter)s/tract%(tract)d/forcedPhotCcd_metadata/%(visit)07d-%(ccd)03d.yaml'

We need to define the same kind of template for obs_lsst.

johannct commented 5 years ago

@wmwv that would make sense for this particular visit, but it is quite noisy

wmwv commented 5 years ago

@wmwv that would make sense for this particular visit, but it is quite noisy

I would suggest filtering them out, then. You can parse the log files with some greps to take out common lines that you don't care about.
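
As an illustration of that kind of filtering, here is a minimal Python sketch (a plain grep -v on the log would do just as well); the log file name is hypothetical:

# Minimal sketch: drop the benign "bad Footprint" WARN lines from a log so
# that the remaining warnings and errors stand out. Equivalent to grep -v.
import sys

IGNORE = ("with bad Footprint",)  # substrings of lines we do not care about

log_path = sys.argv[1] if len(sys.argv) > 1 else "forcedPhotCcd.log"
with open(log_path) as log:
    for line in log:
        if not any(pattern in line for pattern in IGNORE):
            print(line, end="")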

wmwv commented 5 years ago

To record on this thread a suggestion I made to @johannct privately:

Remove the line that loads the cModel profile in the following config:

https://github.com/lsst/obs_lsst/blob/master/config/forcedPhotCcd.py

config.load(os.path.join(getPackageDir("obs_lsst"), "config", "cmodel.py"))
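
Concretely, the suggestion is just to delete or comment out that one line; a sketch of the relevant part of the edited file (only the quoted config.load call comes from the actual file, the comments are added here):

# obs_lsst config/forcedPhotCcd.py (sketch): the CModel profile is no longer
# loaded, because CModel cannot yet run in CCD-level forced photometry when
# the measurement WCS differs from the reference WCS (DM-5405).
#
# config.load(os.path.join(getPackageDir("obs_lsst"), "config", "cmodel.py"))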

TallJimbo commented 5 years ago

A few quick thoughts:

I probably won't be able to follow up on that, whatever it is, but some combination of @RobertLuptonTheGood, @yalsayyad, and @laurenam probably could.

johannct commented 5 years ago

OK, switching off CModel in the config and adding a template for forcedPhotCcd_metadata in lsstCamMapper.yaml seems to do the job.

johannct commented 5 years ago

Filed https://jira.lsstcorp.org/browse/DM-18185 for the chattiness of the logs.

johannct commented 5 years ago

Argh, one more serious warning, I think:

forcedPhotCcd WARN: Could not persist metadata for dataId=DataId(initialdata={'visit': 179996, 'filter': 'u', 'raftName': 'R01', 'detectorName': 'S01', 'detector': 1, 'tract': 5066}, tag=set()): no such column: ccd

Somewhere the code needs updating for the sensor naming.

wmwv commented 5 years ago

What's your template string?

johannct commented 5 years ago

I can see FITS files being persisted nevertheless, but I presume that this needs fixing. It may already have been fixed downstream of w39... but the incorrect example is still there in https://github.com/lsst/meas_base/blob/8acd1882360c4f83412889cb90121961e36c024a/python/lsst/meas/base/forcedPhotCcd.py#L387

Michael got it right again: I had bad string formatting with ccd in the template, and got sidetracked by the example in the code.

johannct commented 5 years ago

http://srs.slac.stanford.edu/Pipeline-II/exp/LSST-DESC/task.jsp?refreshRate=60&task=52615118&refreshIsOn=true&refreshCount=95 -- Run1.2p: stream 0 of task DC2DM_4_FORCEDCCD version 0.1, one job per visit, using -j 8 as this is not a pipe_driver task... not sure what the result will be in terms of parallelism.

johannct commented 5 years ago

Task completed: 21 failed jobs with a fatal message (see below). Total wall time: 58h22m, with 125 jobs using 8 cores each (option -j, as forcedPhotCcd is not a pipe_driver task; need to study this config), out of 1000 cores available in total. Global summary:

[tanugi@cca009 ~]$ qacct -A forcedCcd
Total System Usage
    WALLCLOCK         UTIME         STIME           CPU             MEMORY                 IO                IOW
================================================================================================================
     24590329 134260071.856   1095555.924 1490966006.525      444121207.284         488336.806          45837.740

@airnandez files to transfer under /sps/lsst/dataproducts/desc/DC2/Run1.2p/w_2018_39/rerun/coadd-v4/forced/ (when the 21 failures are understood).

@wmwv I would suggest, if at all possible, scripting the DPDD part for these files separately from the coadd catalogue part, so that I can run it after each visit is completed.

johannct commented 5 years ago

The FATAL jobs have an error message of this kind: forcedPhotCcd FATAL: Failed on dataId=DataId(initialdata={'visit': 193901, 'filter': 'r', 'raftName': 'R23', 'detectorName': 'S22', 'detector': 107, 'tract': 5065}, tag=set()): TaskError: Reference {'tract': 5065, 'patch': '3,6'} doesn't exist

johannct commented 5 years ago

A total of 6 tracts and 9 patches are concerned: tract 5065: patch 3,6; tract 5066: patches 3,6 and 2,5; tract 4636: patch 1,6; tract 5064: patches 4,6 and 6,6; tract 4429: patches 4,0 and 5,1; tract 5062: patch 1,6. These are all patches outside of the footprint. These tract/patches do exist under rerun/coadd-v4/deepCoadd-results/merged/. I looked for several of those and they are not in my visit-to-tract/patch mapping DB, so I am afraid that we have some sort of mapping inconsistency...

johannct commented 5 years ago

Note that the stack code goes on and completes the other detectors, and the forcedPhotCcd results on these failed patches are likely useless anyway... So, for the sake of moving ahead with testing the DPDD creation from these, I would contend that the transfer is OK. @wmwv?

johannct commented 5 years ago

@rearmstr Bob, do you have any suggestion as to what could be going wrong?

airnandez commented 5 years ago

@airnandez files to transfer under /sps/lsst/dataproducts/desc/DC2/Run1.2p/w_2018_39/rerun/coadd-v4/forced/ (when the 21 failures are understood)

Please let me know when you are ready for us to transfer the data to NERSC.

wmwv commented 5 years ago

@airnandez Yes, please transfer to NERSC.

@johannct Yes, the DPDD extraction will be run after the coadds. It will probably be in two steps.

johannct commented 5 years ago

@airnandez Bob just gave me a hint as to how to roll the failed streams back cleanly, so I will let you know as soon as they are done: the hint is to add -c references.skipMissing=True to the command line. This fixed the FATAL occurrences in the 21 visits. Transfer can happen ASAP.
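
For the record, the same override can be kept in a small config file and passed to forcedPhotCcd.py with --configfile instead of repeating -c on the command line; a minimal sketch (the file name is hypothetical, the config path references.skipMissing is the one used above):

# skipMissingRefs.py -- config override equivalent to
#   -c references.skipMissing=True
# Skip reference patches that are missing instead of failing with a TaskError.
config.references.skipMissing = True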

johannct commented 5 years ago

Preparing to run the same code on 1.2i. I want to use this opportunity to test two things:

airnandez commented 5 years ago

[2019-03-11] Run1.2i DRP: IN2P3 → NERSC data transfer campaign

This is the summary of a transfer campaign of data products of Run1.2i from IN2P3 to NERSC, as requested by @johannct above:

@airnandez files to transfer under /sps/lsst/dataproducts/desc/DC2/Run1.2p/w_2018_39/rerun/coadd-v4/forced/ (when the 21 failures are understood)

Location at CC-IN2P3 (sources): /sps/lsst/dataproducts/desc/DC2/Run1.2p/w_2018_39/rerun/coadd-v4
Location at NERSC (destination): /global/projecta/projectdirs/lsst/global/in2p3/Run1.2i/w_2018_39/rerun/coadd-v4
Number of files transferred: 248,303
Data volume transferred: 338 GB

The list of transferred files is located at NERSC at:

/global/projecta/projectdirs/lsst/global/in2p3/Run1.2i/logs/2019-03-11-in2p3-to-nersc.txt

wmwv commented 5 years ago

@airnandez The location above at NERSC is incorrect; that is the Run1.2i directory tree.

Please put these Run1.2p forced files in

/global/projecta/projectdirs/lsst/global/in2p3/Run1.2p/w_2018_39/rerun/coadd-v4/

A move on the NERSC filesystem rather than re-transfer is likely the most efficient thing to do:

mv /global/projecta/projectdirs/lsst/global/in2p3/Run1.2i/w_2018_39/rerun/coadd-v4/forced /global/projecta/projectdirs/lsst/global/in2p3/Run1.2p/w_2018_39/rerun/coadd-v4/

I tried but do not have permission.

heather999 commented 5 years ago

@wmwv and @airnandez Let's wait a moment before making this move. We already have /global/projecta/projectdirs/lsst/global/in2p3/Run1.2p/w_2018_39/rerun/coadd-v4/, which is the last Run1.2p reprocessing, completed back in January. Admittedly this transfer should just entail adding the forced subdirectory, but do we want to maintain some sense that this is an addition to the "old" data, or are we content to just add the forced photometry data? Going forward I'd probably want some indication that this is an addition and update the version of the DRP output to coadd-v5... but obviously we want to match IN2P3, and leaving this as coadd-v4 is probably OK. Next time, I'd probably use a different rerun, chained to coadd-v4, when the data is produced, so it is clearer that this is a follow-on processing.

I can complete this move - assuming we are all in agreement and understand what we're doing.

johannct commented 5 years ago

forced has been run with the same codebase. There is no new DRP version, as it is a separate task that has been run; this was already the case for calexp-v4 and coadd-v4. So I am not sure that it is critical to track this info here.

wmwv commented 5 years ago

@heather999 I agree with the spirit of your concern. But in this particular case, I think just putting it in coadd-v4 is the right thing to do.

heather999 commented 5 years ago

@wmwv @johannct @airnandez I just moved the forced directory to /global/projecta/projectdirs/lsst/global/in2p3/Run1.2p/w_2018_39/rerun/coadd-v4/forced. I'll also update the copy of the data on $CSCRATCH/../desc/DC2/data/Run1.2p.

johannct commented 5 years ago

Thanks Heather. I am now waiting for some feedback before I launch the 1.2i version (assuming we want it).

johannct commented 5 years ago

I think that we decided to move forward: test forcedPhotCcd on top of the coadds and move on with 2.1i, so I am closing this.