HSC pipeline single-visit runs with multiple associated coadds

msimet commented 9 years ago

@HironaoMiyatake noticed yesterday that if we run StileVisit.py on a single visit which has multiple associated coadds, we end up running the data multiple times, as if the same CCD with a different coadd calibration was a separate and independent CCD. (As a reminder, when possible we use the broader-level coadd calibration instead of the single-pass calibration, which is why we're touching the coadds at all here.)

This is obviously not desired behavior; the question is what to do about it.

Options:

- Allow people to select a specific coadd they'd like to use; if they don't, have some algorithm to pick which coadd will be used.
- Separate by coadd as well as by visit, make the filenames represent this, and run the data once per associated coadd.
- Something else?
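To make the first two options concrete, here is a minimal sketch in plain Python. It is not Stile or pipeline code: the `CALIBRATIONS` list, the tract numbers, the tie-breaking rule, and the filename pattern are all hypothetical, chosen only to show the shape of each choice.

```python
from collections import Counter, defaultdict

# Hypothetical input: one (visit, ccd, tract) entry per available coadd calibration.
# CCD 50 of visit 1228 overlaps two tracts, so it appears twice.
CALIBRATIONS = [
    (1228, 49, 8766),
    (1228, 50, 8766),
    (1228, 50, 8767),
    (1228, 51, 8767),
]


def group_by_visit_and_tract(calibrations):
    """Option 2: one run per (visit, tract); each CCD appears once per run."""
    groups = defaultdict(set)
    for visit, ccd, tract in calibrations:
        groups[(visit, tract)].add(ccd)
    return {key: sorted(ccds) for key, ccds in groups.items()}


def pick_default_tract(calibrations, visit):
    """Option 1 fallback (an assumed rule): the tract covering the most CCDs
    of this visit, with the lowest tract number winning ties."""
    counts = Counter(tract for v, _, tract in calibrations if v == visit)
    return min(counts, key=lambda tract: (-counts[tract], tract))


def output_filename(test_name, visit, tract):
    """Hypothetical naming scheme that keeps runs for different tracts apart."""
    return f"{test_name}_visit{visit}_tract{tract}.dat"
```

With option 2, CCD 50 is analyzed once in each of the two runs rather than twice in a single run, and the tract in the filename records which calibration was used.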

Happy to take thoughts even from people who haven't worked with this aspect of the code: it's sort of a philosophical choice (do we want to test all the possible calibrations? If we don't, is it misleading to merely select one?).

TallJimbo commented 9 years ago

Separating by tract as well as visit is the way we've handled this in the pipeline, and it's what I'd recommend here. You could also make it so a tract would be selected by default for any visit if not specified, but I do think you want to include the tract in the filename.

HironaoMiyatake commented 9 years ago

Sorry, I do not understand very well. Why do we need a tract for analyzing a visit? To get a better WCS?

msimet commented 9 years ago

I believe the coadds have better photometric calibration--Jim could be more precise about why, I'm sure.

I'll set about adding the tract to the filenames, then. Thinking on this more, I think we should do all the tracts by default, but let the user specify just one if they want to limit themselves (is there a way to do this with the current command-line switches? Just a tract argument to the data ID?).
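Roughly, the default I have in mind would behave like the sketch below. This is plain Python for illustration only, not the pipeline's own data ID handling: `expand_data_id` and the `tracts_for_visit` lookup are hypothetical names.

```python
def expand_data_id(data_id, tracts_for_visit):
    """Return one data ID per tract that should be processed.

    data_id: dict with a 'visit' key and, optionally, a 'tract' key.
    tracts_for_visit: hypothetical lookup mapping visit -> overlapping tracts.
    """
    visit = data_id["visit"]
    available = tracts_for_visit.get(visit, [])
    if "tract" in data_id:
        # The user limited the run to a single tract; check that it applies.
        if data_id["tract"] not in available:
            raise ValueError(
                f"tract {data_id['tract']} does not overlap visit {visit}"
            )
        return [dict(data_id)]
    # No tract given: process every overlapping tract, in a stable order.
    return [dict(data_id, tract=tract) for tract in sorted(available)]
```

For example, with a lookup of `{1228: [8767, 8766]}`, passing `{"visit": 1228}` would yield two data IDs (tracts 8766 and 8767), while `{"visit": 1228, "tract": 8767}` would yield just the one.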

TallJimbo commented 9 years ago

> Sorry, I do not understand very well. Why do we need a tract for analyzing a visit? To get a better WCS?

Essentially, yes - it's so we can use the meas_mosaic-derived astrometric and photometric calibration, which is done at a tract level.

> is there a way to do this with the current command-line switches? Just a tract argument to the data ID?

It's a little more complicated than that, but we already have code to help with it. Check out how forcedPhotCcd.py creates its argument parser in pipe_tasks, in particular the use of PerTractDataIdContainer. If you use that, it should do what you need.

HironaoMiyatake commented 9 years ago

So the calibrations differ tract by tract? That's tricky... it's kind of mixing visit-level and coadd-level processing. For simplicity, we could start working on the visit level without tract information, i.e., StileVisitNoTract.py. This is slightly off-topic, though...

msimet commented 9 years ago

> forcedPhotCcd.py creates its argument parser in pipe_tasks, in particular the use of PerTractDataIdContainer. If you use that, it should do what you need.

Okay, that's helpful, will do.

> For simplicity, we could start working on the visit level without tract information, i.e., StileVisitNoTract.py.

Sounds good. So I'll put this as high priority, but not must-be-done-immediately priority!

TallJimbo commented 9 years ago

> So the calibrations differ tract by tract? That's tricky... it's kind of mixing visit-level and coadd-level processing.

At this point, it's a computational necessity, because we can't run meas_mosaic on areas much larger than that. Even if we do have a faster or more efficient ubercal code in the future, we won't ever be able to calibrate the entire survey at once, I think.

msimet / Stile

HSC pipeline single-visit runs with multiple associated coadds #48