msimet / Stile

Stile: the Systematics Tests In Lensing pipeline
BSD 3-Clause "New" or "Revised" License
9 stars 6 forks source link

#26 Tract and patch-level HSC interface #46

Closed msimet closed 9 years ago

msimet commented 9 years ago

This pull request adds code to run Stile on the tract- and patch-level outputs of the HSC pipeline (the coadds).

The functionality is analogous to the CCD and visit-level Stile scripts: you simply use calls like bin/StilePatch.py $DATA_DIR --id tract=0 patch=3,7 filter=HSC-I --rerun=$your_rerun The calls are similar for StileTract.py, which runs on tracts, the largest coadd block made up of many patches, analogous to the single visit made up of many CCDs.

Mostly this branch involves moving some code around into separate functions that we can override in child classes, plus adding some arguments to look for coadd-level data products (which tend to start with deepCoadd_). There's also a simple fix for a string representing combinations of individual patches, which isn't as straightforward as combining CCDs since it's a 2-d space.

Couple of questions remaining:

rmandelb commented 9 years ago

The PSF is based on a coaddition of single epoch PSFs, and those might have been determined from different sets of stars in the individual epochs, so I suspect that’s why you’re not finding that flag.

HironaoMiyatake commented 9 years ago

I think Rachel is right. Melanie, are you running Stile on an old version of HSC pipeline? I requested the pipeline team to propagate star flag in chip catalog into coadd catalog, but I am not sure if it's already done or not. Anyway Bob is working on coadd, so I'll ask what flag is he using now.

HironaoMiyatake commented 9 years ago

For the tract level, do we need to catch the flag that says the object is in its primary tract (since they overlap)?

I think so. If we use objects with that flag=True, we do not get objects on a tract only once after reading all the patches (Melanie, this is for taking care of overlap between patches, not tracts, I think). Otherwise objects in overlapped regions between patches are used twice.

HironaoMiyatake commented 9 years ago

@msimet , did you make selection detect.is-primary=True for the entire coadd level? I could not find that... It is need for avoiding objects in overlapping regions between patches.

HironaoMiyatake commented 9 years ago

I found that column CCD is generated for the tract level. We do not need that, or change that to patch?

HironaoMiyatake commented 9 years ago

I got the following error when calculating flux.psf. I could not find a line masking objects with flux.psf.flags == True, which I guess causes this error. Should we add such a line?

Traceback (most recent call last):
  File "/tigress/HSC/users/miyatake/Stile/bin/StileTract.py", line 5, in <module>
    TractSingleEpochStileTask.parseAndRun()
  File "/tigress/HSC/products-20130212/Linux64/pipe_base/HSC-3.0.0/python/lsst/pipe/base/cmdLineTask.py", line 353, in parseAndRun
    resultList = taskRunner.run(parsedCmd)
  File "/tigress/HSC/products-20130212/Linux64/pipe_base/HSC-3.0.0/python/lsst/pipe/base/cmdLineTask.py", line 157, in run
    resultList = mapFunc(self, self.getTargetList(parsedCmd))
  File "/tigress/HSC/users/miyatake/Stile/stile/hsc/base_tasks.py", line 971, in __call__
    result = task.run(*args)
  File "/tigress/HSC/users/miyatake/Stile/stile/hsc/base_tasks.py", line 798, in run
    results = sys_test(self.config, *new_catalogs)
  File "/tigress/HSC/users/miyatake/Stile/stile/hsc/sys_test_adapters.py", line 304, in __call__
    return self.sys_test(*data, verbose=True, **kwargs)
  File "/tigress/HSC/users/miyatake/Stile/stile/sys_tests.py", line 763, in __call__
    raise RuntimeError("NaN or Inf values detected in input array!")
RuntimeError: NaN or Inf values detected in input array!
msimet commented 9 years ago

I found that column CCD is generated for the tract level. We do not need that, or change that to patch?

The SysTest expects something called a CCD to do the averaging on, so we make a fake 'CCD' column that contains patch numbers instead of CCDs--there's a note about this buried somewhere in the source code, is there somewhere I should put it that would be more noticeable/helpful?

msimet commented 9 years ago

I got the following error when calculating flux.psf. I could not find a line masking objects with flux.psf.flags == True, which I guess causes this error.

I think this line exists: it's line 580 in computeExtraColumn, where the key here is flux.psf.flags, as you can see just above that line. Is it possible there are NaNs that don't set this flag? Is it some other NaN we're not prepared for?

msimet commented 9 years ago

By the way, our Config arguments that used to be flags are now flags_keep_true and flags_keep_false, since we could conceivably want to do either case. Does anyone object to this change, or have better names/ideas how to do this?

msimet commented 9 years ago

Tagging the usual suspects (@HironaoMiyatake, @TallJimbo, @rmandelb)--anyone want to take a look at the changes here, so we can get this branch merged?

HironaoMiyatake commented 9 years ago

Sorry that I'm being silent, but I'll take a look early this week (hopefully tomorrow)!

msimet commented 9 years ago

No worries, I know you're all busy with actual data things, but I branched off this for #45 and it'll be easier for you all to see the code if this is merged first. :)

HironaoMiyatake commented 9 years ago

The SysTest expects something called a CCD to do the averaging on, so we make a fake 'CCD' column that contains patch numbers instead of CCDs--there's a note about this buried somewhere in the source code, is there somewhere I should put it that would be more noticeable/helpful?

I found the description at the line 958, but putting that into doctoring of TractSingleEpochStileTask could be better.

HironaoMiyatake commented 9 years ago

I think this line exists: it's line 580 in computeExtraColumn, where the key here is flux.psf.flags, as you can see just above that line. Is it possible there are NaNs that don't set this flag? Is it some other NaN we're not prepared for?

I looked into this a bit, and found that the line exists there, but it does not go through because of line 348 in base_tasks.py. Probably it should be taken care of in adapters. What do you think?

HironaoMiyatake commented 9 years ago

I made the following fix in my local repository. If you are comfortable with this change, I'll push this.

class StatsPSFFluxAdapter(ShapeSysTestAdapter):
    """                                                                                                                                                                                                                        
    Adapter for the StatSysTest.  See the documentation for that class or BaseSysTestAdapter for                                                                                                                               
    more information.  In this case, we specifically request 'flux.psf' and object_type 'galaxy'.                                                                                                                              

    In the future, we plan to have this be more configurable; for now, this works as a test.                                                                                                                                   
    """
    def __init__(self, config):
        self.config = config
        self.sys_test = sys_tests.StatSysTest(field='flux.psf')
        self.name = self.sys_test.short_name+'flux.psf'
        self.mask_funcs = [self.MaskPSFFlux]

    def MaskPSFFlux(self, data, config):
        base_mask = mask_dict['galaxy'](data, config)
        try:
            additional_mask = data['flux.psf.flags']==0
        except LsstCppException:
            key = data.schema.find('flux.psf.flags').key
            additional_mask = numpy.array([src.get(key)==False for src in data])
        return numpy.logical_and(base_mask, additional_mask)

    def getRequiredColumns(self):
        return (('flux.psf',),)

    def __call__(self, task_config, *data, **kwargs):
        return self.sys_test(*data, verbose=True, **kwargs)
msimet commented 9 years ago

Ah, good catch, Hironao!

I'd change this line in the __init__ block

self.mask_funcs = [self.MaskPSFFlux]

to

self.mask_funcs = [mask_dict[obj_type] for obj_type in ['galaxy']]
self.mask_funcs += [self.MaskPSFFlux]

to make sure it only picks out the stars, but otherwise that looks good to me.

msimet commented 9 years ago

Er, that should have said galaxies, not stars, sorry.

HironaoMiyatake commented 9 years ago

I found that it your change does not work. I guess self.mask_funcs should be a list of object types, i.e., galaxy, star, etc..? Anyway, my code includes galaxy mask in MaskPSFFlux, so my code should work fine.

Traceback (most recent call last):
  File "/tigress/HSC/users/miyatake/Stile/bin/StileTract.py", line 5, in <module>
    TractSingleEpochStileTask.parseAndRun()
  File "/tigress/HSC/products-20130212/Linux64/pipe_base/HSC-3.0.0d_hsc/python/lsst/pipe/base/cmdLineTask.py", line 353, in parseAndRun
    resultList = taskRunner.run(parsedCmd)
  File "/tigress/HSC/products-20130212/Linux64/pipe_base/HSC-3.0.0d_hsc/python/lsst/pipe/base/cmdLineTask.py", line 157, in run
    resultList = mapFunc(self, self.getTargetList(parsedCmd))
  File "/tigress/HSC/users/miyatake/Stile/stile/hsc/base_tasks.py", line 949, in __call__
    result = task.run(*args)
  File "/tigress/HSC/users/miyatake/Stile/stile/hsc/base_tasks.py", line 761, in run
    for i in range(len(temp_mask_list[0]))]
msimet commented 9 years ago

Oh, yes, sorry, you're totally right (and sorry I didn't see the galaxy masking line in your previous code)--there should only be one mask per object type, whatever they are. The rest of it looks fine, please feel free to add it to this branch.

HironaoMiyatake commented 9 years ago

Okay, I pushed the change. Also I added detect.is-primary flag to tract level.

I think this branch is in a good shape, and we can merge this into master!

msimet commented 9 years ago

Okay! I need to add the bit about the fake CCD column to the documentation, but I'll merge later tonight after I do that (unless someone else objects). Thanks for your careful review.