Sets of inputs we want to support for comparison:
Input table properties:
coordinates (RA, DEC for catalog or x, y for pixel map)
other properties (flux, flux errors, etc.)
Utility functions:
cross_match(catalog1, catalog2, match_parameters) --> match_info (see @yymao's FoF format or HSC/DM Stack heavy footprint @mpaillassa, or options below)
count_matches(match_info) --> table of group numbers (including None for unmatched) and the number of instances of each group number
match_stats(match_info) --> basic statistics (mean, median, mode, etc.) of main input catalog properties (flux, flux errors, etc.)
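A minimal sketch of the counting and statistics utilities, assuming match_info is a table-like object with 'group_id' (None for unmatched) and 'row_index' columns; the column names and return formats are assumptions, not a settled interface.

```python
# Sketch of count_matches / match_stats; the match_info columns
# ('group_id', 'row_index') and the return formats are assumptions.
from collections import Counter

import numpy as np


def count_matches(match_info):
    """Number of instances of each group number, including None for unmatched."""
    return Counter(match_info["group_id"])


def match_stats(match_info, catalog, prop="flux"):
    """Basic statistics of a catalog property over matched objects only."""
    matched_rows = [row for row, group in
                    zip(match_info["row_index"], match_info["group_id"])
                    if group is not None]
    values = np.asarray([catalog[prop][row] for row in matched_rows])
    return {"mean": values.mean(), "median": np.median(values), "std": values.std()}
```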
Matching options:
FoF
Hungarian algorithm (scipy implementation)
Scipy KDTree --> output is pretty close to what we want
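A sketch of two of these options using scipy, assuming plain x/y pixel coordinates; the column names and distance cutoffs are placeholders.

```python
# Sketch of KDTree-based candidate matching and Hungarian one-to-one matching;
# assumes catalogs expose 'x'/'y' columns in a common coordinate system.
import numpy as np
from scipy import optimize, spatial


def kdtree_match(cat1, cat2, radius):
    """For each object in cat1, the indices of cat2 objects within `radius`."""
    tree1 = spatial.cKDTree(np.column_stack([cat1["x"], cat1["y"]]))
    tree2 = spatial.cKDTree(np.column_stack([cat2["x"], cat2["y"]]))
    return tree1.query_ball_tree(tree2, r=radius)   # list of lists, non-uniform length


def hungarian_match(cat1, cat2, max_distance):
    """One-to-one assignment minimizing total distance (scipy implementation)."""
    d = spatial.distance.cdist(
        np.column_stack([cat1["x"], cat1["y"]]),
        np.column_stack([cat2["x"], cat2["y"]]),
    )
    rows, cols = optimize.linear_sum_assignment(d)
    # Keep only assignments closer than the cutoff.
    return [(i, j) for i, j in zip(rows, cols) if d[i, j] <= max_distance]
```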
Early step:
Given catalogs, a variable radius, and objects that could be potential matches.
FoF algorithm: a list of groups, where each group is a list of ids from the two catalogs (see @yymao's implementation).
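A scipy-based illustration of this early step (not @yymao's implementation): link objects from both catalogs within the given radius and return groups as lists of (catalog, row) ids.

```python
# Scipy-based FoF sketch; assumes 'x'/'y' columns and a single linking radius
# in the same units. Output format follows the note above: a list of groups,
# each group a list of (catalog_name, row) ids.
import numpy as np
from scipy import sparse, spatial
from scipy.sparse.csgraph import connected_components


def fof_groups(cat1, cat2, radius):
    """Return a list of groups; each group is a list of (catalog_name, row) ids."""
    xy = np.vstack([np.column_stack([cat["x"], cat["y"]]) for cat in (cat1, cat2)])
    ids = ([("cat1", i) for i in range(len(cat1["x"]))]
           + [("cat2", i) for i in range(len(cat2["x"]))])

    # All pairs closer than the linking length, as edges of a graph.
    tree = spatial.cKDTree(xy)
    pairs = np.array(sorted(tree.query_pairs(r=radius)))
    n = len(ids)
    if len(pairs) == 0:
        adjacency = sparse.coo_matrix((n, n))
    else:
        adjacency = sparse.coo_matrix(
            (np.ones(len(pairs)), (pairs[:, 0], pairs[:, 1])), shape=(n, n))

    # Connected components of that graph are the friends-of-friends groups.
    n_groups, labels = connected_components(adjacency, directed=False)
    groups = [[] for _ in range(n_groups)]
    for obj_id, label in zip(ids, labels):
        groups[label].append(obj_id)
    return groups
```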
Something hierarchical, with two classes of metrics: one for an object and its prediction, and one for a group of objects and its aggregate.
Assume subsampling is happening somewhere else for now. Figure out cuts when quantities are uncertain (future).
Main utility function: some function that pre-processes the catalogs and matches them to create groups.
What about objects that are not matched with anything? And what about leftover galaxies?
The list of groups from the DM stack can be used too.
Decide on the format of groups even if the function for making groups is not set in stone.
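One candidate format to settle on (purely an assumption, not a decided interface), chosen so that counting members and handling unmatched objects stay simple: a long table with one row per group member.

```python
# Possible group format (an assumption): a long table with one row per group
# member. Unmatched objects simply sit alone in their group.
import numpy as np
from astropy.table import Table

groups = Table(
    rows=[
        (0, "truth", 12),      # group 0: truth object 12 ...
        (0, "detection", 7),   # ... matched to detection 7
        (1, "truth", 13),      # group 1: a truth object with no detection
    ],
    names=("group_id", "catalog_key", "row_index"),
)

# Number of members per group (what count_matches would report).
group_ids, sizes = np.unique(groups["group_id"], return_counts=True)
```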
Pixel maps in HSC with galaxy identity (heavy footprint): every pixel is associated with a galaxy identity. Good to understand this data structure.
What is the data structure of heavy footprints in the LSST pipelines @mpaillassa ?
Not depend on DM stack?
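As a stand-in that does not depend on the DM stack (an assumption, not the actual HeavyFootprint layout), per-object pixel identities could be carried as plain boolean masks, which also allows blended objects to share pixels.

```python
# DM-stack-independent stand-in for per-object pixel identities (an assumption,
# not the HeavyFootprint data structure): one boolean pixel mask per galaxy id.
import numpy as np

shape = (64, 64)
footprints = {}                       # galaxy id -> boolean pixel mask
footprints[0] = np.zeros(shape, dtype=bool)
footprints[0][10:20, 10:20] = True
footprints[1] = np.zeros(shape, dtype=bool)
footprints[1][15:30, 15:30] = True    # overlaps galaxy 0 in rows/cols 15-19

# Pixel count per galaxy, and how many objects cover each pixel.
pixel_counts = {gal_id: mask.sum() for gal_id, mask in footprints.items()}
coverage = np.sum(list(footprints.values()), axis=0)
```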
Non-uniform-length list of possible matches --> a utility function that does counting on a list of lists (number of true objects per pixel).
The data structure of matches lives in id space, but the true and detection catalogs also need a master version with an additional 'id' column.
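A small sketch of that counting utility, assuming the input is the kind of non-uniform-length list of candidate lists returned by a KDTree query (as in the kdtree_match sketch above).

```python
# Counting on a list of lists of possible matches: how many candidates each
# object has, and how often each id appears across all candidate lists.
from collections import Counter


def count_candidates(possible_matches):
    """possible_matches[i] is the list of candidate ids for object i."""
    n_candidates = [len(candidates) for candidates in possible_matches]
    id_frequency = Counter(i for candidates in possible_matches for i in candidates)
    return n_candidates, id_frequency


# Example: object 0 has two candidates, object 1 has none, object 2 has one.
n_candidates, id_frequency = count_candidates([[4, 7], [], [7]])
# n_candidates == [2, 0, 1]; id_frequency == Counter({7: 2, 4: 1})
```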