Feature names might not match

LSSTDESC / RESSPECT

The RESSPECT project is the result of an inter-collaboration agreement established between the Cosmostatistics Initiative (COIN) and the LSST Dark Energy Science Collaboration (DESC), with the goal of developing a recommendation system for telescope resource allocation able to optimize photometric supernova cosmology analysis.

MIT License

Feature names might not match #64

Closed drewoldag closed 1 week ago

drewoldag commented 2 weeks ago

I noticed that the feature names defined in fit_lightcurves.py:401 don't match the feature names in bazin.py.

I just want to check how precisely these names need to match.

Specifically the difference is: ['a', 'b', 't0', 'tfall', 'trise'] vs. ['A', 'B', 't0', 'tfall', 'trise']
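
To make the difference concrete, here is a minimal sketch comparing the two lists quoted above. The variable names are illustrative only and are not taken from the RESSPECT code. It shows that an exact comparison fails while a case-insensitive one succeeds, which matters if the names are later used as case-sensitive keys (for example dictionary keys or DataFrame column labels).

```python
# The two lists quoted in the issue; variable names are hypothetical.
lowercase_names = ['a', 'b', 't0', 'tfall', 'trise']
uppercase_names = ['A', 'B', 't0', 'tfall', 'trise']

# Exact comparison is case-sensitive, so the lists are not equal.
exact_match = lowercase_names == uppercase_names

# Comparing lower-cased copies ignores the capitalisation difference.
caseless_match = (
    [name.lower() for name in lowercase_names]
    == [name.lower() for name in uppercase_names]
)

# Collect the positions where the spelling actually differs.
mismatched = [
    (low, up) for low, up in zip(lowercase_names, uppercase_names) if low != up
]

print(exact_match)     # False
print(caseless_match)  # True
print(mismatched)      # [('a', 'A'), ('b', 'B')]
```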

drewoldag commented 2 weeks ago

If they need to match exactly, then we can do the following: Move the features_names that are defined in the three LightCurve subclasses up to be class member variables (so that you can access them as Bazin.features_names).

And then use the FEATURE_EXTRACTOR_REGISTRY to get the proper extractor.

For instance:

from resspect.feature_extractors.light_curve import FEATURE_EXTRACTOR_REGISTRY

def fit(..., feature_extractor: str = "Bazin", ...):
    ...
    feature_extractor = FEATURE_EXTRACTOR_REGISTRY[feature_extractor]
    features = feature_extractor.features_names
    ...
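
A self-contained sketch of the same idea, for illustration only. The classes and the registry below are stand-ins defined inline, not the real RESSPECT objects; the Malanchev feature names are placeholders, and get_features_names is a hypothetical helper. It assumes the registry maps a string key to the extractor class, as the snippet above suggests.

```python
class LightCurve:
    """Stand-in base class (hypothetical, not the RESSPECT implementation)."""
    # Class-level attribute: readable without creating an instance.
    features_names = []


class Bazin(LightCurve):
    features_names = ['A', 'B', 't0', 'tfall', 'trise']


class Malanchev(LightCurve):
    # Placeholder names; the real Malanchev feature list is not shown in this issue.
    features_names = ['feature_1', 'feature_2']


# Stand-in registry mapping a string key to the extractor class (assumed layout).
FEATURE_EXTRACTOR_REGISTRY = {"Bazin": Bazin, "Malanchev": Malanchev}


def get_features_names(feature_extractor: str = "Bazin") -> list:
    """Look up the extractor class by name and return a copy of its feature names."""
    try:
        extractor_class = FEATURE_EXTRACTOR_REGISTRY[feature_extractor]
    except KeyError:
        known = sorted(FEATURE_EXTRACTOR_REGISTRY)
        raise ValueError(
            f"Unknown feature extractor {feature_extractor!r}; expected one of {known}"
        ) from None
    # Return a copy so callers cannot mutate the class-level list by accident.
    return list(extractor_class.features_names)


print(Bazin.features_names)         # accessed on the class, no instance needed
print(get_features_names("Bazin"))  # same names, obtained via the registry
```

With this layout the names live in one place per extractor, so a caller cannot drift out of sync with the class that produces the features.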

drewoldag commented 2 weeks ago

Similarly, there are a few other places in fit_lightcurves.py where I see code blocks like the following:

if feature_extractor == 'bazin':
    header = TOM_FEATURES_HEADER
elif feature_extractor == 'malanchev':
    header = TOM_MALANCHEV_FEATURES_HEADER

It would be nice to consolidate this logic.
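
One possible way to consolidate it, sketched under assumptions: a single mapping from extractor name to header, looked up through one helper. The header contents below are shortened placeholders rather than the real lists, and HEADERS_BY_EXTRACTOR and select_header are hypothetical names.

```python
# Shortened placeholder headers; the real lists are longer and defined elsewhere.
TOM_FEATURES_HEADER = ['id', 'redshift', 'type', 'code', 'orig_sample', 'gA', 'gB']
TOM_MALANCHEV_FEATURES_HEADER = ['id', 'redshift', 'type', 'code', 'orig_sample', 'gfeature_1']

# Single lookup table replacing the repeated if/elif blocks (hypothetical name).
HEADERS_BY_EXTRACTOR = {
    'bazin': TOM_FEATURES_HEADER,
    'malanchev': TOM_MALANCHEV_FEATURES_HEADER,
}


def select_header(feature_extractor: str) -> list:
    """Return the header for the given extractor, failing loudly on unknown names."""
    try:
        return HEADERS_BY_EXTRACTOR[feature_extractor]
    except KeyError:
        known = sorted(HEADERS_BY_EXTRACTOR)
        raise ValueError(
            f"Unknown feature extractor {feature_extractor!r}; expected one of {known}"
        ) from None


header = select_header('malanchev')
print(header)
```

The if/elif block as quoted has no else branch, so an unrecognised name would leave header unassigned; the lookup above turns that case into an explicit error.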

drewoldag commented 2 weeks ago

Additionally, in lightcurves_utils.py there are many *_HEADER lists defined. It seems like all of these are just the standard features_names from the LightCurve subclass for each of the different bands, plus the following boilerplate column names: 'id', 'redshift', 'type', 'code', 'orig_sample'.

If that's the case, then we can programmatically define these columns as was done in https://github.com/LSSTDESC/RESSPECT/pull/56
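
A rough sketch of what programmatic header construction could look like. This is not the code from the linked PR. It assumes the boilerplate columns come first, that per-band columns are named by prefixing the band to the feature name, and that bands are iterated in the outer loop; build_header and BOILERPLATE_COLUMNS are hypothetical names.

```python
# Boilerplate columns listed in the comment above.
BOILERPLATE_COLUMNS = ['id', 'redshift', 'type', 'code', 'orig_sample']


def build_header(features_names, bands):
    """Build a header: boilerplate columns, then one column per (band, feature) pair.

    Assumed naming scheme: band prefix followed by feature name, e.g. 'g' + 'A' -> 'gA'.
    """
    header = list(BOILERPLATE_COLUMNS)  # copy so the constant is never mutated
    for band in bands:
        header.extend(f"{band}{name}" for name in features_names)
    return header


header = build_header(['A', 'B', 't0', 'tfall', 'trise'], ['g', 'r'])
print(header)
# ['id', 'redshift', 'type', 'code', 'orig_sample',
#  'gA', 'gB', 'gt0', 'gtfall', 'gtrise',
#  'rA', 'rB', 'rt0', 'rtfall', 'rtrise']
```

If the existing *_HEADER lists use a different ordering or naming scheme, the helper would need to match it so that previously written files stay readable.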

drewoldag commented 1 week ago

For now, we'll update the features_names in .../feature_extractors/bazin.py to be upper case. The rest of the comments in here will be put on hold, and reconsidered while incorporating the LAISS work.