Should this return the whole dataframe or just the new column? If the former, more care needs to be given to the column names.

adambrown1 commented 7 years ago

At the moment the dataframe returned by the process function can be rather confusing. For example, if I apply the S2AreaFractionTop cut, I get a column named 'S2AreaFractionTop' in addition to the hax column 's2_area_fraction_top'. Only the naming format distinguishes the actual area fraction top from the cut on it.

If roughly this interface is going to be kept, the columns need to be called something more specific, preferably with "Cut" somewhere in the name and, even better, with some sort of name mangling (maybe lax at the start). This is particularly important for the temp column, which lax can use internally and then deletes: if the user already has a column called temp, it will be deleted when they apply any cuts using lax.
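
To make the naming suggestion concrete, here is a minimal sketch of what mangled names could look like. Both helpers (cut_column_name, scratch_column_name), the "lax_" prefix and the "_lax_temp" base name are hypothetical, invented for illustration; none of this is existing lax code.

```python
def cut_column_name(cut_name, prefix="lax"):
    """Build a column name that is clearly a cut result (hypothetical convention)."""
    # Prefix with the package name and end with "Cut" so it cannot be
    # confused with a hax variable such as 's2_area_fraction_top'.
    return f"{prefix}_{cut_name}Cut"


def scratch_column_name(existing_columns, base="_lax_temp"):
    """Pick an internal scratch-column name that no existing column uses."""
    name = base
    counter = 0
    # Keep appending a counter until the name is free, so a user's own
    # column (for example 'temp') is never overwritten or deleted.
    while name in existing_columns:
        counter += 1
        name = f"{base}_{counter}"
    return name


columns = ["s2_area_fraction_top", "temp"]
print(cut_column_name("S2AreaFractionTop"))
print(scratch_column_name(columns))
```

With something like this, a user's own temp column would be left alone, because the scratch name is checked against the columns already present before it is used.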

My preference would be not to add any columns to the original dataframe at all, and just to return a new column containing the boolean values. The user can then call it whatever they like, or apply it directly via df = df[lax.lichens.Lichen().process(df)] or using hax.cuts.
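
A rough sketch of that boolean-column interface, using plain pandas. The function s2_area_fraction_top_mask and its thresholds are made up for illustration and are not the real cut definition or the lax API.

```python
import pandas as pd


def s2_area_fraction_top_mask(df, low=0.5, high=0.72):
    """Hypothetical cut: return a boolean Series and leave df untouched."""
    # The thresholds are placeholders, not the actual cut values.
    aft = df["s2_area_fraction_top"]
    return (aft > low) & (aft < high)


df = pd.DataFrame({
    "s2_area_fraction_top": [0.30, 0.60, 0.65, 0.90],
    "temp": [1, 2, 3, 4],  # a user column that must survive the cut
})

mask = s2_area_fraction_top_mask(df)

# The user decides what to do with the result: filter directly...
selected = df[mask]

# ...or store it under whatever name they like, on a copy.
labelled = df.assign(MyAreaFractionTopCut=mask)

print(list(df.columns))
print(len(selected))
```

Nothing is added to or removed from the input dataframe, so there is no column name to clash with and no temporary column to clean up.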

tunnell commented 7 years ago

So add "Cut" at the end?

adambrown1 commented 7 years ago

Have done for S2AreaFractionTop. Will make a pull request to change the others later.

tunnell commented 7 years ago

Added the word "Cut".

XENON1T / lax

Should this return the whole dataframe or just the new column? If the former, more care needs to be given to the column names. #12