Closed theoschutt closed 7 months ago
Counter-proposal:
extra_cols = ['G1_DATA', 'G2_DATA', 'G1_MODEL', 'G2_MODEL']
qcat = treecorr.Catalog(filename="catalog.fits", g1_eval='G1_DATA-G1_MODEL', g2_eval='G2_DATA-G2_MODEL', ...)
The issue is that TreeCorr's I/O currently wants to know the names of all the columns to read in at the start. It reads only the columns it will actually use, mainly so that input catalogs with tons of columns stay cheap to load. (fitsio and hdf5 can both be efficient at this.) I think this approach could be made to work within that scheme. We'd add these extra column names to the all_cols list, and then those variables would exist for the evals to use.
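To make the idea concrete, here is a minimal sketch of the mechanism described above. It is not TreeCorr's actual implementation: the in-memory FAKE_FILE dict and the helpers read_columns and eval_rows are hypothetical stand-ins, and the expression is evaluated row by row in plain Python, whereas a real implementation would presumably operate on whole arrays.

```python
# Toy stand-in for an on-disk catalog: column name -> list of values.
# A real reader (e.g. fitsio or h5py) would fetch only the requested columns.
FAKE_FILE = {
    "RA": [10.0, 11.0, 12.0],
    "DEC": [-1.0, 0.0, 1.0],
    "G1_DATA": [0.5, 0.25, -0.5],
    "G1_MODEL": [0.25, 0.25, 0.0],
    "UNUSED": [1, 2, 3],
}


def read_columns(source, names):
    """Return only the named columns, mimicking a selective column read."""
    missing = [n for n in names if n not in source]
    if missing:
        raise KeyError(f"columns not found: {missing}")
    return {n: source[n] for n in names}


def eval_rows(expr, columns):
    """Evaluate expr once per row, binding each column name to that row's value."""
    code = compile(expr, "<eval>", "eval")
    names = list(columns)
    nrows = len(columns[names[0]])
    return [
        # Builtins are emptied so only the column names are visible to expr.
        eval(code, {"__builtins__": {}}, {n: columns[n][i] for n in names})
        for i in range(nrows)
    ]


standard_cols = ["RA", "DEC"]            # columns the reader already knows about
extra_cols = ["G1_DATA", "G1_MODEL"]     # columns needed only by the eval string
all_cols = standard_cols + extra_cols    # everything to read up front

cols = read_columns(FAKE_FILE, all_cols)
g1 = eval_rows("G1_DATA-G1_MODEL", cols)
print(g1)
```

The point of listing extra_cols explicitly is that the reader never has to parse the eval strings to discover which columns they need. Note that eval on strings is only appropriate for trusted input.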
I see. This makes sense to me!
Done on #173
As we discussed earlier, this is a feature request for creating a Catalog with quantities derived from the input catalog. This would allow the user to avoid making in advance a separate catalog with the derived quantities as columns.
One simple use case is subtracting the mean from an input column before running the correlation. This alone as a boolean flag would be useful. The more general use case would be allowing any operation on one or more columns. For example, G1_DATA-G1_MODEL or (G1_DATA-G1_MODEL)*(T_DATA-T_MODEL)/T_DATA, as we need to do for rho and tau statistics. Something like allowing:
where the column names defined in the functions, G1_DATA, G1_MODEL, etc., must be columns in catalog.fits. I'll keep thinking on it, and happy to help out with implementing!
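For reference, this is roughly the by-hand preprocessing the request aims to make unnecessary. It is only an illustrative sketch: the sample values and the helpers subtract_mean and size_weighted_residual are made up for this example, and in practice the results would be written to a new catalog or passed on as arrays.

```python
from statistics import fmean

# Made-up sample columns standing in for values read from the input catalog.
g1_data = [0.5, 0.25, 0.0]
g1_model = [0.25, 0.25, 0.5]
t_data = [2.0, 4.0, 1.0]
t_model = [1.0, 4.0, 0.5]


def subtract_mean(values):
    """Simple use case: remove the column mean before correlating."""
    mean = fmean(values)
    return [v - mean for v in values]


def size_weighted_residual(g_data, g_model, size_data, size_model):
    """General use case: (G1_DATA-G1_MODEL)*(T_DATA-T_MODEL)/T_DATA, row by row."""
    return [
        (gd - gm) * (td - tm) / td
        for gd, gm, td, tm in zip(g_data, g_model, size_data, size_model)
    ]


g1_centered = subtract_mean(g1_data)
g1_derived = size_weighted_residual(g1_data, g1_model, t_data, t_model)
print(g1_centered)
print(g1_derived)
```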