DataGeometry instances provide a way to apply the same transformations to new data, or re-apply transformations to old data. There are several issues with the current implementation. I've outlined what I think are the biggest issues below (these should probably be divided into separate issues).
Exposing all hypertools functions from DataGeometry instances
In addition to plot and transform, I'd like to see reduce, normalize, and align exposed in DataGeometry instances. In other words, geo.plot(...) should either re-plot old data or plot new data, filling in all arguments with the previously (or newly) specified ones. Similarly, geo.reduce(...) should apply dimensionality reduction to new (or old) data, filling in arguments as appropriate. The same goes for geo.normalize(...) and geo.align(...). These functions should all behave essentially like hyp.plot, hyp.reduce, hyp.normalize, and hyp.align, but with the ability to re-use already specified arguments and data. The existing DataGeometry.transform function provides an additional convenience: a mechanism for easily re-applying the full pipeline of reduce/normalize/align transformations to new (or old) data.
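To make the requested interface concrete, here is a rough sketch. It is not hypertools code: the class GeoSketch, its toy normalize and reduce steps, and the step order inside transform are all stand-ins I made up for illustration (align is left out to keep it short). The point is only that each step defaults to the stored data and arguments, accepts new ones, and that transform chains the steps.

```python
class GeoSketch:
    """Toy stand-in for a DataGeometry-like object (hypothetical, not hypertools code)."""

    def __init__(self, data, ndims=2):
        self.data = data      # stored data: a list of rows (lists of floats)
        self.ndims = ndims    # remembered argument for reduce

    def normalize(self, data=None):
        # Fall back to the stored data when no new data is given.
        rows = self.data if data is None else data
        ncols = len(rows[0])
        means = [sum(r[j] for r in rows) / len(rows) for j in range(ncols)]
        # Toy normalization: subtract each column's mean.
        return [[r[j] - means[j] for j in range(ncols)] for r in rows]

    def reduce(self, data=None, ndims=None):
        rows = self.data if data is None else data
        # A newly passed ndims overrides the remembered one.
        k = self.ndims if ndims is None else ndims
        # Placeholder for real dimensionality reduction: keep the first k columns.
        return [r[:k] for r in rows]

    def transform(self, data=None):
        # Re-apply the whole pipeline; the step order here is an assumption.
        return self.reduce(self.normalize(data))


geo = GeoSketch([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]], ndims=2)
old = geo.transform()                          # re-apply to the stored data
new = geo.reduce([[9.0, 8.0, 7.0]], ndims=1)   # new data, overridden argument
```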
Argument parsing in DataGeometry plot, reduce, normalize, align, and transform functions
In the existing implementation, there are several inconsistencies with how arguments are parsed between hypertools.plot and DataGeometry.plot. For example, DataGeometry.plot does not accept format strings, whereas hypertools.plot does.
I propose that we change the implementation by writing parse_arguments helper functions (these should be private functions, not exposed to the user) for plot, reduce, normalize, and align. Each function should return a dictionary of parsed arguments (with defaults filled in) appropriate to the given function. Both the hypertools and DataGeometry functions should parse arguments in exactly the same way (by calling these helper functions). The difference is that for the DataGeometry versions of those functions, the "defaults" should be replaced by any previously specified arguments. In other words, we should do something like the following:
The first time we call geo = hyp.plot(data, ...., reduce=...., normalize=..., align=...), the argument parsing functions fill in the default for every argument the user didn't explicitly specify. The (internal) result is one dictionary each for plot, reduce, normalize, and align, containing all arguments for that function. These should be saved inside the geo object, but not be accessible to the user.
When geo.plot, geo.reduce, geo.normalize, geo.align, or geo.transform are called in the future, those already-parsed argument dictionaries should act like the "default" values of those arguments. Any additional arguments that the user passes into those functions should replace those new defaults. But otherwise the functions under geo should behave just like their hyp counterparts in terms of how they deal with argument parsing.