Pliers is a fantastic-looking toolkit for converting a wide variety of data formats into feature vectors. This seems like a perfect subroutine for hypertools, e.g. via the format_data function. Idea:
User passes in messy data in mixed formats
Inside format_data we:
Group "like" data (e.g. text with text, images with images, numbers with numbers, etc.). This is tricky...we first need to come up with an appropriate set of categories. (E.g. does audio count as "text" because we can apply speech-to-text on it? Or does it count as a matrix like any other? Or a separate audio category? And who makes that decision? And if we make that decision for users, will it be user-controllable?)
Convert all data within each group into numpy arrays, using pliers as needed/appropriate
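A rough sketch of the group-then-convert idea, just to make the discussion concrete. Everything here is hypothetical: the category names, the `categorize`, `group_like_data`, and `to_arrays` helpers, and the `converters` mapping are placeholders and not part of hypertools or pliers. The per-category converter is where a pliers-based extractor could be plugged in; here a toy text converter stands in for it. Passing `categorize` as an argument is one way to keep the category decision user-controllable.

```python
from collections import defaultdict

import numpy as np


def categorize(item):
    """Return a coarse category label for one item (hypothetical scheme)."""
    if isinstance(item, str):
        return "text"
    if isinstance(item, (list, tuple)) and item and all(isinstance(x, str) for x in item):
        return "text"
    try:
        np.asarray(item, dtype=float)  # anything that casts cleanly to floats
    except (TypeError, ValueError):
        return "other"
    return "numeric"


def group_like_data(items, categorize=categorize):
    """Group items by category, remembering each item's original position."""
    groups = defaultdict(list)
    for index, item in enumerate(items):
        groups[categorize(item)].append((index, item))
    return dict(groups)


def to_arrays(groups, converters):
    """Convert every item to a 2D float array using its category's converter."""
    arrays = {}
    for kind, members in groups.items():
        convert = converters.get(kind)
        if convert is None:
            raise ValueError(f"no converter registered for category {kind!r}")
        arrays[kind] = [
            (index, np.atleast_2d(np.asarray(convert(item), dtype=float)))
            for index, item in members
        ]
    return arrays


def toy_text_features(text):
    """Stand-in for a real text extractor: one row per string, one feature (its length)."""
    strings = [text] if isinstance(text, str) else list(text)
    return [[len(s)] for s in strings]


mixed = ["some text", [[1, 2], [3, 4]], np.arange(3), ["a", "b"]]
groups = group_like_data(mixed)
arrays = to_arrays(groups, {"numeric": lambda x: x, "text": toy_text_features})
```

Keeping the original index alongside each array means the results could be put back in the user's input order afterwards. Items that fall into "other" (audio, images, etc.) raise an error here until someone registers a converter for them, which leaves the audio question above open rather than deciding it silently.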