byu-dml / metalearn

BYU's python library of useable tools for metalearning
MIT License

Timing for processing x_raw #196

Open emrysshevek opened 5 years ago

emrysshevek commented 5 years ago

We time almost every function, including ones that only compute intermediate steps, such as _sample_columns, so that we can report an accurate time for each metafeature. However, we don't time how long it takes to drop the all-NaN columns from x_raw in:

"X": self._format_resource(X.dropna(axis=1, how="all"), 0.)

The increase in time would likely be very slight, since dropping columns is such a simple operation, but because X is used by so many metafeatures, it would be valuable to have as accurate a time as possible.

We should pull that computation out of the dictionary so we can time it and include the proper time.
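A minimal sketch of what pulling the computation out could look like. The helper name `drop_empty_columns_timed` and the sample DataFrame are hypothetical, not part of metalearn, and the sketch assumes that the second argument of `_format_resource` is the elapsed time in seconds:

```python
import time

import numpy as np
import pandas as pd


def drop_empty_columns_timed(X):
    """Drop columns that are entirely NaN and return (result, elapsed seconds).

    Hypothetical helper for illustration; not part of the metalearn API.
    """
    start = time.perf_counter()
    X_clean = X.dropna(axis=1, how="all")
    elapsed = time.perf_counter() - start
    return X_clean, elapsed


# Small made-up frame: column "b" is all NaN and should be dropped,
# while "a" and "c" contain some NaN values but are kept.
X_raw = pd.DataFrame({
    "a": [1.0, 2.0, np.nan],
    "b": [np.nan, np.nan, np.nan],
    "c": [0.5, np.nan, 1.5],
})

X_clean, dropna_time = drop_empty_columns_timed(X_raw)
print(list(X_clean.columns), dropna_time)

# Assumption: the dictionary entry would then pass the measured time
# instead of the literal 0., e.g.
#   "X": self._format_resource(X_clean, dropna_time)
```

Using `time.perf_counter()` is a choice made for this sketch; the library's existing timing mechanism may differ and should be preferred for consistency.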

emrysshevek commented 5 years ago

This also applies to the seed base. If it is not provided by the user, we compute our own with:

seed = np.random.randint(np.iinfo(np.int32).max)

This computation should probably be made into a ResourceComputer, for consistency, and timed as well.
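As a rough illustration, the seed computation could be isolated in a small timed function. This is only a sketch: `get_seed_base_timed` is a hypothetical name, and it does not use the actual ResourceComputer interface, whose signature is not shown in this thread:

```python
import time

import numpy as np


def get_seed_base_timed(seed_base=None):
    """Return (seed_base, elapsed seconds).

    Hypothetical helper for illustration; not part of the metalearn API.
    A user-provided seed is passed through unchanged; otherwise one is
    drawn with the same expression quoted in the issue.
    """
    start = time.perf_counter()
    if seed_base is None:
        seed_base = np.random.randint(np.iinfo(np.int32).max)
    elapsed = time.perf_counter() - start
    return seed_base, elapsed


# No seed provided: one is generated and the generation is timed.
seed, seed_time = get_seed_base_timed()

# Seed provided by the user: it is returned as-is.
user_seed, user_seed_time = get_seed_base_timed(42)
print(seed, seed_time, user_seed, user_seed_time)
```

Returning the elapsed time alongside the value mirrors the (value, time) pairing that the issue describes for other resources; how that pair is stored would depend on the library's own conventions.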