How to access features and target of final processed dataset?

Apologies for the delayed response. For features: when you run batteryml run configs/baselines/sklearn/variance_model/matr_1.yaml ./workspace/test --train --eval, the processed data is cached in the BatteryML/cache folder. The cached files are named in the format battery_cache_xx.pkl.

To load the data from these cached pickle files, you can use the following keys:

data['dataset']: This contains the features.
data['raw_data']: This contains the processed raw data.
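
As a minimal sketch of the loading step, assuming the cache file is an ordinary pickle holding a dict with the two keys listed above: the helper name load_cache is hypothetical and not part of BatteryML, and the file name keeps the xx placeholder, which you would replace with the hash of your own cache file. Unpickling may also require the batteryml package to be importable in your environment, since pickled objects can reference its classes.

```python
import pickle
from pathlib import Path


def load_cache(path):
    """Load a cached pickle file and return its 'dataset' and 'raw_data' entries.

    Hypothetical helper: assumes the pickle contains a dict with those two keys.
    """
    with Path(path).open("rb") as f:
        data = pickle.load(f)
    return data["dataset"], data["raw_data"]


# Example call (replace xx with the hash in your own cache file name):
# dataset, raw_data = load_cache("BatteryML/cache/battery_cache_xx.pkl")
```

Only unpickle files you generated yourself, since pickle can execute arbitrary code on load.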

The naming of the cache files is based on a hash of the config settings in the config.yaml file. We use the hash_string function in BatteryML/batteryml/pipeline.py for this.

For predictions: they are stored in BatteryML/workspaces/<your model>/<dataset>/predictions_seed_n_xx.pkl. Here, <your model> and <dataset> should be replaced with the name of your model and dataset, respectively.
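
A rough sketch for collecting the prediction files from one such folder, assuming they are ordinary pickles whose names start with predictions_seed_: the helper name load_predictions is hypothetical, and the structure of each loaded object is not specified here, so inspect it (for example with type()) before relying on any particular field.

```python
import pickle
from pathlib import Path


def load_predictions(result_dir):
    """Load every predictions_seed_*.pkl file in result_dir.

    Hypothetical helper: returns a dict mapping file name to the unpickled object.
    """
    results = {}
    for path in sorted(Path(result_dir).glob("predictions_seed_*.pkl")):
        with path.open("rb") as f:
            results[path.name] = pickle.load(f)
    return results


# Example call (fill in the placeholders for your own run):
# preds = load_predictions("BatteryML/workspaces/<your model>/<dataset>")
# for name, obj in preds.items():
#     print(name, type(obj))
```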

I hope this clarifies your question. Let me know if you have any further queries.

microsoft / BatteryML

How to access features and target of final processed dataset? #28