When using osprey dump -o json, I was wondering whether it would be more useful to store each parameter as a separate name/value pair, instead of storing them all in a single entry as a dictionary.
That way, when loading the JSON with pd.read_json, for example, each parameter would be stored in its own column. That feels more natural to me and allows for easier plotting: plt.scatter(df['tica__lag_time'], df['mean_test_score'])
At the moment, to extract the parameters into a separate DataFrame, I have to do something like this:
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_json('dump.json')
# One column per hyperparameter name, taken from the first entry
params = pd.DataFrame(columns=list(df['parameters'][0].keys()))
for i, hyp_parms in enumerate(df['parameters']):
    params.loc[i] = hyp_parms
# Scatter plot with a single command
plt.scatter(params['tica__lag_time'], df['mean_test_score'])
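For what it's worth, the same workaround can probably be written without the loop. The snippet below is only a sketch: the helper name flatten_parameters and the sample rows are made up for illustration, and it assumes the dump has a 'parameters' column holding one dictionary per row, as in the code above.

```python
import pandas as pd


def flatten_parameters(df, column="parameters"):
    """Return a copy of df where each key of the dict column is its own column.

    Hypothetical helper, not part of osprey.
    """
    # Build one row per dictionary, keeping the original row index
    params = pd.DataFrame(df[column].tolist(), index=df.index)
    # Drop the dict column and attach the expanded parameter columns
    return df.drop(columns=[column]).join(params)


# Made-up rows standing in for pd.read_json('dump.json')
df = pd.DataFrame({
    "mean_test_score": [0.5, 0.7],
    "parameters": [
        {"tica__lag_time": 1, "tica__n_components": 2},
        {"tica__lag_time": 10, "tica__n_components": 4},
    ],
})

flat = flatten_parameters(df)
```

With a frame like flat, the plot from above becomes plt.scatter(flat['tica__lag_time'], flat['mean_test_score']), which is the layout I would like the JSON dump to give directly.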
If you think this is a good idea, maybe you could point out what needs to be changed, and I'd be happy to implement it.