AIworx-Labs / chocolate

A fully decentralized hyperparameter optimization framework
http://chocolate.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Duplicate samples for chocolate.Bayes #36

Open c3-dennis opened 4 years ago

c3-dennis commented 4 years ago

Reproducible Example:

import chocolate as choco

def objective_function(alpha, l1_ratio):
    return alpha + l1_ratio

# Discrete space: alpha in {0.1, 0.2}, l1_ratio in {0.5, 0.6, 0.7, 0.8, 0.9}.
space = {
    "alpha": choco.quantized_uniform(0.1, 0.3, 0.1),
    "l1_ratio": choco.quantized_uniform(0.5, 1.0, 0.1)
}

conn = choco.DataFrameConnection()
sampler = choco.Bayes(conn, space)

samples = []
for i in range(20):
    # Draw a candidate, record it, then report its loss back to the sampler.
    token, params = sampler.next()
    samples.append((token, params))
    loss = objective_function(**params)
    sampler.update(token, loss)

for sample in samples:
    print(sample)

Output:

({'_chocolate_id': 0}, {'alpha': 0.1, 'l1_ratio': 0.8})
({'_chocolate_id': 1}, {'alpha': 0.2, 'l1_ratio': 0.9})
({'_chocolate_id': 2}, {'alpha': 0.1, 'l1_ratio': 0.5})
({'_chocolate_id': 3}, {'alpha': 0.2, 'l1_ratio': 0.5})
({'_chocolate_id': 4}, {'alpha': 0.1, 'l1_ratio': 0.8})
({'_chocolate_id': 5}, {'alpha': 0.2, 'l1_ratio': 0.9})
({'_chocolate_id': 6}, {'alpha': 0.2, 'l1_ratio': 0.6})
({'_chocolate_id': 7}, {'alpha': 0.1, 'l1_ratio': 0.8})
({'_chocolate_id': 8}, {'alpha': 0.1, 'l1_ratio': 0.7})
({'_chocolate_id': 9}, {'alpha': 0.1, 'l1_ratio': 0.6})
({'_chocolate_id': 10}, {'alpha': 0.1, 'l1_ratio': 0.5})
({'_chocolate_id': 11}, {'alpha': 0.1, 'l1_ratio': 0.5})
({'_chocolate_id': 12}, {'alpha': 0.1, 'l1_ratio': 0.5})
({'_chocolate_id': 13}, {'alpha': 0.1, 'l1_ratio': 0.5})
({'_chocolate_id': 14}, {'alpha': 0.1, 'l1_ratio': 0.5})
({'_chocolate_id': 15}, {'alpha': 0.1, 'l1_ratio': 0.5})
({'_chocolate_id': 16}, {'alpha': 0.1, 'l1_ratio': 0.5})
({'_chocolate_id': 17}, {'alpha': 0.1, 'l1_ratio': 0.5})
({'_chocolate_id': 18}, {'alpha': 0.1, 'l1_ratio': 0.5})
({'_chocolate_id': 19}, {'alpha': 0.1, 'l1_ratio': 0.5})

Note the repetition: ids 0, 4, and 7 draw identical parameters, as do ids 1 and 5, and ids 10 through 19 all repeat {'alpha': 0.1, 'l1_ratio': 0.5}.
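For what it's worth, the duplicates can be tallied directly; this snippet just hard-codes the parameter pairs from the output above (it does not call chocolate):

```python
from collections import Counter

# (alpha, l1_ratio) pairs reproduced from the output above, ids 0-19.
params = [
    (0.1, 0.8), (0.2, 0.9), (0.1, 0.5), (0.2, 0.5), (0.1, 0.8),
    (0.2, 0.9), (0.2, 0.6), (0.1, 0.8), (0.1, 0.7), (0.1, 0.6),
] + [(0.1, 0.5)] * 10

counts = Counter(params)
# Points that were sampled more than once.
duplicates = {p: n for p, n in counts.items() if n > 1}
print(duplicates)  # → {(0.1, 0.8): 3, (0.2, 0.9): 2, (0.1, 0.5): 11}
```

So 16 of the 20 draws landed on just three points, in a space with only 10 distinct points in total.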

Comments: I took a peek at the implementation and found the following:

  1. During the bootstrapping phase (n=10 by default), there is no duplicate protection for randomly drawn samples.

  2. During the Gaussian process phase, there doesn't seem to be duplicate protection either. This is probably fine, since repeated draws there could indicate convergence, but I thought I would bring it up anyway.

I can't think of a scenario where this duplication would be desirable, so I am reporting it as an issue.
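One possible mitigation for the bootstrap phase would be rejection-style resampling: re-draw when a candidate repeats an already-seen point, giving up after a bounded number of retries. The sketch below is purely illustrative; `dedup_draw` is a hypothetical helper, not part of chocolate's API, and the scripted `candidates` iterator stands in for the sampler's random draw:

```python
def dedup_draw(draw, seen, max_retries=10):
    """Re-draw until the candidate is not in `seen`, up to max_retries.

    `draw` is any zero-argument callable returning a hashable point.
    If every retry hits a seen point, the last duplicate is returned,
    so the caller never blocks forever on an exhausted space.
    """
    point = draw()
    for _ in range(max_retries):
        if point not in seen:
            break
        point = draw()
    seen.add(point)
    return point

# Demo: a scripted draw sequence containing repeats, like the issue's output.
candidates = iter([(0.1, 0.8), (0.2, 0.9), (0.1, 0.8), (0.1, 0.5),
                   (0.2, 0.9), (0.1, 0.7)])
seen = set()
points = [dedup_draw(lambda: next(candidates), seen) for _ in range(4)]
print(points)  # → [(0.1, 0.8), (0.2, 0.9), (0.1, 0.5), (0.1, 0.7)]
```

The bounded retry count matters because a quantized space like the one above has only finitely many points; once it is exhausted, duplicates are unavoidable and the helper should return rather than loop.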