Doc improvement suggestion

OverLordGoldDragon commented 5 years ago

Reading through the API essentials, the functionality is fairly well documented, except for how HyperparameterHunter (HH) optimizes hyperparameters. After reading enough, I figured out the basics, and I'm sure the rest of the API can be figured out too, but the "hunting" aspect of HH isn't emphasized or promptly explained. The two questions I found answers to last are the ones I had from the beginning:

How to specify which hyperparameters to optimize?
How to specify the search range?

E.g., the Keras example imports Real, but there's no way to tell what "Real" does without reading the docs; at first glance I'd figure it's a form of type casting. A more intuitive name would be RealSearchRange, or an import such as from search_range import Real.

I intend to learn the API further, but currently my question is this: I use my own training loop class, which takes care of the following:

Training, via train_on_batch
Validation, via predict (using outputs to programmatically compute F1-score, loss, etc)
Data pipeline - all data preprocessed, and shuffled at each epoch
Checkpointing/logging - best model per F1-score, logging history, etc

Is it possible to set up HH to do only hyperparameter search? I don't mind its other functionality, so long as it doesn't conflict with my own.

OverLordGoldDragon commented 5 years ago

P.S., I created a PR with comment edits that I'd find helpful when starting out with HH, as an example.

Also, the Feature Engineering section contains a verbatim duplicate of a large portion of text, almost consecutively. I'm unsure whether that's intended.

HunterMcGushion commented 5 years ago

Thanks for opening this issue! You make an excellent point, and I think it'd be a good idea to add a section in the README to clarify that search ranges are specified with Real, Integer and Categorical. For now, I'd like to keep the names as-is, unless others also feel that the naming is confusing. I think that in the space of hyperparameter optimization, the names are fairly appropriate, especially since they were taken verbatim from the Scikit-Optimize project. Also, I'd prefer to keep the names shorter. However, I really appreciate your feedback on the names, and I'm certainly open to further discussion if you still disagree. I think additional documentation (like you added in PR #196) would go a long way--especially if it's in a prominent place like the README.
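To illustrate what the three names denote, here is a minimal standard-library sketch. The classes below are hypothetical stand-ins written for this example; they are not the HH or Scikit-Optimize implementations, and the `sample` method, `sample_params` helper, and parameter names are assumptions. The idea shown is that a hyperparameter wrapped in a range object is searched, while a plain value stays fixed.

```python
import random
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Real:
    """Hypothetical stand-in: a continuous range [low, high]."""
    low: float
    high: float

    def sample(self, rng):
        return rng.uniform(self.low, self.high)


@dataclass
class Integer:
    """Hypothetical stand-in: an integer range, both ends inclusive."""
    low: int
    high: int

    def sample(self, rng):
        return rng.randint(self.low, self.high)


@dataclass
class Categorical:
    """Hypothetical stand-in: a finite set of choices."""
    categories: Sequence

    def sample(self, rng):
        return rng.choice(list(self.categories))


# Wrapped values are searched; plain values are held fixed.
search_space = {
    "dropout": Real(0.1, 0.5),
    "units": Integer(32, 256),
    "activation": Categorical(["relu", "tanh"]),
    "batch_size": 64,  # fixed, not searched
}


def sample_params(space, rng):
    """Draw one concrete configuration from the space."""
    return {
        name: dim.sample(rng) if hasattr(dim, "sample") else dim
        for name, dim in space.items()
    }


rng = random.Random(0)
params = sample_params(search_space, rng)
```

Read this way, `Real` is not a type cast: it marks a hyperparameter as searchable and states the bounds of the search.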

I'll keep discussion on PR #196 in its comment section.

Turning to your question on using HH to only do hyperparameter search, could you elaborate a bit? Are you trying to use the OptPros without relying on Environment and CVExperiment, and bypassing the automatic Experiment result matching that goes along with them? Or do you simply want to add some custom functionality as with lambda_callback? I apologize if I'm missing your point.

HunterMcGushion commented 5 years ago

Ah thank you for catching the duplicated text in the "Feature Engineering" docs. It looks like Sphinx is copying the FeatureEngineer.__init__ docstring and displaying it for both the FeatureEngineer class and for the FeatureEngineer.__init__ method. Definitely not intentional, and I'll look into that!

OverLordGoldDragon commented 5 years ago

Regarding the naming, I'd say at least one of the two would prove quite helpful: (1) comments on examples (as in the PR); (2) importing from a submodule, e.g., from search_space import Real, to indicate that 'Real' is related to hyperparameter search rather than type-casting.

To clarify my question - in essence, I already took care of every aspect of training, and wish to use HH only for hyperparameter search - as a sort-of 'drop-in' addon. In pseudo-code,

def do_validation():
    x, y = get_val_data()
    preds = model.predict(x)
    val_loss = loss_fn(y, preds)
    return val_loss

epochs = 0
while epochs < 5:
    times_trained = 0  # reset the batch counter at the start of each epoch
    while times_trained < trains_before_val:
        x, y = get_train_data()  # gets 'next' data, like a generator - but isn't a generator
        train_loss = model.train_on_batch(x, y)
        times_trained += 1
    val_loss = do_validation()
    epochs += 1

Is there anywhere I can 'insert' HH above to do hyperparameter search?
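For reference, the general 'drop-in' pattern I have in mind looks like the sketch below. It uses only the standard library and plain random search. It is not HH's API, and whether HH's optimizers can be driven this way is exactly the question. `train_and_validate` is a hypothetical stand-in for my loop above: it takes a hyperparameter dict and returns a validation loss (here a toy function, so the example runs on its own).

```python
import random


def train_and_validate(params):
    """Hypothetical stand-in for the custom training loop.

    A real version would build the model from `params`, run the
    train/validate loop, and return the final validation loss.
    Here a toy loss with its minimum at lr=0.01, dropout=0.3 is used.
    """
    return (params["lr"] - 0.01) ** 2 + (params["dropout"] - 0.3) ** 2


def random_search(objective, space, n_trials, seed=0):
    """Sample each hyperparameter uniformly from its (low, high) range
    and keep the configuration with the lowest objective value."""
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(n_trials):
        params = {name: rng.uniform(low, high) for name, (low, high) in space.items()}
        loss = objective(params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


# Which hyperparameters to search, and over what range (assumed values).
space = {"lr": (1e-4, 1e-1), "dropout": (0.0, 0.5)}
best_params, best_loss = random_search(train_and_validate, space, n_trials=50)
```

The search only ever sees a function from hyperparameters to a score, so training, validation, checkpointing, and logging all stay inside my own code.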

HunterMcGushion / hyperparameter_hunter

Doc improvement suggestion #195