jsaladich opened 2 months ago
Hi @jsaladich,
`run_times` increases when a trial fails: the oracle runs `self._run_times[trial.trial_id] += 1` on failure. Maybe it should be called `retries`. Any ideas here?
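A rough sketch of that bookkeeping (only `self._run_times[trial.trial_id] += 1` and the `max_retries_per_trial` name come from the actual code; the class and method names below are made up for illustration):

```python
import collections

class OracleSketch:
    """Illustrative only: how the oracle could count attempts for a failing trial."""

    def __init__(self, max_retries_per_trial=0):
        self.max_retries_per_trial = max_retries_per_trial
        self._run_times = collections.defaultdict(int)  # trial_id -> attempts so far

    def _on_trial_failure(self, trial):
        # The line discussed above: the counter grows on *failure*,
        # so it tracks retries, not executions_per_trial.
        self._run_times[trial.trial_id] += 1
        # Retry only while we are within the allowed budget.
        return self._run_times[trial.trial_id] <= self.max_retries_per_trial
```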
For multiple executions of a trial, the values are averaged; there is an `average_trial_executions` flag, `True` by default, to control the logs. The current code returns the following (I added comments):
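As a generic sketch of that averaging idea (made-up numbers and a hypothetical helper; only the `average_trial_executions` name comes from the comment above):

```python
def combine_execution_scores(scores, average_trial_executions=True):
    """Hypothetical helper: reduce per-execution scores to one reported value."""
    if average_trial_executions:
        # Values from the multiple executions of a trial are averaged.
        return sum(scores) / len(scores)
    # Otherwise keep the single best (lower-is-better) execution.
    return min(scores)

# Hypothetical per-execution val_loss values for one trial:
print(combine_execution_scores([0.41, 0.39, 0.44]))         # ~0.413
print(combine_execution_scores([0.41, 0.39, 0.44], False))  # 0.39
```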
Hi @ghsanti, thanks a lot for your exhaustive response, and sorry for the delayed reply.
Before answering, I need to check something: the JSON you just posted shows the loss metric per step (i.e. per epoch). That is great, but as far as I remember (I haven't used KT much recently) the KT engine selects the best step.
My concern is about `executions_per_trial`: I need to know the `best_score_value` for each execution of a trial.
Of course, having full traceability (i.e. metrics per step and per execution) would be the best.
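For concreteness, the kind of record being asked for might look roughly like this (purely illustrative names and numbers, not KerasTuner's actual schema):

```python
# Hypothetical per-trial record with full traceability: one entry per execution,
# each holding the metric at every epoch plus its best value.
trial_record = {
    "trial_id": "05",
    "executions": [
        {"val_loss_per_epoch": [0.62, 0.48, 0.41], "best_score_value": 0.41},
        {"val_loss_per_epoch": [0.60, 0.47, 0.39], "best_score_value": 0.39},
        {"val_loss_per_epoch": [0.65, 0.50, 0.44], "best_score_value": 0.44},
    ],
    # The single summary score could then be the mean (or the best) of the three.
    "score": (0.41 + 0.39 + 0.44) / 3,
}
```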
Please let me know if we are on the same page.
Thanks a lot!
@jsaladich
Working on a PR (see below); feel free to comment.
Hi @ghsanti, amazing job, and sorry for not following up (but believe me, I read you). I would need a week to run an experiment again so I can tell you accurately about my user experience and expectations. Would you mind waiting, or do you need an answer ASAP?
Thanks! No rush @jsaladich; do it whenever you can, and if you can't, that's fine as well.
@ghsanti I would never miss such an opportunity!!
Hi @ghsanti, sorry for the delay; I just ran some dummy KT optimizations:
As of version 1.4.7 we have the `score` as a single value, which is repeated from the metrics value (`val_loss` or `loss`, depending on user selection).
Assuming the user asked for `executions_per_trial = 10`, the metrics are missing 10 `val_loss` and `loss` values that could be very useful. Adding the full history of training epochs (which I believe is what you suggested in your JSON with the structure in `observations`) would also be a nice-to-have feature.
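For reference, this is roughly how one can inspect what a trial currently stores (the path and key names below are a sketch and may differ between KerasTuner versions):

```python
import json

# Sketch: peek at what a trial.json records today (path is hypothetical).
with open("my_dir/my_project/trial_05/trial.json") as f:
    trial = json.load(f)

print(trial.get("score"))      # a single value per trial
print(trial.get("best_step"))  # the step the score was taken from
# With executions_per_trial > 1, only the combined value shows up in the
# metrics observations, not the 10 separate val_loss / loss values.
print(json.dumps(trial.get("metrics"), indent=2))
```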
In `oracle.json` I have seen a confusing key, `run_times`. At first it seemed related to `executions_per_trial`, but then I saw there is another parameter, `max_retries_per_trial`. If I understand properly, `run_times` is the actual retry count (bounded by `max_retries_per_trial`) for each experiment, not the number of `executions_per_trial`. It would be nice to have a record of `executions_per_trial` in the oracle too.
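To make the distinction concrete, both knobs can be set when building the tuner (a minimal sketch; `build_model` and the numbers are placeholders):

```python
import keras_tuner

tuner = keras_tuner.BayesianOptimization(
    build_model,                 # placeholder hypermodel function
    objective="val_loss",
    max_trials=20,
    executions_per_trial=3,      # re-fit the same hyperparameters 3 times
    max_retries_per_trial=2,     # extra attempts only when a trial *fails*
    directory="my_dir",
    project_name="my_project",
)
# run_times in oracle.json tracks the retry attempts (the second knob),
# not the three executions per trial.
```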
Finally, I understand that `executions_per_trial` is the number of re-fits / re-trainings within a given trial. But then there is also the number of re-predicts, that is: for each trial and for each execution, how many times should the algorithm predict the output? That way we could get a full benchmark of the uncertainty of the network being optimized.
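One way to read that re-predict idea (my interpretation, not an existing KerasTuner option): repeat inference several times per execution and report the spread, e.g. with stochastic forward passes:

```python
import numpy as np

def predict_with_spread(model, x, n_predict=10):
    """Sketch: run inference n_predict times and report mean and spread.

    With a fully deterministic model every repeat is identical; the spread
    only becomes informative with a stochastic forward pass (e.g. Monte
    Carlo dropout via training=True) or other randomness at inference time.
    """
    preds = np.stack([np.asarray(model(x, training=True)) for _ in range(n_predict)])
    return preds.mean(axis=0), preds.std(axis=0)
```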
Let me know if my explanation is understandable!
Thanks a lot for your time and patience!
Hi, the changes are in my fork only; they won't be wanted here because I removed all backwards compatibility. They may still want to support it here (but I don't see anyone replying).
The fork targets `keras>=3.5` and `tf>=2.17`. (Note that it's not finished, but it may work for simple projects.)
Here I included sample outputs. (I think some of the points you mentioned are fixed.)
Yes, I believe that will help any user of KT a lot.
Perhaps another topic (one that might require more dev work) is the number of times that `model.predict()` should be run within the same execution for a given trial.
But, long story short, this is a nice implementation given the current state of KT! Thanks a lot @ghsanti !!! Kudos to you!
P.S.: Quick question, shouldn't the `score` in your sample JSON (i.e. https://github.com/ghsanti/keras-tuner/blob/main/example-results/test/trial_05/trial.json) match the trial's average over the 3 executions instead of the trial's best execution (as it is now)?
You are welcome 🤗
> P.S.: Quick question, shouldn't the `score` in your sample JSON (i.e. https://github.com/ghsanti/keras-tuner/blob/main/example-results/test/trial_05/trial.json) match the trial's average over the 3 executions instead of the trial's best execution (as it is now)?
That's a valid point. Currently it's logged that way for simplicity, i.e. it just keeps one best-overall value (until I get the rest working reliably); I'll take a closer look at it during the next week, once I fix some failing tests.
Feel free to open a discussion or issue in my fork as well, for any other changes.
Hi KerasTuner team!

**Describe the bug**
I ran an experiment with `keras_tuner.BayesianOptimization` in which `executions_per_trial=3`. When I check the file `./oracle.json`, I see that the field `run_times` is always equal to 1. Moreover, the `./../trial.json` files of each trial only return 1 best score and a single value in metrics.

**Expected behavior**
I would expect two things to behave differently:

- The `oracle.json` file should report each trial with `run_times=3` if the user requested `executions_per_trial=3` in the configuration.
- The `trial.json` file should contain a list of length `executions_per_trial` with the scores / metrics for each execution of the trial, so the user can analyze the algorithm better.

Am I missing something, or is this how it works? Thanks!
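For completeness, a minimal setup along the lines described (the hypermodel, data, and directory names are placeholders):

```python
import keras
import keras_tuner

def build_model(hp):
    # Placeholder search space just to drive the tuner.
    model = keras.Sequential([
        keras.layers.Dense(hp.Int("units", 8, 64, step=8), activation="relu"),
        keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse", metrics=["mae"])
    return model

tuner = keras_tuner.BayesianOptimization(
    build_model,
    objective="val_loss",
    max_trials=5,
    executions_per_trial=3,  # expectation: 3 scores / metric sets per trial
    directory="my_dir",
    project_name="my_project",
)
# tuner.search(x_train, y_train, epochs=5, validation_data=(x_val, y_val))
# Then inspect my_dir/my_project/oracle.json and each trial.json.
```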