Open AwePhD opened 2 months ago
I had misunderstood the multi-fidelity directories. When NePS uses multi-fidelity, it creates folders like `config_{number_config}_{number_fidelity}`, so `pipeline_directory` and `previous_pipeline_directory` are always different. Therefore, `previous_pipeline_directory` is intuitive.
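For illustration, the naming scheme above can be sketched with a small helper. Only the `config_{number_config}_{number_fidelity}` pattern comes from the thread; the `fidelity_dir` function itself is a hypothetical stand-in, not a NePS API:

```python
from pathlib import Path

def fidelity_dir(root: Path, config_id: int, fidelity: int) -> Path:
    # Hypothetical helper mirroring the observed scheme: each
    # (config, fidelity) pair gets its own folder.
    return root / f"config_{config_id}_{fidelity}"

root = Path("results")
current = fidelity_dir(root, config_id=3, fidelity=2)
previous = fidelity_dir(root, config_id=3, fidelity=1)
print(current.name)   # config_3_2
print(previous.name)  # config_3_1
# The two directories never coincide, even for the same config.
assert current != previous
```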
I renamed the issue to ask for better documentation of the directory structure when performing multi-fidelity optimization. I think it might be clearer that way: in my opinion, it is hard to predict this structure beforehand, so an illustration with a few paragraphs might guide new users well. The current documentation does make it straightforward to see that two `run_pipeline` calls use different directories; it's just a bit vague to me.
Sorry for the delay in response. Not sure why my notifications for this library are disabled -_- Honestly appreciate the feedback and we'll try to get back to you sooner!
Glad you understood it in the end and yes your interpretation is correct. The main reason to have it in different folders is lost to time but it does make logging of configurations and results much easier to post-process, which is how the library originally was benchmarked. It also helps a bit with paths for file locking (how the parallelism works with arbitrary number of workers), preventing some edge cases.
Thanks for the issue and we'll keep it on the todo list. Right now, a lot of the internals are being revamped to make the library more performant, usable and lean. One thing that will be revisited is how we handle multi-fidelity. I imagine we'll likely keep the same folder structure, and we can document it as such once it's done, including the specifics of the previous pipeline directory.
Some extras:
- We'd like to explore many-fidelity soon, i.e. not just scaling epochs but also something like depth/width.
- One benefit of the current pipeline-directory approach (as opposed to re-using the directory) is that in a many-fidelity setup we may ask the user to load a model from an arbitrary checkpoint, and the `{config_id}_{fidelity}` naming scheme no longer makes sense.
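One way to see why a single trailing fidelity number breaks down with several fidelity dimensions: purely as a hypothetical sketch (this is not a NePS proposal), each dimension would need to be encoded by name to stay unambiguous:

```python
def many_fidelity_dirname(config_id: int, **fidelities: int) -> str:
    # Hypothetical: with more than one fidelity (e.g. epochs AND depth),
    # a bare suffix like "config_3_10" is ambiguous, so each dimension
    # is spelled out by name. Names are sorted for a stable ordering.
    parts = [f"{name}-{value}" for name, value in sorted(fidelities.items())]
    return "_".join([f"config_{config_id}", *parts])

print(many_fidelity_dirname(3, epochs=10, depth=4))
# config_3_depth-4_epochs-10
```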
Many thanks for the feedback; I was not sure about the relevance of my issue. Keep up the good work!
Hi,
I have a question about the choice of the argument `previous_pipeline_directory` in `run_pipeline`.
I browsed the code, and it seems that the optimizer is responsible for retrieving the previous trial, since it is the `Optimizer`'s responsibility to sample trials. My question, though, is why the argument is not something like `has_previous_fidelity_trial` of type `bool`. I do not know how you manage the workers and the multiprocessing for distributed HPO, so maybe in some situations the directory of the previous (fidelity) trial of the same config is not the same as the current one's? Or maybe there is a more profound reason that I am not aware of.
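To illustrate what passing the directory itself (rather than a boolean) enables, here is a minimal, self-contained sketch of the resume pattern. The keyword argument names follow the thread, but the JSON checkpoint format, the toy return value, and everything else are invented for illustration; this is not NePS's actual implementation:

```python
import json
import tempfile
from pathlib import Path
from typing import Optional

def run_pipeline(pipeline_directory: Path,
                 previous_pipeline_directory: Optional[Path],
                 epochs: int) -> int:
    """Toy training stub: resume an epoch counter from the previous
    fidelity's checkpoint (if any), then write our own checkpoint."""
    start_epoch = 0
    if previous_pipeline_directory is not None:
        ckpt = previous_pipeline_directory / "checkpoint.json"
        if ckpt.exists():
            start_epoch = json.loads(ckpt.read_text())["epoch"]
    # ... real training would run from start_epoch up to `epochs` here ...
    pipeline_directory.mkdir(parents=True, exist_ok=True)
    (pipeline_directory / "checkpoint.json").write_text(
        json.dumps({"epoch": epochs})
    )
    return epochs - start_epoch  # epochs actually trained in this rung

root = Path(tempfile.mkdtemp())
# Rung 1: no previous fidelity, trains epochs 0..3.
first = run_pipeline(root / "config_1_1", None, epochs=3)
# Rung 2: resumes from rung 1's checkpoint, trains epochs 3..9.
second = run_pipeline(root / "config_1_2", root / "config_1_1", epochs=9)
print(first, second)  # 3 6
```

With a plain boolean, the pipeline function would have to reconstruct the previous directory's path itself, which would couple user code to the internal naming scheme; passing the path keeps that scheme an implementation detail.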
Note that I am not an HPO practitioner, so my understanding of NePS and PriorBand is fairly limited. I just want to apply HPO to a deep learning model for my research.
The question is more of a sanity check that I have correctly understood the documentation about multi-fidelity. The most relevant pieces of documentation I found are this subsection and the multi-fidelity page. Maybe a dedicated didactic page on multi-fidelity would be good? The two examples are rich and simple, which is very good, but they might be a bit rough to grasp from a DL perspective, i.e. for someone not familiar with multi-fidelity HPO (SH, HB, PB, ...). Or maybe it's just my personal lack of understanding.
Best, Mathias.