Closed jonathanburns closed 3 years ago
Dear Jonathan,
Thank you very much for your interest in AutoPyTorch. I think your ultimate goal would be to work on HpBandSter and not directly on AutoPyTorch. However, please note that we don't maintain HpBandSter anymore, but our collaboration partner Bosch will release a new version of HpBandSter (in a new repo) within the next few months. I think you should talk with them then.
(Of course, we will also have a major update on AutoPyTorch then.)
Best, Marius
Thanks @mlindauer
I'd love to sync with you on the upcoming changes, if you're willing to share details. I'm hoping to contribute to this project. I'd also love to sync with Bosch if they have the time and are looking for adopters :)
Sorry, I'm not involved in the new HpBandSter. (My focus is more on SMAC3, among other things.)
@mlindauer Thanks, SMAC seems very interesting; I'm going to look further into that. I'm actually interested in the upcoming changes to AutoPyTorch. I'm exploring making a large time investment in this project, and would love to know more about the direction.
I'd love to give you a demo so that you can see it running on Flyte. The demo looks very promising so far.
Hello all, ✋
I really loved your amazing book and the work you’ve done here. I work on the “Flyte” project at Lyft and I’ve been playing around with adapting AutoPyTorch to work on Flyte.
I’m very passionate about open-source ML workflows (see here to understand what I'm doing).
I sort of hacked the AutoPyTorch codebase to prove that it can work on Flyte in a distributed way. Now that I’ve done that, I’m interested in doing it the right way, and I’d love to chat with someone to discuss.
Flyte handles a lot of the overhead of distributed systems (ensuring data gets from one task to the next), so I modified AutoPyTorch to remove all the network calls and master/worker logic.
The new logic no longer needs a master/worker setup. Instead it runs a series of ephemeral containers, and Flyte handles moving the data from one task to the next.
The logic looks something like this (where each "-" step is a separate container):
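To make the idea concrete, here's a minimal, self-contained Python sketch of that kind of task chain. The function names and the stubbed "training" metric are purely illustrative (they're not AutoPyTorch's or Flyte's actual API); in the real setup each function would be a Flyte task running in its own ephemeral container, with Flyte shipping the returned values to the next task:

```python
import random

def sample_configs(n, seed=0):
    """Step 1: sample n hyperparameter configurations (illustrative)."""
    rng = random.Random(seed)
    return [
        {"lr": 10 ** rng.uniform(-4, -1), "batch_size": rng.choice([32, 64, 128])}
        for _ in range(n)
    ]

def evaluate_config(config, budget):
    """Steps 2..n+1: train/evaluate one config for a given budget.

    This is a stand-in for a real training run. Because each call would be
    its own container, a failure here surfaces as a single failed task in
    the Flyte UI, pinpointing exactly which set of HPs failed.
    """
    loss = config["lr"] * budget  # placeholder metric, not real training
    return {"config": config, "loss": loss}

def select_best(results):
    """Final step: pick the best configuration from the gathered results."""
    return min(results, key=lambda r: r["loss"])

if __name__ == "__main__":
    configs = sample_configs(4)
    results = [evaluate_config(c, budget=10) for c in configs]
    print(select_best(results)["config"])
```

The point of the sketch is just that every step is a pure function of its inputs: no sockets, no long-lived master process, so the orchestrator (Flyte) can schedule each step as a throwaway container and handle the data movement itself.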
In Flyte, you also get lots of visibility into where failures happen in a nice UI. It’s pretty cool to see it all working. I can tell exactly which set of HPs failed (if any).
Unfortunately, the master/worker logic is pretty baked into AutoPyTorch (and HpBandSter) right now. I had to do some very hacky things to get the POC working. I’m hoping I can chat with someone about how I can do this more properly.