superlinear-ai / poetry-cookiecutter

🍪 Poetry Cookiecutter is a modern Cookiecutter template for scaffolding Python packages and apps
GNU Affero General Public License v3.0

Adding Terraform to Azure App Service setup #10

Open Jerenaux opened 2 years ago

Jerenaux commented 2 years ago

For the Darts demo, Tanguy added code to provision the app with Terraform through GitLab CI/CD.

Would it be valuable to have an Azure App Service Terraform setup included in Cookiecutter (maybe as an optional choice when initializing a new project)? I see a lot of value in having an automated way to deploy e.g. Streamlit apps to App Service, but maybe it’s too much, wdyt?
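As a rough illustration of the "optional choice" idea, here is a minimal sketch of the logic a Cookiecutter post-generation hook (`hooks/post_gen_project.py`) could use to drop the Terraform files when the user opts out. The `with_terraform` option and the `terraform/` directory are hypothetical names chosen for this example; they are not part of Poetry Cookiecutter today.

```python
"""Sketch of logic for a Cookiecutter post-generation hook (assumed layout)."""
import shutil
from pathlib import Path

# Hypothetical directory name: assumes the template ships its Terraform
# files in a top-level `terraform/` folder of the generated project.
TERRAFORM_DIR = "terraform"


def prune_terraform(project_dir: Path, with_terraform: bool) -> bool:
    """Delete the Terraform directory when the option is disabled.

    Returns True if a directory was removed, False otherwise.
    """
    target = project_dir / TERRAFORM_DIR
    if with_terraform or not target.is_dir():
        # Either the user wants Terraform, or there is nothing to remove.
        return False
    shutil.rmtree(target)
    return True
```

In an actual hook, Cookiecutter renders the script with Jinja and runs it from the root of the generated project, so the call could look like `prune_terraform(Path.cwd(), "{{ cookiecutter.with_terraform }}" == "yes")`, again assuming a `with_terraform` choice is declared in `cookiecutter.json`.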

lsorber commented 2 years ago

I think that's a great idea: there's definitely a gap that Poetry Cookiecutter doesn't fill yet with regard to infrastructure provisioning. Another gap that we can fill is being able to manage multiple packages simultaneously, such as an API server and its associated API client, i.e. a 'mini' monorepo.

That being said, I think we should discuss (on- or offline) whether we should include this functionality as part of Poetry Cookiecutter, or on a different level.

lsorber commented 1 year ago

@JWuzyk let’s give this some thought. I’d like to see how we can make infra easier. For AI we also have to consider the training phase, which I’d like to be able to run on on-demand cloud infrastructure.

JWuzyk commented 1 year ago

Would indeed be nice to get infra for free as well. Adds a lot of complexity to an already complex cookiecutter though. Especially since you'd probably want a few different deployment options. Some example code with setup instructions would probably be almost as good. Also would be easier to add to existing projects.

For ML training I see a few options:

  1. Training in CI - Remote training could be handled by the same infra running our CI pipelines. We could add special training runners by extending the Runners Terraform without adding too much to the cookiecutter directly. CML looks like another easy way to do it. We'd probably also need to integrate DVC (more infra for storage) or something similar to manage data. Pretty clunky to use for experimentation though IMO.
  2. Use an ML/pipelining framework - Many frameworks allow you to define pipelines and easily run these on cloud resources just by specifying runners. Some also automate deployment, so App Service wouldn't be necessary in many cases. Could be really powerful but requires us to commit to a framework. We'd also give up the level of customisation we have when managing infra ourselves.
  3. Custom infra - A basic version of this would be including setup for using a VM with a GPU (very easy to do with AzureML) or using AzureML experiments purely as a task runner. More complicated would be something like scripts to send jobs to Azure Batch (I think Jerome already did this?). Probably the most work but could be very flexible. Could even build a basic package out of it and import it in the cookiecutter.

lsorber commented 1 year ago

Great, thanks for the input, I'll plan a meeting with you to discuss next week!