neo4j-partners / hands-on-lab-neo4j-and-bedrock

Hands on lab for Neo4j and Amazon Bedrock
Apache License 2.0

Autopilot Run Time #2

Closed: benofben closed this 10 months ago

benofben commented 2 years ago

The Autopilot job is taking about an hour to run. We reduced the jobs from 5 to 3. We also tried setting the time-based stopping parameters, but that seems to cause some outputs not to be printed, so that isn't going to work.
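For reference, a minimal sketch of the knobs discussed above, assuming the job is launched through boto3's `create_auto_ml_job` and that "jobs" here means the candidate limit. The helper name `build_automl_job_config` is hypothetical; the `CompletionCriteria` keys are the ones boto3 documents for `AutoMLJobConfig`. This is not necessarily how the lab notebook sets them.

```python
# Hypothetical helper (not from the lab notebooks): builds the
# AutoMLJobConfig portion of a boto3 create_auto_ml_job request.
def build_automl_job_config(max_candidates=3,
                            max_job_runtime_s=None,
                            max_training_job_runtime_s=None):
    # MaxCandidates caps how many candidates Autopilot explores
    # (assumed to be the "5 to 3" reduction mentioned above).
    criteria = {"MaxCandidates": max_candidates}
    # The two time-based stopping parameters are optional; they are only
    # included when explicitly requested.
    if max_job_runtime_s is not None:
        criteria["MaxAutoMLJobRuntimeInSeconds"] = max_job_runtime_s
    if max_training_job_runtime_s is not None:
        criteria["MaxRuntimePerTrainingJobInSeconds"] = max_training_job_runtime_s
    return {"CompletionCriteria": criteria}


# Candidate limit only, with no time-based stopping.
config = build_automl_job_config(max_candidates=3)
```

The resulting dict would be passed as `AutoMLJobConfig=config` to `create_auto_ml_job`, alongside the required job name, input/output data config, and role ARN.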

Rumi suggested two other things: (1) size up the machine, and (2) cut the dataset down.

I'll try those and see how quick we can get it.

benofben commented 2 years ago

I truncated the data down to 10k rows and the runtime remained unchanged. Per a conversation with Rumi, it seems the runtime is dominated by deployment of machines, not actually processing.
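A rough sketch of that kind of truncation, assuming the training data is a CSV file with a header row. The function name `truncate_csv` and the file paths are hypothetical; the actual lab may have truncated the data differently.

```python
import csv
from itertools import islice


def truncate_csv(src_path, dst_path, max_rows=10_000):
    """Copy the header plus at most max_rows data rows; return rows written."""
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        # Assumes the first row is a header; an empty file yields no output rows.
        header = next(reader, None)
        if header is None:
            return 0
        writer.writerow(header)
        written = 0
        # islice stops after max_rows rows without reading the rest of the file.
        for row in islice(reader, max_rows):
            writer.writerow(row)
            written += 1
        return written
```

For example, `truncate_csv("train.csv", "train_10k.csv")` would keep the header and the first 10,000 data rows (file names are placeholders).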

I'm not sure how to scale the Autopilot infrastructure up. It seems serverless. Open question...

The only thing that remains, per the conversation with Rumi, is to switch to another algorithm. I'd rather not, as Autopilot is the key SageMaker feature. I'm going to work with the SageMaker PM to request a lower-runtime Autopilot invocation for validating infrastructure. I think something analogous to terraform plan versus terraform apply would be useful.

benofben commented 2 years ago

I put some notes about leaving the job running into the notebooks.

benofben commented 2 years ago

It looks like maybe the notebook doesn't continue running if the browser is closed.

benofben commented 2 years ago

Resummarizing the issue ---

There are many use cases that require quick run times for building a machine learning model.

To this end, SageMaker Autopilot should offer the ability to train a low-quality model in ~5 minutes. This would enable all of these use cases.

benofben commented 10 months ago

We're stripping Autopilot out of the new version of the lab, so this is now irrelevant.