statmike / vertex-ai-mlops

Google Cloud Platform Vertex AI end-to-end workflows for machine learning operations
Apache License 2.0

Error while Configuring Local Docker to Use GCLOUD CLI #48

Open varshasahasrabuddhe opened 1 year ago

varshasahasrabuddhe commented 1 year ago

Hi Mike, I am working on the notebook "Vertex AI Custom Model - Prophet - Custom Job With Custom Container". The command below throws an error:

!gcloud auth configure-docker {REGION}-docker.pkg.dev --quiet

Error message:

WARNING: docker not in system PATH.
docker and docker-credential-gcloud need to be in the same PATH in order to work correctly together.
gcloud's Docker credential helper can be configured but it will not work until this is corrected.
Adding credentials for: us-west1-docker.pkg.dev
Docker configuration file updated.

Any idea how to go about this?
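The warning above says that `docker` and `docker-credential-gcloud` are not both resolvable on the PATH of the environment running the command. A minimal sketch for checking that from a notebook cell, assuming a Python kernel; the helper name `find_tools` is hypothetical and not part of the repository:

```python
import shutil

def find_tools(names=("docker", "docker-credential-gcloud")):
    """Map each executable name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}

# Report where each tool resolves, if anywhere.
for tool, path in find_tools().items():
    print(f"{tool}: {path or 'NOT FOUND on PATH'}")
```

If `docker` is reported as not found, the credential helper can still be written to the Docker configuration file, as the message says, but it cannot be used from that environment.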

statmike commented 1 year ago

Hi @varshasahasrabuddhe, thank you for reaching out. Were you using a user-managed Workbench instance or a managed Workbench instance? Thank you, Mike

varshasahasrabuddhe commented 1 year ago

Hi Mike, Thank you for your prompt response.

Yes, I figured that out: it works in a user-managed Workbench instance. Any idea why it doesn't work in a managed Workbench instance?

Also, I am now getting the errors below on the docker run command:

!docker run {IMAGE_URI} --PROJECT_ID {PROJECT_ID} --DATANAME {DATANAME} --NOTEBOOK {NOTEBOOK} --horizon {14} --no-yearly

Errors:

model-stock-management applied_forecasting 04f
0%| | 0/12 [00:00<?, ?it/s]
23:01:57 - cmdstanpy - INFO - Chain [1] start processing
23:01:57 - cmdstanpy - INFO - Chain [1] start processing
23:01:57 - cmdstanpy - INFO - Chain [1] start processing
23:01:57 - cmdstanpy - INFO - Chain [1] done processing
23:01:57 - cmdstanpy - ERROR - Chain [1] error: terminated by signal 11 Unknown error -11
0%| | 0/12 [00:00<?, ?it/s]
23:01:57 - cmdstanpy - INFO - Chain [1] done processing
23:01:57 - cmdstanpy - ERROR - Chain [1] error: terminated by signal 11 Unknown error -11
23:01:57 - cmdstanpy - INFO - Chain [1] done processing
23:01:57 - cmdstanpy - ERROR - Chain [1] error: terminated by signal 11 Unknown error -11
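The `terminated by signal 11` and `Unknown error -11` lines in the log above follow the POSIX convention of reporting a child process killed by signal N as return code -N. On Linux, signal 11 is SIGSEGV, which suggests the compiled Stan binary crashed rather than Python raising an ordinary exception. A small sketch for decoding such codes; `describe_returncode` is a hypothetical helper, and it does not identify the cause of the crash:

```python
import signal

def describe_returncode(returncode):
    """Return a readable label for a subprocess-style return code.

    On POSIX, a child killed by signal N is reported as return code -N.
    """
    if returncode < 0:
        try:
            return signal.Signals(-returncode).name
        except ValueError:
            # Number does not correspond to a known signal on this platform.
            return f"unknown signal {-returncode}"
    return f"exit status {returncode}"

print(describe_returncode(-11))
```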

multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/fit/prophet.py", line 48, in run_prophet
    p.fit(series)
  File "/opt/conda/lib/python3.7/site-packages/prophet/forecaster.py", line 1181, in fit
    self.params = self.stan_backend.fit(stan_init, dat, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/prophet/models.py", line 94, in fit
    raise e
  File "/opt/conda/lib/python3.7/site-packages/prophet/models.py", line 90, in fit
    self.stan_fit = self.model.optimize(**args)
  File "/opt/conda/lib/python3.7/site-packages/cmdstanpy/model.py", line 738, in optimize
    raise RuntimeError(msg)
RuntimeError: Error during optimization! Command '/opt/conda/lib/python3.7/site-packages/prophet/stan_model/prophet_model.bin random seed=52644 data file=/tmp/tmptt1na6uf/ouw39fzc.json init=/tmp/tmptt1na6uf/q6k325gk.json output file=/tmp/tmptt1na6uf/prophet_modeli11498bg/prophet_model-20230627230157.csv method=optimize algorithm=newton iter=10000' failed:
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/conda/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/fit/prophet.py", line 55, in <module>
    predictions = list(tqdm(pool.imap(run_prophet, seriesFrames), total = len(seriesFrames)))
  File "/opt/conda/lib/python3.7/site-packages/tqdm/std.py", line 1178, in __iter__
    for obj in iterable:
  File "/opt/conda/lib/python3.7/multiprocessing/pool.py", line 748, in next
    raise value
RuntimeError: Error during optimization! Command '/opt/conda/lib/python3.7/site-packages/prophet/stan_model/prophet_model.bin random seed=52644 data file=/tmp/tmptt1na6uf/ouw39fzc.json init=/tmp/tmptt1na6uf/q6k325gk.json output file=/tmp/tmptt1na6uf/prophet_modeli11498bg/prophet_model-20230627230157.csv method=optimize algorithm=newton iter=10000' failed:
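Because the failure surfaces through `pool.imap` in `/fit/prophet.py`, the traceback only re-raises the first worker error. One way to see whether every series fails or only some is to run the fits one at a time and collect the failures. A minimal sketch under that assumption; `run_serially` and `demo_fit` are hypothetical stand-ins, and inside the container one would pass `run_prophet` and `seriesFrames` (the names shown in the traceback) instead:

```python
def run_serially(func, items):
    """Call func on each item in turn, collecting results and failures separately."""
    results, failures = [], []
    for index, item in enumerate(items):
        try:
            results.append(func(item))
        except Exception as exc:
            # Keep going so that every failing item is reported, not just the first.
            failures.append((index, repr(exc)))
    return results, failures

def demo_fit(x):
    """Stand-in for a model fit: fails on negative input."""
    if x < 0:
        raise RuntimeError("Error during optimization!")
    return x * 2

results, failures = run_serially(demo_fit, [1, -1, 3])
print(results)
print(failures)
```

If all series fail the same way, the problem is more likely in the container environment than in any particular series.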