mlops-labs-team1 / engineering.labs

Project completed for MLOps Engineering Lab #1 by Team #1. See our wiki for more info:
https://github.com/mlops-labs-team1/engineering.labs/wiki

Dockerize TorchServe #6

anayden closed this issue 3 years ago

anayden commented 3 years ago

Building a Dockerfile for TorchServe

anayden commented 3 years ago

Building the image:

docker build -t labs1-mlflow-torch --build-arg GCP_CREDS_JSON_BASE64="$(cat gcp.json.b64)" -f ./Dockerfile-torchserve .
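The build command reads the credentials from `gcp.json.b64`, but the thread does not show how that file is produced. A minimal sketch, assuming it holds the base64 encoding of a service-account key file named `gcp.json` (both the file name and the helper `encode_credentials` are hypothetical, and how `Dockerfile-torchserve` consumes the build argument is not shown here):

```python
import base64
import json
from pathlib import Path


def encode_credentials(json_path: str, out_path: str) -> str:
    """Base64-encode a JSON key file onto a single line and write it to out_path."""
    raw = Path(json_path).read_bytes()
    json.loads(raw)  # fail early if the input is not valid JSON
    # b64encode emits no line breaks, so the result is safe to pass via "$(cat ...)".
    encoded = base64.b64encode(raw).decode("ascii")
    Path(out_path).write_text(encoded)
    return encoded
```

Calling `encode_credentials("gcp.json", "gcp.json.b64")` once before `docker build` would create the file that the command above expects.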

Running the container:

❯ docker run -e MODEL_NAME=/BertModel/5 -p 8080:8080 -p 8081:8081 -p 8082:8082 -it --rm labs1-mlflow-torch
2021-01-24 16:21:56,338 [INFO ] main org.pytorch.serve.ModelServer -
Torchserve version: 0.3.0
TS Home: /usr/local/lib/python3.8/site-packages
Current directory: /opt/mlflow
Temp directory: /tmp
Number of GPUs: 0
Number of CPUs: 2
Max heap size: 986 M
Python executable: /usr/local/bin/python
Config file: N/A
Inference address: http://127.0.0.1:8080
Management address: http://127.0.0.1:8081
Metrics address: http://127.0.0.1:8082
Model Store: /opt/mlflow/model_store
Initial Models: N/A
Log dir: /opt/mlflow/logs
Metrics dir: /opt/mlflow/logs
Netty threads: 0
Netty client threads: 0
Default workers per model: 2
Blacklist Regex: N/A
Maximum Response Size: 6553500
Maximum Request Size: 6553500
Prefer direct buffer: false
Allowed Urls: [file://.*|http(s)?://.*]
Custom python dependency for model allowed: false
Metrics report format: prometheus
Enable metrics API: true
2021-01-24 16:21:56,412 [INFO ] main org.pytorch.serve.ModelServer - Initialize Inference server with: EpollServerSocketChannel.
2021-01-24 16:21:56,940 [INFO ] main org.pytorch.serve.ModelServer - Inference API bind to: http://127.0.0.1:8080
2021-01-24 16:21:56,942 [INFO ] main org.pytorch.serve.ModelServer - Initialize Management server with: EpollServerSocketChannel.
2021-01-24 16:21:56,954 [INFO ] main org.pytorch.serve.ModelServer - Management API bind to: http://127.0.0.1:8081
2021-01-24 16:21:56,955 [INFO ] main org.pytorch.serve.ModelServer - Initialize Metrics server with: EpollServerSocketChannel.
2021-01-24 16:21:56,962 [INFO ] main org.pytorch.serve.ModelServer - Metrics API bind to: http://127.0.0.1:8082
Model server started.
2021-01-24 16:21:59,415 [INFO ] epollEventLoopGroup-3-1 ACCESS_LOG - /127.0.0.1:55134 "GET /models/news_classification/all HTTP/1.1" 404 32
2021-01-24 16:21:59,418 [INFO ] epollEventLoopGroup-3-1 TS_METRICS - Requests4XX.Count:1|#Level:Host|#hostname:3124802d1ff0,timestamp:null
/opt/mlflow/model_store/news_classification.mar file generated successfully
2021-01-24 16:23:48,425 [INFO ] epollEventLoopGroup-3-2 org.pytorch.serve.archive.ModelArchive - eTag b1668e30d1fe4516a2f80cff81c631dd
2021-01-24 16:23:48,458 [DEBUG] epollEventLoopGroup-3-2 org.pytorch.serve.wlm.ModelVersionedRefs - Adding new version 1.0 for model news_classification
2021-01-24 16:23:48,458 [DEBUG] epollEventLoopGroup-3-2 org.pytorch.serve.wlm.ModelVersionedRefs - Setting default version to 1.0 for model news_classification
2021-01-24 16:23:48,459 [INFO ] epollEventLoopGroup-3-2 org.pytorch.serve.wlm.ModelManager - Model news_classification loaded.
2021-01-24 16:23:48,462 [DEBUG] epollEventLoopGroup-3-2 org.pytorch.serve.wlm.ModelManager - updateModel: news_classification, count: 1
2021-01-24 16:23:48,738 [INFO ] W-9000-news_classification_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Listening on port: /tmp/.ts.sock.9000
2021-01-24 16:23:48,740 [INFO ] W-9000-news_classification_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - [PID]46
2021-01-24 16:23:48,742 [INFO ] W-9000-news_classification_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Torch worker started.
2021-01-24 16:23:48,744 [DEBUG] W-9000-news_classification_1.0 org.pytorch.serve.wlm.WorkerThread - W-9000-news_classification_1.0 State change null -> WORKER_STARTED
2021-01-24 16:23:48,744 [INFO ] W-9000-news_classification_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Python runtime: 3.8.1
2021-01-24 16:23:48,758 [INFO ] W-9000-news_classification_1.0 org.pytorch.serve.wlm.WorkerThread - Connecting to: /tmp/.ts.sock.9000
2021-01-24 16:23:48,860 [INFO ] W-9000-news_classification_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Connection accepted: /tmp/.ts.sock.9000.
2021-01-24 16:23:52,370 [INFO ] W-9000-news_classification_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Backend worker process died.
2021-01-24 16:23:52,371 [INFO ] W-9000-news_classification_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Traceback (most recent call last):
2021-01-24 16:23:52,371 [INFO ] W-9000-news_classification_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/usr/local/lib/python3.8/site-packages/ts/model_service_worker.py", line 182, in <module>
2021-01-24 16:23:52,372 [INFO ] W-9000-news_classification_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     worker.run_server()
2021-01-24 16:23:52,373 [INFO ] W-9000-news_classification_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/usr/local/lib/python3.8/site-packages/ts/model_service_worker.py", line 154, in run_server
2021-01-24 16:23:52,373 [INFO ] W-9000-news_classification_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     self.handle_connection(cl_socket)
2021-01-24 16:23:52,374 [INFO ] W-9000-news_classification_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/usr/local/lib/python3.8/site-packages/ts/model_service_worker.py", line 116, in handle_connection
2021-01-24 16:23:52,375 [INFO ] W-9000-news_classification_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     service, result, code = self.load_model(msg)
2021-01-24 16:23:52,376 [INFO ] W-9000-news_classification_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/usr/local/lib/python3.8/site-packages/ts/model_service_worker.py", line 89, in load_model
2021-01-24 16:23:52,377 [INFO ] W-9000-news_classification_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     service = model_loader.load(model_name, model_dir, handler, gpu, batch_size, envelope)
2021-01-24 16:23:52,377 [INFO ] W-9000-news_classification_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/usr/local/lib/python3.8/site-packages/ts/model_loader.py", line 104, in load
2021-01-24 16:23:52,378 [INFO ] W-9000-news_classification_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     initialize_fn(service.context)
2021-01-24 16:23:52,378 [INFO ] W-9000-news_classification_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/usr/local/lib/python3.8/site-packages/ts/model_loader.py", line 131, in <lambda>
2021-01-24 16:23:52,387 [INFO ] W-9000-news_classification_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     initialize_fn = lambda ctx: entry_point(None, ctx)
2021-01-24 16:23:52,391 [INFO ] W-9000-news_classification_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/tmp/models/b1668e30d1fe4516a2f80cff81c631dd/news_classifier_handler.py", line 133, in handle
2021-01-24 16:23:52,387 [INFO ] epollEventLoopGroup-5-1 org.pytorch.serve.wlm.WorkerThread - 9000 Worker disconnected. WORKER_STARTED
2021-01-24 16:23:52,392 [INFO ] W-9000-news_classification_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     _service.initialize(context)
2021-01-24 16:23:52,393 [INFO ] W-9000-news_classification_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/tmp/models/b1668e30d1fe4516a2f80cff81c631dd/news_classifier_handler.py", line 51, in initialize
2021-01-24 16:23:52,394 [DEBUG] W-9000-news_classification_1.0 org.pytorch.serve.wlm.WorkerThread - System state is : WORKER_STARTED
2021-01-24 16:23:52,394 [INFO ] W-9000-news_classification_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     self.model = torch.load(model_pt_path, map_location=self.device)
2021-01-24 16:23:52,395 [INFO ] W-9000-news_classification_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/usr/local/lib/python3.8/site-packages/torch/serialization.py", line 594, in load
2021-01-24 16:23:52,396 [INFO ] W-9000-news_classification_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
2021-01-24 16:23:52,396 [INFO ] W-9000-news_classification_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -   File "/usr/local/lib/python3.8/site-packages/torch/serialization.py", line 853, in _load
2021-01-24 16:23:52,397 [INFO ] W-9000-news_classification_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle -     result = unpickler.load()
2021-01-24 16:23:52,398 [INFO ] W-9000-news_classification_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - TypeError: an integer is required (got type bytes)
2021-01-24 16:23:52,402 [DEBUG] W-9000-news_classification_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker monitoring thread interrupted or backend worker process died.
java.lang.InterruptedException
        at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2056)
        at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2133)
        at java.base/java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:432)
        at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:188)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)
2021-01-24 16:23:52,425 [WARN ] W-9000-news_classification_1.0 org.pytorch.serve.wlm.BatchAggregator - Load model failed: news_classification, error: Worker died.
2021-01-24 16:23:52,426 [DEBUG] W-9000-news_classification_1.0 org.pytorch.serve.wlm.ModelVersionedRefs - Removed model: news_classification version: 1.0
2021-01-24 16:23:52,436 [DEBUG] W-9000-news_classification_1.0 org.pytorch.serve.wlm.WorkerThread - W-9000-news_classification_1.0 State change WORKER_STARTED -> WORKER_SCALED_DOWN
2021-01-24 16:23:52,437 [WARN ] W-9000-news_classification_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - terminateIOStreams() threadName=W-9000-news_classification_1.0-stderr
2021-01-24 16:23:52,438 [WARN ] W-9000-news_classification_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - terminateIOStreams() threadName=W-9000-news_classification_1.0-stdout
2021-01-24 16:23:52,441 [WARN ] W-9000-news_classification_1.0 org.pytorch.serve.wlm.WorkLoadManager - WorkerThread interrupted during waitFor, possible async resource cleanup.
2021-01-24 16:23:52,442 [DEBUG] W-9000-news_classification_1.0 org.pytorch.serve.wlm.ModelVersionedRefs - Adding new version 1.0 for model news_classification
2021-01-24 16:23:52,444 [DEBUG] W-9000-news_classification_1.0 org.pytorch.serve.wlm.WorkerThread - W-9000-news_classification_1.0 State change WORKER_SCALED_DOWN -> WORKER_STOPPED
2021-01-24 16:23:52,445 [WARN ] W-9000-news_classification_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - terminateIOStreams() threadName=W-9000-news_classification_1.0-stderr
2021-01-24 16:23:52,446 [WARN ] W-9000-news_classification_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - terminateIOStreams() threadName=W-9000-news_classification_1.0-stdout
2021-01-24 16:23:52,447 [DEBUG] W-9000-news_classification_1.0 org.pytorch.serve.wlm.WorkerThread - Worker terminated due to scale-down call.
2021-01-24 16:23:52,450 [INFO ] epollEventLoopGroup-3-2 ACCESS_LOG - /127.0.0.1:55178 "POST /models?url=/opt/mlflow/model_store/news_classification.mar&initial_workers=1 HTTP/1.1" 500 26497
2021-01-24 16:23:52,452 [INFO ] epollEventLoopGroup-3-2 TS_METRICS - Requests5XX.Count:1|#Level:Host|#hostname:3124802d1ff0,timestamp:null
Traceback (most recent call last):
  File "/usr/local/bin/mlflow", line 8, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/mlflow/deployments/cli.py", line 134, in create_deployment
    deployment = client.create_deployment(name, model_uri, flavor, config=config_dict)
  File "/usr/local/lib/python3.8/site-packages/mlflow_torchserve/__init__.py", line 114, in create_deployment
    self.__register_model(
  File "/usr/local/lib/python3.8/site-packages/mlflow_torchserve/__init__.py", line 408, in __register_model
    raise Exception("Unable to register the model")
Exception: Unable to register the model
2021-01-24 16:23:52,492 [INFO ] W-9000-news_classification_1.0-stderr org.pytorch.serve.wlm.WorkerLifeCycle - Stopped Scanner - W-9000-news_classification_1.0-stderr
2021-01-24 16:23:52,497 [INFO ] W-9000-news_classification_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Stopped Scanner - W-9000-news_classification_1.0-stdout
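In the log above, the Python worker's traceback is interleaved with Java-side lines from other threads, which makes the root cause easy to miss: the worker dies in the handler's `initialize` when `torch.load` raises `TypeError: an integer is required (got type bytes)`, and the `500` response and MLflow's "Unable to register the model" follow from that. The thread does not record the cause; a mismatch between the Python version that pickled the model and the 3.8.1 runtime shown in the log is one possibility worth checking, though that is an assumption. The sketch below is a hypothetical helper (not part of TorchServe or MLflow) that pulls the worker traceback out of such a log, assuming the line layout seen here:

```python
import re
from typing import List

# Matches worker stdout lines such as:
# 2021-01-24 16:23:52,398 [INFO ] W-9000-<model>_<version>-stdout <logger> - <message>
WORKER_STDOUT = re.compile(
    r"^\S+ \S+ \[\w+\s*\] (?P<thread>W-\d+-\S+-stdout) \S+ - (?P<msg>.*)$"
)


def worker_traceback(log_lines: List[str]) -> List[str]:
    """Return the first Python traceback from a worker's stdout, without log prefixes."""
    collected: List[str] = []
    in_traceback = False
    for line in log_lines:
        match = WORKER_STDOUT.match(line.rstrip("\n"))
        if match is None:
            continue  # skip lines from other threads interleaved with the traceback
        msg = match.group("msg")
        if msg.startswith("Traceback (most recent call last):"):
            in_traceback = True
        if in_traceback:
            collected.append(msg)
            # The traceback ends at the first non-indented line after the header,
            # which is the exception type and message.
            if len(collected) > 1 and not msg.startswith(" "):
                break
    return collected
```

Applied to the lines of this log, the last element returned would be the `TypeError` line, with the stack frames leading through `news_classifier_handler.py` and `torch/serialization.py` before it.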