Closed: OmriPi closed this issue 3 years ago.
@OmriPi: The docker image doesn't have the transformers package pre-installed. Hence the error:
ModuleNotFoundError: No module named 'transformers'
You can do the following to resolve this issue:
* Install transformers in the docker container
* Or set [`install_py_dep_per_model`](https://github.com/pytorch/serve/blob/master/docs/configuration.md#allow-model-specific-custom-python-packages) and supply the model-specific python packages in a requirements.txt file using the `--requirements` flag while creating the model-archive
@harshbafna Thanks!
Could you please elaborate on how exactly I go about the second option?
I have added a requirements.txt file containing only the line transformers==3.4.0, and a config.properties file containing the lines:
install_py_dep_per_model=true
load_models=DocTag=DocTag.mar
model_store=/home/model-server/model-store
I have not used the config.properties file before, so I'm not sure I'm doing it right.
I create the model archive using the following command:
torch-model-archiver --model-name DocTag --version 1.0 --serialized-file ./Model/pytorch_model.bin --handler ./handler.py --extra-files "./Model/config.json,./Model/vocab.txt,./Model/tokenizer_config.json,./Model/special_tokens_map.json,./index_to_name.json,./input_features.py,config.properties" --requirements requirements.txt --export-path ./model_store -f
I'm getting the following error log now, even when running locally (which previously worked):
Torchserve version: 0.2.0
TS Home: D:\Programming\anaconda3\envs\myenv\Lib\site-packages
Current directory: D:\Programming\JetBrains\PycharmProjects\DocumentTagger
Temp directory: C:\Users\6E8C~1\AppData\Local\Temp
Number of GPUs: 1
Number of CPUs: 8
Max heap size: 4060 M
Python executable: d:\programming\anaconda3\envs\myenv\python.exe
Config file: config.properties
Inference address: http://127.0.0.1:8080
Management address: http://127.0.0.1:8081
Metrics address: http://127.0.0.1:8082
Model Store: D:\Programming\JetBrains\PycharmProjects\DocumentTagger\model_store
Initial Models: DocTag=DocTag.mar
Log dir: D:\Programming\JetBrains\PycharmProjects\DocumentTagger\logs
Metrics dir: D:\Programming\JetBrains\PycharmProjects\DocumentTagger\logs
Netty threads: 0
Netty client threads: 0
Default workers per model: 1
Blacklist Regex: N/A
Maximum Response Size: 6553500
Maximum Request Size: 6553500
Prefer direct buffer: false
Custom python dependency for model allowed: true
Metrics report format: prometheus
Enable metrics API: true
2020-12-08 15:54:07,183 [INFO ] main org.pytorch.serve.ModelServer - Loading initial models: DocTag.mar
2020-12-08 15:54:19,939 [INFO ] main org.pytorch.serve.archive.ModelArchive - eTag 4056ce0416e04581a5252838e36c3879
2020-12-08 15:54:19,951 [DEBUG] main org.pytorch.serve.wlm.ModelVersionedRefs - Adding new version 1.0 for model DocTag
2020-12-08 15:54:19,951 [DEBUG] main org.pytorch.serve.wlm.ModelVersionedRefs - Setting default version to 1.0 for model DocTag
2020-12-08 15:54:19,951 [INFO ] main org.pytorch.serve.wlm.ModelManager - Model DocTag loaded.
2020-12-08 15:54:46,802 [DEBUG] main org.pytorch.serve.wlm.ModelManager - updateModel: DocTag, count: 1
2020-12-08 15:54:46,815 [INFO ] main org.pytorch.serve.ModelServer - Initialize Inference server with: NioServerSocketChannel.
2020-12-08 15:54:46,960 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Listening on port: None
2020-12-08 15:54:46,962 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - [PID]1632
2020-12-08 15:54:46,962 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Torch worker started.
2020-12-08 15:54:46,962 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Python runtime: 3.7.6
2020-12-08 15:54:46,962 [DEBUG] W-9000-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - W-9000-DocTag_1.0 State change null -> WORKER_STARTED
2020-12-08 15:54:46,966 [INFO ] W-9000-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - Connecting to: /127.0.0.1:9000
2020-12-08 15:54:47,095 [INFO ] main org.pytorch.serve.ModelServer - Inference API bind to: http://127.0.0.1:8080
2020-12-08 15:54:47,096 [INFO ] main org.pytorch.serve.ModelServer - Initialize Management server with: NioServerSocketChannel.
2020-12-08 15:54:47,098 [INFO ] main org.pytorch.serve.ModelServer - Management API bind to: http://127.0.0.1:8081
2020-12-08 15:54:47,098 [INFO ] main org.pytorch.serve.ModelServer - Initialize Metrics server with: NioServerSocketChannel.
2020-12-08 15:54:47,100 [INFO ] main org.pytorch.serve.ModelServer - Metrics API bind to: http://127.0.0.1:8082
2020-12-08 15:54:47,103 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Connection accepted: ('127.0.0.1', 9000).
Model server started.
2020-12-08 15:54:47,235 [INFO ] pool-2-thread-1 TS_METRICS - CPUUtilization.Percent:0.0|#Level:Host|#hostname:DESKTOP-6SPF5B8,timestamp:1607435687
2020-12-08 15:54:47,236 [INFO ] pool-2-thread-1 TS_METRICS - DiskAvailable.Gigabytes:161.1764373779297|#Level:Host|#hostname:DESKTOP-6SPF5B8,timestamp:1607435687
2020-12-08 15:54:47,237 [INFO ] pool-2-thread-1 TS_METRICS - DiskUsage.Gigabytes:315.76105880737305|#Level:Host|#hostname:DESKTOP-6SPF5B8,timestamp:1607435687
2020-12-08 15:54:47,237 [INFO ] pool-2-thread-1 TS_METRICS - DiskUtilization.Percent:66.2|#Level:Host|#hostname:DESKTOP-6SPF5B8,timestamp:1607435687
2020-12-08 15:54:47,237 [INFO ] pool-2-thread-1 TS_METRICS - MemoryAvailable.Megabytes:9137.2734375|#Level:Host|#hostname:DESKTOP-6SPF5B8,timestamp:1607435687
2020-12-08 15:54:47,238 [INFO ] pool-2-thread-1 TS_METRICS - MemoryUsed.Megabytes:7101.64453125|#Level:Host|#hostname:DESKTOP-6SPF5B8,timestamp:1607435687
2020-12-08 15:54:47,240 [INFO ] pool-2-thread-1 TS_METRICS - MemoryUtilization.Percent:43.7|#Level:Host|#hostname:DESKTOP-6SPF5B8,timestamp:1607435687
2020-12-08 15:54:47,457 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Backend worker process died.
2020-12-08 15:54:47,458 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Traceback (most recent call last):
2020-12-08 15:54:47,458 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - File "d:\programming\anaconda3\envs\myenv\lib\site-packages\ts\model_loader.py", line 84, in load
2020-12-08 15:54:47,458 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - module = importlib.import_module(module_name)
2020-12-08 15:54:47,459 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - File "d:\programming\anaconda3\envs\myenv\lib\importlib\__init__.py", line 127, in import_module
2020-12-08 15:54:47,459 [INFO ] nioEventLoopGroup-5-1 org.pytorch.serve.wlm.WorkerThread - 9000 Worker disconnected. WORKER_STARTED
2020-12-08 15:54:47,459 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - return _bootstrap._gcd_import(name[level:], package, level)
2020-12-08 15:54:47,460 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
2020-12-08 15:54:47,460 [DEBUG] W-9000-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - System state is : WORKER_STARTED
2020-12-08 15:54:47,460 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - File "<frozen importlib._bootstrap>", line 983, in _find_and_load
2020-12-08 15:54:47,461 [DEBUG] W-9000-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker monitoring thread interrupted or backend worker process died.
java.lang.InterruptedException
at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2056)
at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2133)
at java.base/java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:432)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:129)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
2020-12-08 15:54:47,461 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
2020-12-08 15:54:47,463 [WARN ] W-9000-DocTag_1.0 org.pytorch.serve.wlm.BatchAggregator - Load model failed: DocTag, error: Worker died.
2020-12-08 15:54:47,463 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
2020-12-08 15:54:47,463 [DEBUG] W-9000-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - W-9000-DocTag_1.0 State change WORKER_STARTED -> WORKER_STOPPED
2020-12-08 15:54:47,464 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - File "<frozen importlib._bootstrap_external>", line 728, in exec_module
2020-12-08 15:54:47,464 [WARN ] W-9000-DocTag_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - terminateIOStreams() threadName=W-9000-DocTag_1.0-stderr
2020-12-08 15:54:47,464 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
2020-12-08 15:54:47,465 [WARN ] W-9000-DocTag_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - terminateIOStreams() threadName=W-9000-DocTag_1.0-stdout
2020-12-08 15:54:47,465 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - File "C:\Users\????\AppData\Local\Temp\models\4056ce0416e04581a5252838e36c3879\handler.py", line 3, in <module>
2020-12-08 15:54:47,466 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Stopped Scanner - W-9000-DocTag_1.0-stdout
2020-12-08 15:54:47,466 [INFO ] W-9000-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - Retry worker: 9000 in 1 seconds.
2020-12-08 15:54:47,466 [INFO ] W-9000-DocTag_1.0-stderr org.pytorch.serve.wlm.WorkerLifeCycle - Stopped Scanner - W-9000-DocTag_1.0-stderr
2020-12-08 15:54:48,613 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Listening on port: None
2020-12-08 15:54:48,614 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - [PID]4344
2020-12-08 15:54:48,614 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Torch worker started.
2020-12-08 15:54:48,615 [DEBUG] W-9000-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - W-9000-DocTag_1.0 State change WORKER_STOPPED -> WORKER_STARTED
2020-12-08 15:54:48,615 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Python runtime: 3.7.6
2020-12-08 15:54:48,615 [INFO ] W-9000-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - Connecting to: /127.0.0.1:9000
2020-12-08 15:54:48,618 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Connection accepted: ('127.0.0.1', 9000).
2020-12-08 15:54:48,924 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Backend worker process died.
2020-12-08 15:54:48,924 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Traceback (most recent call last):
2020-12-08 15:54:48,924 [INFO ] nioEventLoopGroup-5-2 org.pytorch.serve.wlm.WorkerThread - 9000 Worker disconnected. WORKER_STARTED
2020-12-08 15:54:48,925 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - File "d:\programming\anaconda3\envs\myenv\lib\site-packages\ts\model_loader.py", line 84, in load
2020-12-08 15:54:48,925 [DEBUG] W-9000-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - System state is : WORKER_STARTED
2020-12-08 15:54:48,926 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - module = importlib.import_module(module_name)
2020-12-08 15:54:48,926 [DEBUG] W-9000-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker monitoring thread interrupted or backend worker process died.
java.lang.InterruptedException
at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2056)
at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2133)
at java.base/java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:432)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:129)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
2020-12-08 15:54:48,926 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - File "d:\programming\anaconda3\envs\myenv\lib\importlib\__init__.py", line 127, in import_module
2020-12-08 15:54:48,927 [WARN ] W-9000-DocTag_1.0 org.pytorch.serve.wlm.BatchAggregator - Load model failed: DocTag, error: Worker died.
2020-12-08 15:54:48,927 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - return _bootstrap._gcd_import(name[level:], package, level)
2020-12-08 15:54:48,927 [DEBUG] W-9000-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - W-9000-DocTag_1.0 State change WORKER_STARTED -> WORKER_STOPPED
2020-12-08 15:54:48,927 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
2020-12-08 15:54:48,928 [WARN ] W-9000-DocTag_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - terminateIOStreams() threadName=W-9000-DocTag_1.0-stderr
2020-12-08 15:54:48,929 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - File "<frozen importlib._bootstrap>", line 983, in _find_and_load
2020-12-08 15:54:48,929 [WARN ] W-9000-DocTag_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - terminateIOStreams() threadName=W-9000-DocTag_1.0-stdout
2020-12-08 15:54:48,929 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
2020-12-08 15:54:48,930 [INFO ] W-9000-DocTag_1.0-stderr org.pytorch.serve.wlm.WorkerLifeCycle - Stopped Scanner - W-9000-DocTag_1.0-stderr
2020-12-08 15:54:48,930 [INFO ] W-9000-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - Retry worker: 9000 in 1 seconds.
2020-12-08 15:54:48,931 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Stopped Scanner - W-9000-DocTag_1.0-stdout
2020-12-08 15:54:49,639 [INFO ] nioEventLoopGroup-2-1 org.pytorch.serve.ModelServer - Inference model server stopped.
2020-12-08 15:54:49,641 [INFO ] nioEventLoopGroup-2-2 org.pytorch.serve.ModelServer - Management model server stopped.
2020-12-08 15:54:49,641 [INFO ] nioEventLoopGroup-2-1 org.pytorch.serve.ModelServer - Metrics model server stopped.
2020-12-08 15:54:50,129 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Listening on port: None
2020-12-08 15:54:50,130 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - [PID]1680
2020-12-08 15:54:50,131 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Torch worker started.
2020-12-08 15:54:50,131 [DEBUG] W-9000-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - W-9000-DocTag_1.0 State change WORKER_STOPPED -> WORKER_STARTED
2020-12-08 15:54:50,131 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Python runtime: 3.7.6
2020-12-08 15:54:50,131 [INFO ] W-9000-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - Connecting to: /127.0.0.1:9000
2020-12-08 15:54:50,134 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Connection accepted: ('127.0.0.1', 9000).
2020-12-08 15:54:50,460 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Backend worker process died.
2020-12-08 15:54:50,460 [INFO ] nioEventLoopGroup-5-3 org.pytorch.serve.wlm.WorkerThread - 9000 Worker disconnected. WORKER_STARTED
2020-12-08 15:54:50,460 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Traceback (most recent call last):
2020-12-08 15:54:50,462 [DEBUG] W-9000-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - System state is : WORKER_STARTED
2020-12-08 15:54:50,462 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - File "d:\programming\anaconda3\envs\myenv\lib\site-packages\ts\model_loader.py", line 84, in load
2020-12-08 15:54:50,462 [DEBUG] W-9000-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker monitoring thread interrupted or backend worker process died.
java.lang.InterruptedException
at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2056)
at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2133)
at java.base/java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:432)
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:129)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
2020-12-08 15:54:50,462 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - module = importlib.import_module(module_name)
2020-12-08 15:54:50,463 [WARN ] W-9000-DocTag_1.0 org.pytorch.serve.wlm.BatchAggregator - Load model failed: DocTag, error: Worker died.
2020-12-08 15:54:50,463 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - File "d:\programming\anaconda3\envs\myenv\lib\importlib\__init__.py", line 127, in import_module
2020-12-08 15:54:50,463 [DEBUG] W-9000-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - W-9000-DocTag_1.0 State change WORKER_STARTED -> WORKER_STOPPED
2020-12-08 15:54:50,464 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - return _bootstrap._gcd_import(name[level:], package, level)
2020-12-08 15:54:50,464 [WARN ] W-9000-DocTag_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - terminateIOStreams() threadName=W-9000-DocTag_1.0-stderr
2020-12-08 15:54:50,465 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
2020-12-08 15:54:50,465 [WARN ] W-9000-DocTag_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - terminateIOStreams() threadName=W-9000-DocTag_1.0-stdout
2020-12-08 15:54:50,465 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - File "<frozen importlib._bootstrap>", line 983, in _find_and_load
2020-12-08 15:54:50,466 [INFO ] W-9000-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - Retry worker: 9000 in 2 seconds.
2020-12-08 15:54:50,466 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Stopped Scanner - W-9000-DocTag_1.0-stdout
2020-12-08 15:54:50,466 [INFO ] W-9000-DocTag_1.0-stderr org.pytorch.serve.wlm.WorkerLifeCycle - Stopped Scanner - W-9000-DocTag_1.0-stderr
2020-12-08 15:54:51,750 [INFO ] main org.pytorch.serve.ModelServer - Torchserve stopped.
What am I missing here? Why did even the local model server stop working? Thanks!
@OmriPi:
Your configuration looks fine. Also, the flag for installing the model-specific pip packages is properly initialized:
Custom python dependency for model allowed: true
From the above logs, it seems that TorchServe was not able to load the handler module and ran into some import-related issues. Could you please share the handler file?
I assume you have installed TorchServe from the source?
@harshbafna Thanks for your reply! I have installed TorchServe using pip, if that is what you mean by "from source". Here's the strange thing though: I have changed nothing in the handler file; that same handler worked perfectly fine before I added the config and requirements files, so it seems strange that it would be the cause. I'll add it here nonetheless; perhaps it could shed light on the problem:
```python
import os
import logging

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

from ts.torch_handler.base_handler import BaseHandler
from ts.utils.util import load_label_mapping, map_class_to_label
from input_features import InputFeatures

MAX_SEQ_LENGTH = 256
logger = logging.getLogger(__name__)


class DocumentTaggerHandler(BaseHandler):
    def __init__(self):
        super(DocumentTaggerHandler, self).__init__()
        self.tokenizer = None
        self.initialized = False

    def initialize(self, context):
        self.manifest = context.manifest
        properties = context.system_properties
        model_dir = properties.get("model_dir")
        self.map_location = 'cuda' if torch.cuda.is_available() else 'cpu'
        self.device = torch.device(self.map_location + ":" + str(properties.get("gpu_id"))
                                   if torch.cuda.is_available() else self.map_location)
        self.model = AutoModelForSequenceClassification.from_pretrained(model_dir)
        self.tokenizer = AutoTokenizer.from_pretrained(model_dir)
        self.model.to(self.device)
        self.model.eval()
        logger.debug('Transformer model from path {0} loaded successfully'.format(model_dir))
        mapping_file_path = os.path.join(model_dir, "index_to_name.json")
        self.mapping = load_label_mapping(mapping_file_path)
        self.initialized = True

    def _preprocess_one_document(self, req):
        text = req.get('data')
        if text is None:
            text = req.get('body')
        if type(text) != str:
            text = text.decode('utf-8')
        logger.info("Received text: '%s'", text)
        tokens = self.tokenizer(text, max_length=MAX_SEQ_LENGTH, padding='max_length', truncation=True,
                                add_special_tokens=True, return_tensors='pt')
        input_ids = tokens['input_ids']
        segment_ids = tokens['token_type_ids']
        input_mask = tokens['attention_mask']
        assert len(input_ids[0]) == MAX_SEQ_LENGTH
        assert len(input_mask[0]) == MAX_SEQ_LENGTH
        assert len(segment_ids[0]) == MAX_SEQ_LENGTH
        return InputFeatures(input_ids=input_ids, input_mask=input_mask, segment_ids=segment_ids)

    def preprocess(self, data):
        documents = [self._preprocess_one_document(req) for req in data]
        input_ids = torch.cat([f.input_ids for f in documents]).to(self.device)
        input_mask = torch.cat([f.input_mask for f in documents]).to(self.device)
        segment_ids = torch.cat([f.segment_ids for f in documents]).to(self.device)
        data = {
            'input_ids': input_ids,
            'input_mask': input_mask,
            'segment_ids': segment_ids
        }
        return data

    def inference(self, data, *args, **kwargs):
        logits = self.model(data['input_ids'], data['input_mask'], data['segment_ids'])[0]
        predicted_labels = []
        predicted_labels.extend(torch.sigmoid(logits).round().long().cpu().detach().numpy())
        return predicted_labels

    def postprocess(self, data):
        res = []
        labels = map_class_to_label(data, mapping=self.mapping)
        for i in range(len(labels)):
            tags = [label[0] for label in labels[i].items() if label[1] > 0]
            res.append({'label': tags, 'index': i})
        logger.info("Prediction result: '%s'", res)
        return res
```
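For reference, once a handler like this loads successfully, a quick smoke test against TorchServe's standard inference endpoint looks like the following (`sample.txt` is a hypothetical input document):

```shell
# POST a document to the DocTag model through the default inference API on port 8080
curl http://127.0.0.1:8080/predictions/DocTag -T sample.txt
```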
I do not know whether it is related, but I thought that perhaps adding torch to the requirements file was needed too, so I added it there. However, when I do that the model fails to load and I get a different error message saying that "custom dependencies could not be installed" or something similar (I don't have the exact text here), so I figured that's probably not what was missing.
Thanks!
@OmriPi: I don't see any problem with the handler. The only reason it could fail is if it couldn't find the transformers package.
Could you share the complete mar file, so that I can try reproducing and debug this error at my end?
@harshbafna Here is a link to the mar file: http://www.mediafire.com/file/71eys2g8kxxk938/DocTag.mar/file
I do not think it fails because it couldn't find the transformers package: I'm not even running this on docker, and the transformers package is installed locally. Previously, before the addition of the requirements and config files, it DID run locally without any problem, using (I assume) the installed transformers package.
If you could take a look at it and help solve this strange behavior, I would be very grateful. Thanks!
@OmriPi: I tested your mar file on my local set-up with the latest master and I am able to load the model with `install_py_dep_per_model` set to `true` as well as `false`.
Also, are you running this on Windows? The v0.2.0 release is only certified on Linux, Mac, and Windows WSL.
@harshbafna Interesting, how can it work with `false` if transformers is not included? (Unless you already have transformers installed locally)
Yes, I'm running it on Windows. Is it possible that without the requirements file it worked on Windows, and that adding it stops the model from working on Windows? If it's a feature of v0.2.0 that was not tested on Windows, that might be the cause. Trying it on WSL seems to be successful.
This leads me to a few questions:
1) Is the `config.properties` file required in `--extra-files` when creating the mar file?
2) Following the previous question, it seems including it there does nothing. When deploying the model on a docker container, the model ignores it and the `config.properties` file which appears in the container is the default one. If so, how do I make my docker container use the correct `config.properties` file? I thought including it in `--extra-files` should do the trick, but it seems this doesn't work...
I use the following command to run the docker container, with the mar file being the one I shared:
docker run --rm -it -p 8080:8080 -p 8081:8081 -v %cd%/model_store:/home/model-server/model-store pytorch/torchserve:latest torchserve --start --model-store model-store --models DocTag=DocTag.mar
Do I need to share the `config.properties` file with the container the same way that I share the model store folder?
Sorry for all the questions, I'm a docker (and torchserve) beginner.
Thanks!
> @harshbafna Interesting, how can it work with `false` if transformers is not included? (Unless you already have transformers installed locally)

I should have mentioned that I manually installed transformers before setting the parameter to false :-)

> Yes I'm running it on Windows. Is it possible that without the requirements file it worked on Windows, and adding it stops the model from working on Windows? If it's a feature of v0.2.0 which was not tested on Windows it might possibly be the cause. Trying it on WSL seems to be successful.

You can try it on the latest master, which is now supported on Windows 10 Pro as well.
> Is the `config.properties` file required in `--extra-files` when creating the mar file?

`config.properties` is used for configuring TorchServe and has no impact on model-archive creation. A mar file is a simple archive file (zip) that packages all model-specific files.
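Since a .mar is just a zip archive, you can list what was actually packaged into it. A sketch using Python's built-in zipfile CLI (so it works on Windows too), with the archive name from earlier in the thread:

```shell
# list the files packaged into the model archive, e.g. to verify requirements.txt made it in
python -m zipfile -l model_store/DocTag.mar
```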
> 2. Following the previous question, it seems including it there does nothing. When deploying the model on a docker container, the model ignores it and the `config.properties` file which appears in the container is the default one. If so, how do I make my docker container use the correct `config.properties` file? I thought including it in `--extra-files` should do the trick, but it seems this doesn't work...

You can use the volume flag while starting the docker container to map/overwrite the inbuilt config.properties file with your local config.properties file:

-v <path_to_local_config_file>:/home/model-server/config.properties
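For example, extending the docker run command from earlier in the thread to also map the config file (a sketch; it assumes config.properties sits in the current directory on a Windows host):

```shell
docker run --rm -it -p 8080:8080 -p 8081:8081 ^
  -v %cd%/model_store:/home/model-server/model-store ^
  -v %cd%/config.properties:/home/model-server/config.properties ^
  pytorch/torchserve:latest torchserve --start --model-store model-store --models DocTag=DocTag.mar
```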
@harshbafna
Thanks a lot! When you say the latest master supports Windows 10, can I use pip to upgrade my torchserve to the latest master? Or is it not out through pip yet? If not, how do I replace the pip version with the master version?
After trying on WSL and finding out it worked, I returned to my original task of making the model run on a docker container, this time sharing the `config.properties` file with the docker. However, it seems I am now encountering a different error:
2020-12-13 13:59:21,375 [INFO ] main org.pytorch.serve.ModelServer -
Torchserve version: 0.2.0
TS Home: /home/venv/lib/python3.6/site-packages
Current directory: /home/model-server
Temp directory: /home/model-server/tmp
Number of GPUs: 0
Number of CPUs: 8
Max heap size: 3168 M
Python executable: /home/venv/bin/python3
Config file: config.properties
Inference address: http://127.0.0.1:8080
Management address: http://127.0.0.1:8081
Metrics address: http://127.0.0.1:8082
Model Store: /home/model-server/model-store
Initial Models: DocTag=DocTag.mar
Log dir: /home/model-server/logs
Metrics dir: /home/model-server/logs
Netty threads: 0
Netty client threads: 0
Default workers per model: 8
Blacklist Regex: N/A
Maximum Response Size: 6553500
Maximum Request Size: 6553500
Prefer direct buffer: false
Custom python dependency for model allowed: true
Metrics report format: prometheus
Enable metrics API: true
2020-12-13 13:59:21,383 [INFO ] main org.pytorch.serve.ModelServer - Loading initial models: DocTag.mar
2020-12-13 14:01:18,588 [INFO ] main org.pytorch.serve.archive.ModelArchive - eTag 9da3267a341646a48c12b7224a96f3bc
2020-12-13 14:01:18,598 [DEBUG] main org.pytorch.serve.wlm.ModelVersionedRefs - Adding new version 1.0 for model DocTag
2020-12-13 14:01:18,598 [DEBUG] main org.pytorch.serve.wlm.ModelVersionedRefs - Setting default version to 1.0 for model DocTag
2020-12-13 14:01:18,598 [INFO ] main org.pytorch.serve.wlm.ModelManager - Model DocTag loaded.
2020-12-13 14:01:19,397 [WARN ] main org.pytorch.serve.ModelServer - Failed to load model: DocTag.mar
org.pytorch.serve.archive.ModelException: Custom pip package installation failed for DocTag
at org.pytorch.serve.wlm.ModelManager.setupModelDependencies(ModelManager.java:190)
at org.pytorch.serve.wlm.ModelManager.registerModel(ModelManager.java:125)
at org.pytorch.serve.ModelServer.initModelStore(ModelServer.java:213)
at org.pytorch.serve.ModelServer.start(ModelServer.java:308)
at org.pytorch.serve.ModelServer.startAndWait(ModelServer.java:104)
at org.pytorch.serve.ModelServer.main(ModelServer.java:85)
2020-12-13 14:01:19,405 [INFO ] main org.pytorch.serve.ModelServer - Initialize Inference server with: EpollServerSocketChannel.
2020-12-13 14:01:19,479 [INFO ] main org.pytorch.serve.ModelServer - Inference API bind to: http://127.0.0.1:8080
2020-12-13 14:01:19,480 [INFO ] main org.pytorch.serve.ModelServer - Initialize Management server with: EpollServerSocketChannel.
2020-12-13 14:01:19,482 [INFO ] main org.pytorch.serve.ModelServer - Management API bind to: http://127.0.0.1:8081
2020-12-13 14:01:19,482 [INFO ] main org.pytorch.serve.ModelServer - Initialize Metrics server with: EpollServerSocketChannel.
2020-12-13 14:01:19,483 [INFO ] main org.pytorch.serve.ModelServer - Metrics API bind to: http://127.0.0.1:8082
Model server started.
2020-12-13 14:01:19,581 [INFO ] pool-2-thread-1 TS_METRICS - CPUUtilization.Percent:0.0|#Level:Host|#hostname:ce0cdb450310,timestamp:1607868079
2020-12-13 14:01:19,582 [INFO ] pool-2-thread-1 TS_METRICS - DiskAvailable.Gigabytes:233.84978485107422|#Level:Host|#hostname:ce0cdb450310,timestamp:1607868079
2020-12-13 14:01:19,583 [INFO ] pool-2-thread-1 TS_METRICS - DiskUsage.Gigabytes:4.31707763671875|#Level:Host|#hostname:ce0cdb450310,timestamp:1607868079
2020-12-13 14:01:19,583 [INFO ] pool-2-thread-1 TS_METRICS - DiskUtilization.Percent:1.8|#Level:Host|#hostname:ce0cdb450310,timestamp:1607868079
2020-12-13 14:01:19,583 [INFO ] pool-2-thread-1 TS_METRICS - MemoryAvailable.Megabytes:11586.375|#Level:Host|#hostname:ce0cdb450310,timestamp:1607868079
2020-12-13 14:01:19,584 [INFO ] pool-2-thread-1 TS_METRICS - MemoryUsed.Megabytes:565.65234375|#Level:Host|#hostname:ce0cdb450310,timestamp:1607868079
2020-12-13 14:01:19,584 [INFO ] pool-2-thread-1 TS_METRICS - MemoryUtilization.Percent:8.5|#Level:Host|#hostname:ce0cdb450310,timestamp:1607868079
2020-12-13 14:02:19,602 [INFO ] pool-2-thread-1 TS_METRICS - CPUUtilization.Percent:0.0|#Level:Host|#hostname:ce0cdb450310,timestamp:1607868139
2020-12-13 14:02:19,602 [INFO ] pool-2-thread-1 TS_METRICS - DiskAvailable.Gigabytes:233.84978103637695|#Level:Host|#hostname:ce0cdb450310,timestamp:1607868139
2020-12-13 14:02:19,603 [INFO ] pool-2-thread-1 TS_METRICS - DiskUsage.Gigabytes:4.317081451416016|#Level:Host|#hostname:ce0cdb450310,timestamp:1607868139
2020-12-13 14:02:19,603 [INFO ] pool-2-thread-1 TS_METRICS - DiskUtilization.Percent:1.8|#Level:Host|#hostname:ce0cdb450310,timestamp:1607868139
2020-12-13 14:02:19,603 [INFO ] pool-2-thread-1 TS_METRICS - MemoryAvailable.Megabytes:11584.2265625|#Level:Host|#hostname:ce0cdb450310,timestamp:1607868139
2020-12-13 14:02:19,604 [INFO ] pool-2-thread-1 TS_METRICS - MemoryUsed.Megabytes:567.87109375|#Level:Host|#hostname:ce0cdb450310,timestamp:1607868139
2020-12-13 14:02:19,604 [INFO ] pool-2-thread-1 TS_METRICS - MemoryUtilization.Percent:8.6|#Level:Host|#hostname:ce0cdb450310,timestamp:1607868139
2020-12-13 14:03:19,611 [INFO ] pool-2-thread-2 TS_METRICS - CPUUtilization.Percent:0.0|#Level:Host|#hostname:ce0cdb450310,timestamp:1607868199
2020-12-13 14:03:19,612 [INFO ] pool-2-thread-2 TS_METRICS - DiskAvailable.Gigabytes:233.8497772216797|#Level:Host|#hostname:ce0cdb450310,timestamp:1607868199
2020-12-13 14:03:19,612 [INFO ] pool-2-thread-2 TS_METRICS - DiskUsage.Gigabytes:4.317085266113281|#Level:Host|#hostname:ce0cdb450310,timestamp:1607868199
2020-12-13 14:03:19,613 [INFO ] pool-2-thread-2 TS_METRICS - DiskUtilization.Percent:1.8|#Level:Host|#hostname:ce0cdb450310,timestamp:1607868199
2020-12-13 14:03:19,613 [INFO ] pool-2-thread-2 TS_METRICS - MemoryAvailable.Megabytes:11583.58203125|#Level:Host|#hostname:ce0cdb450310,timestamp:1607868199
2020-12-13 14:03:19,613 [INFO ] pool-2-thread-2 TS_METRICS - MemoryUsed.Megabytes:571.19921875|#Level:Host|#hostname:ce0cdb450310,timestamp:1607868199
2020-12-13 14:03:19,613 [INFO ] pool-2-thread-2 TS_METRICS - MemoryUtilization.Percent:8.6|#Level:Host|#hostname:ce0cdb450310,timestamp:1607868199
It appears there is a problem installing the transformers package in the docker container. What might be the cause of that? I'm sorry it seems like a never-ending thread XD I just need to get it to work with docker, which was my original purpose for the post.
Thanks!
> Thanks a lot! When you say the latest master supports Windows 10, can I use pip to upgrade my torchserve to the latest master? Or is it not out through pip yet? If not, how do I replace the pip version with the master version?

It is not available on PyPI yet; we are in the process of making an official 0.3.0 release, which will have support for Windows 10 Pro. Meanwhile, you can install from source.
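A rough sketch of the from-source install (the script names follow the repository's install docs; check them for the authoritative, current steps):

```shell
git clone https://github.com/pytorch/serve.git
cd serve
# install build/runtime dependencies first, then build and install torchserve itself
python .\ts_scripts\install_dependencies.py
python .\ts_scripts\install_from_src.py
```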
> After trying on WSL and finding out it worked, I returned to my original task of making the model run on a docker container, this time sharing the `config.properties` file with the docker. However, it seems I am now encountering a different error:

This was a bug in the Docker image in the last release and has been fixed in the latest master. You can create a local docker image from the source. Refer to the DEVELOPER ENVIRONMENT IMAGES section in the TorchServe docker documentation.
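That section amounts to building the image yourself from a checkout, roughly as below (the script name and `-bt` flag are per the docker README at the time; verify against the current docs):

```shell
cd serve/docker
# build a developer image from the current checkout
./build_image.sh -bt dev
```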
> I'm sorry it seems like a never-ending thread XD I just need to get it to work with docker, which was my original purpose for the post.
No problem. We are here to help :-)
@harshbafna I am trying to install from source following the Windows instructions in the link you provided. However, when I execute the `python .\ts_scripts\install_from_src.py` command, it fails while building the wheel for torchserve with the following error output:
(torchserve) D:\Programming\JetBrains\PycharmProjects\DocumentTagger\serve>python .\ts_scripts\install_from_src.py
** On entry to DGEBAL parameter number 3 had an illegal value
** On entry to DGEHRD parameter number 2 had an illegal value
** On entry to DORGHR DORGQR parameter number 2 had an illegal value
** On entry to DHSEQR parameter number 4 had an illegal value
------------------------------------------------------------------------------------------
Environment headers
------------------------------------------------------------------------------------------
Torchserve branch: master
torchserve==0.3.0
torch-model-archiver==0.2.1
Python version: 3.8 (64-bit runtime)
Python executable: D:\Programming\anaconda3\envs\torchserve\python.exe
Versions of relevant python libraries:
numpy==1.19.4
torch==1.7.1+cu110
torch-model-archiver==0.2.1b20201216
torchaudio==0.7.2
torchtext==0.8.1
torchvision==0.8.2+cu110
torch==1.7.1+cu110
torchtext==0.8.1
torchvision==0.8.2+cu110
torchaudio==0.7.2
Java Version:
java 11.0.9 2020-10-20 LTS
Java(TM) SE Runtime Environment 18.9 (build 11.0.9+7-LTS)
Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.9+7-LTS, mixed mode)
OS: Microsoft Windows 10 Pro
GCC version: N/A
Clang version: N/A
CMake version: N/A
## Uninstall existing torchserve and model archiver
usage: conda-script.py [-h] [-V] command ...
conda is a tool for managing and deploying applications, environments and packages.
Options:
positional arguments:
command
clean Remove unused packages and caches.
compare Compare packages between conda environments.
config Modify configuration values in .condarc. This is modeled after the git config command. Writes to the
user .condarc file (C:\Users\עמרי\.condarc) by default.
create Create a new conda environment from a list of specified packages.
help Displays a list of available conda commands and their help strings.
info Display information about current conda install.
init Initialize conda for shell interaction. [Experimental]
install Installs a list of packages into a specified conda environment.
list List linked packages in a conda environment.
package Low-level conda package utility. (EXPERIMENTAL)
remove Remove a list of packages from a specified conda environment.
uninstall Alias for conda remove.
run Run an executable in a conda environment. [Experimental]
search Search for packages and display associated information. The input is a MatchSpec, a query language
for conda packages. See examples below.
update Updates conda packages to the latest compatible version.
upgrade Alias for conda update.
optional arguments:
-h, --help Show this help message and exit.
-V, --version Show the conda version number and exit.
conda commands available from other packages:
build
convert
debug
develop
env
index
inspect
metapackage
render
skeleton
## In directory: D:\Programming\JetBrains\PycharmProjects\DocumentTagger\serve | Executing command: conda uninstall -y torchserve torch-model-archiver
Collecting package metadata (repodata.json): done
Solving environment: failed
PackagesNotFoundError: The following packages are missing from the target environment:
- torch-model-archiver
- torchserve
## Install torch-model-archiver from source
## In directory: D:\Programming\JetBrains\PycharmProjects\DocumentTagger\serve | Executing command: pip install model-archiver/.
Processing d:\programming\jetbrains\pycharmprojects\documenttagger\serve\model-archiver
Requirement already satisfied: future in d:\programming\anaconda3\envs\torchserve\lib\site-packages (from torch-model-archiver==0.2.1b20201216) (0.18.2)
Requirement already satisfied: enum-compat in d:\programming\anaconda3\envs\torchserve\lib\site-packages (from torch-model-archiver==0.2.1b20201216) (0.0.3)
Building wheels for collected packages: torch-model-archiver
Building wheel for torch-model-archiver (setup.py) ... done
Created wheel for torch-model-archiver: filename=torch_model_archiver-0.2.1b20201216-py3-none-any.whl size=14361 sha256=3b030c5a417406638b370c349434669234b5a2cdf6a6602fa624c991fe5e28bc
Stored in directory: c:\users\עמרי\appdata\local\pip\cache\wheels\1f\2c\89\7ec0bb3e00f1f5762eee7ec4f671de2ae94a1ff2e55b5e5f9a
Successfully built torch-model-archiver
Installing collected packages: torch-model-archiver
Attempting uninstall: torch-model-archiver
Found existing installation: torch-model-archiver 0.2.1b20201216
Uninstalling torch-model-archiver-0.2.1b20201216:
Successfully uninstalled torch-model-archiver-0.2.1b20201216
Successfully installed torch-model-archiver-0.2.1b20201216
## Install torchserve from source
## In directory: D:\Programming\JetBrains\PycharmProjects\DocumentTagger\serve | Executing command: pip install .
Processing d:\programming\jetbrains\pycharmprojects\documenttagger\serve
Requirement already satisfied: Pillow in d:\programming\anaconda3\envs\torchserve\lib\site-packages (from torchserve==0.3.0b20201216) (8.0.1)
Requirement already satisfied: psutil in d:\programming\anaconda3\envs\torchserve\lib\site-packages (from torchserve==0.3.0b20201216) (5.7.3)
Requirement already satisfied: future in d:\programming\anaconda3\envs\torchserve\lib\site-packages (from torchserve==0.3.0b20201216) (0.18.2)
Requirement already satisfied: packaging in d:\programming\anaconda3\envs\torchserve\lib\site-packages (from torchserve==0.3.0b20201216) (20.8)
Requirement already satisfied: pyparsing>=2.0.2 in d:\programming\anaconda3\envs\torchserve\lib\site-packages (from packaging->torchserve==0.3.0b20201216) (2.4.7)
Building wheels for collected packages: torchserve
Building wheel for torchserve (setup.py) ... error
ERROR: Command errored out with exit status 1:
command: 'D:\Programming\anaconda3\envs\torchserve\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\עמרי\\AppData\\Local\\Temp\\pip-req-build-5km7ajwi\\setup.py'"'"'; __file__='"'"'C:\\Users\\עמרי\\AppData\\Local\\Temp\\pip-req-build-5km7ajwi\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d 'C:\Users\עמרי\AppData\Local\Temp\pip-wheel-kapsk_yr'
cwd: C:\Users\עמרי\AppData\Local\Temp\pip-req-build-5km7ajwi\
Complete output (71 lines):
running bdist_wheel
running build
running build_py
running build_frontend
> Task :modelarchive:clean UP-TO-DATE
> Task :server:killServer
No server running!
> Task :server:clean UP-TO-DATE
> Task :modelarchive:compileJava
> Task :modelarchive:processResources NO-SOURCE
> Task :modelarchive:classes
> Task :modelarchive:jar
> Task :modelarchive:assemble
> Task :server:extractIncludeProto
> Task :server:extractProto
> Task :server:generateProto FAILED
FAILURE: Build failed with an exception.
* What went wrong:
Execution failed for task ':server:generateProto'.
> protoc: stdout: . stderr: C:\Users\עמרי\AppData\Local\Temp\pip-req-build-5km7ajwi\frontend\server\src\main\resources\proto: warning: directory does not exist.
C:\Users\עמרי\AppData\Local\Temp\pip-req-build-5km7ajwi\frontend\server\build\extracted-protos\main: warning: directory does not exist.
C:\Users\עמרי\AppData\Local\Temp\pip-req-build-5km7ajwi\frontend\server\build\extracted-include-protos\main: warning: directory does not exist.
Could not make proto path relative: C:\Users\עמרי\AppData\Local\Temp\pip-req-build-5km7ajwi\frontend\server\src\main\resources\proto\inference.proto: No such file or directory
* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.
* Get more help at https://help.gradle.org
BUILD FAILED in 1s
8 actionable tasks: 6 executed, 2 up-to-date
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\עמרי\AppData\Local\Temp\pip-req-build-5km7ajwi\setup.py", line 142, in <module>
setup(
File "D:\Programming\anaconda3\envs\torchserve\lib\site-packages\setuptools\__init__.py", line 153, in setup
return distutils.core.setup(**attrs)
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\core.py", line 148, in setup
dist.run_commands()
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\dist.py", line 966, in run_commands
self.run_command(cmd)
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "D:\Programming\anaconda3\envs\torchserve\lib\site-packages\wheel\bdist_wheel.py", line 299, in run
self.run_command('build')
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\command\build.py", line 135, in run
self.run_command(cmd_name)
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "C:\Users\עמרי\AppData\Local\Temp\pip-req-build-5km7ajwi\setup.py", line 103, in run
self.run_command('build_frontend')
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "C:\Users\עמרי\AppData\Local\Temp\pip-req-build-5km7ajwi\setup.py", line 90, in run
subprocess.check_call(build_frontend_command[platform.system()], shell=True)
File "D:\Programming\anaconda3\envs\torchserve\lib\subprocess.py", line 364, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '.\frontend\gradlew.bat -p frontend clean assemble' returned non-zero exit status 1.
----------------------------------------
ERROR: Failed building wheel for torchserve
Running setup.py clean for torchserve
Failed to build torchserve
Installing collected packages: torchserve
Running setup.py install for torchserve ... error
ERROR: Command errored out with exit status 1:
command: 'D:\Programming\anaconda3\envs\torchserve\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\עמרי\\AppData\\Local\\Temp\\pip-req-build-5km7ajwi\\setup.py'"'"'; __file__='"'"'C:\\Users\\עמרי\\AppData\\Local\\Temp\\pip-req-build-5km7ajwi\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\עמרי\AppData\Local\Temp\pip-record-w6oymrjw\install-record.txt' --single-version-externally-managed --compile --install-headers 'D:\Programming\anaconda3\envs\torchserve\Include\torchserve'
cwd: C:\Users\עמרי\AppData\Local\Temp\pip-req-build-5km7ajwi\
Complete output (73 lines):
running install
running build
running build_py
running build_frontend
> Task :modelarchive:clean
> Task :server:killServer
No server running!
> Task :server:clean
> Task :modelarchive:compileJava
> Task :modelarchive:processResources NO-SOURCE
> Task :modelarchive:classes
> Task :modelarchive:jar
> Task :modelarchive:assemble
> Task :server:extractIncludeProto
> Task :server:extractProto
> Task :server:generateProto FAILED
FAILURE: Build failed with an exception.
* What went wrong:
Execution failed for task ':server:generateProto'.
> protoc: stdout: . stderr: C:\Users\עמרי\AppData\Local\Temp\pip-req-build-5km7ajwi\frontend\server\src\main\resources\proto: warning: directory does not exist.
C:\Users\עמרי\AppData\Local\Temp\pip-req-build-5km7ajwi\frontend\server\build\extracted-protos\main: warning: directory does not exist.
C:\Users\עמרי\AppData\Local\Temp\pip-req-build-5km7ajwi\frontend\server\build\extracted-include-protos\main: warning: directory does not exist.
Could not make proto path relative: C:\Users\עמרי\AppData\Local\Temp\pip-req-build-5km7ajwi\frontend\server\src\main\resources\proto\inference.proto: No such file or directory
* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.
* Get more help at https://help.gradle.org
BUILD FAILED in 1s
8 actionable tasks: 8 executed
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\עמרי\AppData\Local\Temp\pip-req-build-5km7ajwi\setup.py", line 142, in <module>
setup(
File "D:\Programming\anaconda3\envs\torchserve\lib\site-packages\setuptools\__init__.py", line 153, in setup
return distutils.core.setup(**attrs)
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\core.py", line 148, in setup
dist.run_commands()
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\dist.py", line 966, in run_commands
self.run_command(cmd)
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "D:\Programming\anaconda3\envs\torchserve\lib\site-packages\setuptools\command\install.py", line 61, in run
return orig.install.run(self)
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\command\install.py", line 545, in run
self.run_command('build')
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\command\build.py", line 135, in run
self.run_command(cmd_name)
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "C:\Users\עמרי\AppData\Local\Temp\pip-req-build-5km7ajwi\setup.py", line 103, in run
self.run_command('build_frontend')
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "C:\Users\עמרי\AppData\Local\Temp\pip-req-build-5km7ajwi\setup.py", line 90, in run
subprocess.check_call(build_frontend_command[platform.system()], shell=True)
File "D:\Programming\anaconda3\envs\torchserve\lib\subprocess.py", line 364, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '.\frontend\gradlew.bat -p frontend clean assemble' returned non-zero exit status 1.
----------------------------------------
ERROR: Command errored out with exit status 1: 'D:\Programming\anaconda3\envs\torchserve\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\עמרי\\AppData\\Local\\Temp\\pip-req-build-5km7ajwi\\setup.py'"'"'; __file__='"'"'C:\\Users\\עמרי\\AppData\\Local\\Temp\\pip-req-build-5km7ajwi\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\עמרי\AppData\Local\Temp\pip-record-w6oymrjw\install-record.txt' --single-version-externally-managed --compile --install-headers 'D:\Programming\anaconda3\envs\torchserve\Include\torchserve' Check the logs for full command output.
## Cleaning build residuals (__pycache__)
## Removing - D:\Programming\JetBrains\PycharmProjects\DocumentTagger\serve\ts_scripts\__pycache__
What am I missing here? I am in a brand-new environment and have run the dependency installation script beforehand. Thanks!
@OmriPi: Could you try downgrading the NumPy version to 1.19.3 and running the `install_from_src` script again?
This is a Windows-specific issue and has been fixed in PR #885.
Root cause can be found here: https://tinyurl.com/y3dm3h86
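For reference, the downgrade itself is a one-liner in the active environment:

```shell
# pin NumPy to the last version unaffected by the Windows fmod regression
pip install numpy==1.19.3
```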
@harshbafna I did as you recommended; however, I am still getting an error. Here is the error output:
(torchserve) D:\Programming\JetBrains\PycharmProjects\DocumentTagger\serve>python .\ts_scripts\install_from_src.py
------------------------------------------------------------------------------------------
Environment headers
------------------------------------------------------------------------------------------
Torchserve branch: master
torchserve==0.3.0
torch-model-archiver==0.2.1
Python version: 3.8 (64-bit runtime)
Python executable: D:\Programming\anaconda3\envs\torchserve\python.exe
Versions of relevant python libraries:
numpy==1.19.3
torch==1.7.1+cu110
torch-model-archiver==0.2.1b20201216
torchaudio==0.7.2
torchtext==0.8.1
torchvision==0.8.2+cu110
torch==1.7.1+cu110
torchtext==0.8.1
torchvision==0.8.2+cu110
torchaudio==0.7.2
Java Version:
java 11.0.9 2020-10-20 LTS
Java(TM) SE Runtime Environment 18.9 (build 11.0.9+7-LTS)
Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.9+7-LTS, mixed mode)
OS: Microsoft Windows 10 Pro
GCC version: N/A
Clang version: N/A
CMake version: N/A
Is CUDA available: Yes
CUDA runtime version: 10.1.243
GPU models and configuration:
GPU 0: GeForce GTX 980
Nvidia driver version: 460.79
cuDNN version: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\bin\cudnn64_7.dll
## Uninstall existing torchserve and model archiver
usage: conda-script.py [-h] [-V] command ...
conda is a tool for managing and deploying applications, environments and packages.
Options:
positional arguments:
command
clean Remove unused packages and caches.
compare Compare packages between conda environments.
config Modify configuration values in .condarc. This is modeled after the git config command. Writes to the
user .condarc file (C:\Users\עמרי\.condarc) by default.
create Create a new conda environment from a list of specified packages.
help Displays a list of available conda commands and their help strings.
info Display information about current conda install.
init Initialize conda for shell interaction. [Experimental]
install Installs a list of packages into a specified conda environment.
list List linked packages in a conda environment.
package Low-level conda package utility. (EXPERIMENTAL)
remove Remove a list of packages from a specified conda environment.
uninstall Alias for conda remove.
run Run an executable in a conda environment. [Experimental]
search Search for packages and display associated information. The input is a MatchSpec, a query language
for conda packages. See examples below.
update Updates conda packages to the latest compatible version.
upgrade Alias for conda update.
optional arguments:
-h, --help Show this help message and exit.
-V, --version Show the conda version number and exit.
conda commands available from other packages:
build
convert
debug
develop
env
index
inspect
metapackage
render
skeleton
## In directory: D:\Programming\JetBrains\PycharmProjects\DocumentTagger\serve | Executing command: conda uninstall -y torchserve torch-model-archiver
Collecting package metadata (repodata.json): done
Solving environment: failed
PackagesNotFoundError: The following packages are missing from the target environment:
- torch-model-archiver
- torchserve
## Install torch-model-archiver from source
## In directory: D:\Programming\JetBrains\PycharmProjects\DocumentTagger\serve | Executing command: pip install model-archiver/.
Processing d:\programming\jetbrains\pycharmprojects\documenttagger\serve\model-archiver
Requirement already satisfied: future in d:\programming\anaconda3\envs\torchserve\lib\site-packages (from torch-model-archiver==0.2.1b20201217) (0.18.2)
Requirement already satisfied: enum-compat in d:\programming\anaconda3\envs\torchserve\lib\site-packages (from torch-model-archiver==0.2.1b20201217) (0.0.3)
Building wheels for collected packages: torch-model-archiver
Building wheel for torch-model-archiver (setup.py) ... done
Created wheel for torch-model-archiver: filename=torch_model_archiver-0.2.1b20201217-py3-none-any.whl size=14362 sha256=fcfee5202261d88afe7644f2d2eb606c8bc112871f2eeec86c8b8ffbbfa55032
Stored in directory: c:\users\עמרי\appdata\local\pip\cache\wheels\1f\2c\89\7ec0bb3e00f1f5762eee7ec4f671de2ae94a1ff2e55b5e5f9a
Successfully built torch-model-archiver
Installing collected packages: torch-model-archiver
Attempting uninstall: torch-model-archiver
Found existing installation: torch-model-archiver 0.2.1b20201216
Uninstalling torch-model-archiver-0.2.1b20201216:
Successfully uninstalled torch-model-archiver-0.2.1b20201216
Successfully installed torch-model-archiver-0.2.1b20201217
## Install torchserve from source
## In directory: D:\Programming\JetBrains\PycharmProjects\DocumentTagger\serve | Executing command: pip install .
Processing d:\programming\jetbrains\pycharmprojects\documenttagger\serve
Requirement already satisfied: Pillow in d:\programming\anaconda3\envs\torchserve\lib\site-packages (from torchserve==0.3.0b20201217) (8.0.1)
Requirement already satisfied: psutil in d:\programming\anaconda3\envs\torchserve\lib\site-packages (from torchserve==0.3.0b20201217) (5.7.3)
Requirement already satisfied: future in d:\programming\anaconda3\envs\torchserve\lib\site-packages (from torchserve==0.3.0b20201217) (0.18.2)
Requirement already satisfied: packaging in d:\programming\anaconda3\envs\torchserve\lib\site-packages (from torchserve==0.3.0b20201217) (20.8)
Requirement already satisfied: pyparsing>=2.0.2 in d:\programming\anaconda3\envs\torchserve\lib\site-packages (from packaging->torchserve==0.3.0b20201217) (2.4.7)
Building wheels for collected packages: torchserve
Building wheel for torchserve (setup.py) ... error
ERROR: Command errored out with exit status 1:
command: 'D:\Programming\anaconda3\envs\torchserve\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\עמרי\\AppData\\Local\\Temp\\pip-req-build-9c7t39q3\\setup.py'"'"'; __file__='"'"'C:\\Users\\עמרי\\AppData\\Local\\Temp\\pip-req-build-9c7t39q3\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d 'C:\Users\עמרי\AppData\Local\Temp\pip-wheel-qolmqftb'
cwd: C:\Users\עמרי\AppData\Local\Temp\pip-req-build-9c7t39q3\
Complete output (72 lines):
running bdist_wheel
running build
running build_py
running build_frontend
Starting a Gradle Daemon, 1 incompatible and 1 stopped Daemons could not be reused, use --status for details
> Task :modelarchive:clean UP-TO-DATE
> Task :server:killServer
No server running!
> Task :server:clean UP-TO-DATE
> Task :modelarchive:compileJava
> Task :modelarchive:processResources NO-SOURCE
> Task :modelarchive:classes
> Task :modelarchive:jar
> Task :modelarchive:assemble
> Task :server:extractIncludeProto
> Task :server:extractProto
> Task :server:generateProto FAILED
FAILURE: Build failed with an exception.
* What went wrong:
Execution failed for task ':server:generateProto'.
> protoc: stdout: . stderr: C:\Users\עמרי\AppData\Local\Temp\pip-req-build-9c7t39q3\frontend\server\src\main\resources\proto: warning: directory does not exist.
C:\Users\עמרי\AppData\Local\Temp\pip-req-build-9c7t39q3\frontend\server\build\extracted-protos\main: warning: directory does not exist.
C:\Users\עמרי\AppData\Local\Temp\pip-req-build-9c7t39q3\frontend\server\build\extracted-include-protos\main: warning: directory does not exist.
Could not make proto path relative: C:\Users\עמרי\AppData\Local\Temp\pip-req-build-9c7t39q3\frontend\server\src\main\resources\proto\inference.proto: No such file or directory
* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.
* Get more help at https://help.gradle.org
BUILD FAILED in 15s
8 actionable tasks: 6 executed, 2 up-to-date
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\עמרי\AppData\Local\Temp\pip-req-build-9c7t39q3\setup.py", line 142, in <module>
setup(
File "D:\Programming\anaconda3\envs\torchserve\lib\site-packages\setuptools\__init__.py", line 153, in setup
return distutils.core.setup(**attrs)
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\core.py", line 148, in setup
dist.run_commands()
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\dist.py", line 966, in run_commands
self.run_command(cmd)
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "D:\Programming\anaconda3\envs\torchserve\lib\site-packages\wheel\bdist_wheel.py", line 299, in run
self.run_command('build')
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\command\build.py", line 135, in run
self.run_command(cmd_name)
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "C:\Users\עמרי\AppData\Local\Temp\pip-req-build-9c7t39q3\setup.py", line 103, in run
self.run_command('build_frontend')
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "C:\Users\עמרי\AppData\Local\Temp\pip-req-build-9c7t39q3\setup.py", line 90, in run
subprocess.check_call(build_frontend_command[platform.system()], shell=True)
File "D:\Programming\anaconda3\envs\torchserve\lib\subprocess.py", line 364, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '.\frontend\gradlew.bat -p frontend clean assemble' returned non-zero exit status 1.
----------------------------------------
ERROR: Failed building wheel for torchserve
Running setup.py clean for torchserve
Failed to build torchserve
Installing collected packages: torchserve
Running setup.py install for torchserve ... error
ERROR: Command errored out with exit status 1:
command: 'D:\Programming\anaconda3\envs\torchserve\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\עמרי\\AppData\\Local\\Temp\\pip-req-build-9c7t39q3\\setup.py'"'"'; __file__='"'"'C:\\Users\\עמרי\\AppData\\Local\\Temp\\pip-req-build-9c7t39q3\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\עמרי\AppData\Local\Temp\pip-record-wb5a5png\install-record.txt' --single-version-externally-managed --compile --install-headers 'D:\Programming\anaconda3\envs\torchserve\Include\torchserve'
cwd: C:\Users\עמרי\AppData\Local\Temp\pip-req-build-9c7t39q3\
Complete output (73 lines):
running install
running build
running build_py
running build_frontend
> Task :modelarchive:clean
> Task :server:killServer
No server running!
> Task :server:clean
> Task :modelarchive:compileJava
> Task :modelarchive:processResources NO-SOURCE
> Task :modelarchive:classes
> Task :modelarchive:jar
> Task :modelarchive:assemble
> Task :server:extractIncludeProto
> Task :server:extractProto
> Task :server:generateProto FAILED
FAILURE: Build failed with an exception.
* What went wrong:
Execution failed for task ':server:generateProto'.
> protoc: stdout: . stderr: C:\Users\עמרי\AppData\Local\Temp\pip-req-build-9c7t39q3\frontend\server\src\main\resources\proto: warning: directory does not exist.
C:\Users\עמרי\AppData\Local\Temp\pip-req-build-9c7t39q3\frontend\server\build\extracted-protos\main: warning: directory does not exist.
C:\Users\עמרי\AppData\Local\Temp\pip-req-build-9c7t39q3\frontend\server\build\extracted-include-protos\main: warning: directory does not exist.
Could not make proto path relative: C:\Users\עמרי\AppData\Local\Temp\pip-req-build-9c7t39q3\frontend\server\src\main\resources\proto\inference.proto: No such file or directory
* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.
* Get more help at https://help.gradle.org
BUILD FAILED in 1s
8 actionable tasks: 8 executed
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\עמרי\AppData\Local\Temp\pip-req-build-9c7t39q3\setup.py", line 142, in <module>
setup(
File "D:\Programming\anaconda3\envs\torchserve\lib\site-packages\setuptools\__init__.py", line 153, in setup
return distutils.core.setup(**attrs)
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\core.py", line 148, in setup
dist.run_commands()
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\dist.py", line 966, in run_commands
self.run_command(cmd)
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "D:\Programming\anaconda3\envs\torchserve\lib\site-packages\setuptools\command\install.py", line 61, in run
return orig.install.run(self)
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\command\install.py", line 545, in run
self.run_command('build')
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\command\build.py", line 135, in run
self.run_command(cmd_name)
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "C:\Users\עמרי\AppData\Local\Temp\pip-req-build-9c7t39q3\setup.py", line 103, in run
self.run_command('build_frontend')
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "D:\Programming\anaconda3\envs\torchserve\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "C:\Users\עמרי\AppData\Local\Temp\pip-req-build-9c7t39q3\setup.py", line 90, in run
subprocess.check_call(build_frontend_command[platform.system()], shell=True)
File "D:\Programming\anaconda3\envs\torchserve\lib\subprocess.py", line 364, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '.\frontend\gradlew.bat -p frontend clean assemble' returned non-zero exit status 1.
----------------------------------------
ERROR: Command errored out with exit status 1: 'D:\Programming\anaconda3\envs\torchserve\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\עמרי\\AppData\\Local\\Temp\\pip-req-build-9c7t39q3\\setup.py'"'"'; __file__='"'"'C:\\Users\\עמרי\\AppData\\Local\\Temp\\pip-req-build-9c7t39q3\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\עמרי\AppData\Local\Temp\pip-record-wb5a5png\install-record.txt' --single-version-externally-managed --compile --install-headers 'D:\Programming\anaconda3\envs\torchserve\Include\torchserve' Check the logs for full command output.
## Cleaning build residuals (__pycache__)
## Removing - D:\Programming\JetBrains\PycharmProjects\DocumentTagger\serve\ts_scripts\__pycache__
Any idea what I can do next?
I just wanted to do something simple, and it seems like I'm running into every possible bug in the book along the way XD But I'm not going to give up, I WILL get it to work!
Thanks a lot for all the help!
@OmriPi: I tried this on Windows 10 Pro and was not able to reproduce it.
From the above error, it looks like the build is not able to find the proto files while generating the server-side stubs.
Execution failed for task ':server:generateProto'.
> protoc: stdout: . stderr: C:\Users\עמרי\AppData\Local\Temp\pip-req-build-9c7t39q3\frontend\server\src\main\resources\proto: warning: directory does not exist.
C:\Users\עמרי\AppData\Local\Temp\pip-req-build-9c7t39q3\frontend\server\build\extracted-protos\main: warning: directory does not exist.
C:\Users\עמרי\AppData\Local\Temp\pip-req-build-9c7t39q3\frontend\server\build\extracted-include-protos\main: warning: directory does not exist.
Could not make proto path relative: C:\Users\עמרי\AppData\Local\Temp\pip-req-build-9c7t39q3\frontend\server\src\main\resources\proto\inference.proto: No such file or directory
Could you please recheck that you followed all the steps documented here?
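For reference, on a fresh clone the install-from-source flow looks roughly like this (the exact flags may differ between branches, so treat this as a sketch):
git clone https://github.com/pytorch/serve.git
cd serve
python .\ts_scripts\install_dependencies.py --environment=dev
python .\ts_scripts\install_from_src.py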
If this still doesn't get solved, please create a separate ticket to track this. If your original query has been solved, can we close this ticket?
@harshbafna Thanks, I am currently running Windows 10 Pro (Hebrew language, if it matters). I do not know what those proto files are; how can I make them appear there? I have rechecked to make sure that I did everything as instructed; the only step I was missing was installing Node.js, but after installing it, it made no difference and the wheel build still fails at the same spot. The only difference is that I am using the normal Anaconda prompt to execute the commands instead of the Anaconda PowerShell prompt, because the latter seems to be bugged on Hebrew Windows (I can't change envs in the PowerShell Anaconda prompt).
I agree we have gone off topic here; however, my original query is still unsolved, since the end goal is running a docker container with my model, and it seems that this requires installing from source, which fails in my case. Should I create a new issue even though this one hasn't been solved yet?
Also, any other ideas on what I could do here to solve this problem? Thanks!
The only difference is that I am using the normal Anaconda prompt to execute the commands instead of the Anaconda PowerShell prompt, because the latter seems to be bugged on Hebrew Windows (I can't change envs in the PowerShell Anaconda prompt).
For Windows, you should use the PowerShell prompt started with Administrator permissions.
I do not know what those proto files are; how can I make them appear there?
The path exists on the latest master: https://github.com/pytorch/serve/tree/master/frontend/server/src/main/resources/proto
The Gradle tool generates the gRPC server-side stubs using these files.
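If you want to isolate the Gradle step from pip, you can also run the same command the setup script invokes (it is the one shown in your log) directly from your serve checkout, where the proto directory actually exists:
.\frontend\gradlew.bat -p frontend clean assemble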
Also any other ideas on what I could do here to solve this problem?
I am not too sure about the root cause here; it is probably some file-permission issue because you are using the normal shell. Another possibility is a Gradle issue with path resolution in your Windows environment.
One more thing I just noticed is the Unicode characters in the temp path, which could be causing the issue with the Gradle build:
C:\Users\עמרי\AppData\Local
Could you try this with another user (a non-Unicode username) with admin rights?
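Alternatively, as a quicker check than creating a new user, you could try pointing the temp directory at an ASCII-only path before installing; pip picks its build directory from the standard Windows TMP/TEMP variables, so something along these lines (C:\piptmp is just an example path) should keep Unicode out of the Gradle build paths:
mkdir C:\piptmp
set TMP=C:\piptmp
set TEMP=C:\piptmp
python .\ts_scripts\install_from_src.py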
Should I create a new issue even though this one hasn't been solved yet?
The install-from-source failure is definitely blocking you from building the docker image from source. However, we have never tested the docker files in a Windows environment.
@harshbafna
For Windows, you should use the PowerShell prompt started with Administrator permissions.
OK, I'll try to somehow get it working if it's mandatory. I did give admin permissions to the normal Anaconda prompt, if that matters.
The path exists on the latest master: https://github.com/pytorch/serve/tree/master/frontend/server/src/main/resources/proto The Gradle tool generates the gRPC server-side stubs using these files.
Would manually copying those files to the location mentioned in the error report (where it says they're missing) make sense, then? Could that be an ugly but working fix?
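(For example, I could first check whether the files are present in my checkout with something like dir frontend\server\src\main\resources\proto before running the install.)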
I am not too sure about the root cause here; it is probably some file-permission issue because you are using the normal shell. Another possibility is a Gradle issue with path resolution in your Windows environment.
If it has admin privileges, I don't think it should run into permission issues. A path-resolution problem, however, would make sense in this case.
One more thing I just noticed is the Unicode characters in the temp path, which could be causing the issue with the Gradle build: C:\Users\עמרי\AppData\Local Could you try this with another user (a non-Unicode username) with admin rights?
Yes, I will make one and try that; it could be the reason. The never-ending nightmare of Hebrew and programming...
The install-from-source failure is definitely blocking you from building the docker image from source. However, we have never tested the docker files in a Windows environment.
My problem isn't necessarily building a docker image from source, but using custom dependencies in the docker image (they fail to install, as previously mentioned). I only wanted to install from source because you said there is a bug with custom dependency installation in the PyPI version that has been fixed in 0.3.
I'll try using another user and update on the results. Thanks!
@harshbafna OK, I managed to install from source successfully by creating a new user with an English-only name. Apparently Gradle runs into problems if the path contains Unicode characters. I built the docker image (with GPU support, from the master branch) according to the instructions in the link you gave. I'm getting the following output:
(torchserve) D:\Programming\JetBrains\PycharmProjects\DocumentTagger>docker run --rm -it -p 8080:8080 -p 8081:8081 -v %cd%/model_store:/home/model-server/model-store -v %cd%/config.properties:/home/model-server/config.properties omripi/torchserve:0.3 torchserve --start --model-store model-store --models DocTag=DocTag.mar
2020-12-20 17:32:41,459 [INFO ] main org.pytorch.serve.ModelServer -
Torchserve version: 0.3.0
TS Home: /usr/local/lib/python3.6/dist-packages
Current directory: /home/model-server
Temp directory: /home/model-server/tmp
Number of GPUs: 0
Number of CPUs: 8
Max heap size: 3168 M
Python executable: /usr/bin/python3
Config file: config.properties
Inference address: http://127.0.0.1:8080
Management address: http://127.0.0.1:8081
Metrics address: http://127.0.0.1:8082
Model Store: /home/model-server/model-store
Initial Models: DocTag=DocTag.mar
Log dir: /home/model-server/logs
Metrics dir: /home/model-server/logs
Netty threads: 0
Netty client threads: 0
Default workers per model: 8
Blacklist Regex: N/A
Maximum Response Size: 6553500
Maximum Request Size: 6553500
Prefer direct buffer: false
Allowed Urls: [file://.*|http(s)?://.*]
Custom python dependency for model allowed: true
Metrics report format: prometheus
Enable metrics API: true
2020-12-20 17:32:41,465 [INFO ] main org.pytorch.serve.ModelServer - Loading initial models: DocTag.mar
2020-12-20 17:34:59,602 [INFO ] main org.pytorch.serve.archive.ModelArchive - eTag c6ebbedd0b304670b803e929a57bdc34
2020-12-20 17:34:59,614 [DEBUG] main org.pytorch.serve.wlm.ModelVersionedRefs - Adding new version 1.0 for model DocTag
2020-12-20 17:34:59,614 [DEBUG] main org.pytorch.serve.wlm.ModelVersionedRefs - Setting default version to 1.0 for model DocTag
2020-12-20 17:34:59,614 [INFO ] main org.pytorch.serve.wlm.ModelManager - Model DocTag loaded.
2020-12-20 17:35:37,956 [DEBUG] main org.pytorch.serve.wlm.ModelManager - updateModel: DocTag, count: 8
2020-12-20 17:35:37,996 [INFO ] main org.pytorch.serve.ModelServer - Initialize Inference server with: EpollServerSocketChannel.
2020-12-20 17:35:38,155 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Listening on port: /home/model-server/tmp/.ts.sock.9000
2020-12-20 17:35:38,155 [INFO ] W-9006-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Listening on port: /home/model-server/tmp/.ts.sock.9006
2020-12-20 17:35:38,155 [INFO ] W-9005-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Listening on port: /home/model-server/tmp/.ts.sock.9005
2020-12-20 17:35:38,155 [INFO ] W-9004-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Listening on port: /home/model-server/tmp/.ts.sock.9004
2020-12-20 17:35:38,155 [INFO ] W-9002-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Listening on port: /home/model-server/tmp/.ts.sock.9002
2020-12-20 17:35:38,170 [INFO ] W-9006-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - [PID]60
2020-12-20 17:35:38,170 [INFO ] W-9005-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - [PID]53
2020-12-20 17:35:38,171 [INFO ] W-9006-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Torch worker started.
2020-12-20 17:35:38,171 [INFO ] W-9005-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Torch worker started.
2020-12-20 17:35:38,171 [INFO ] W-9006-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Python runtime: 3.6.9
2020-12-20 17:35:38,172 [INFO ] W-9002-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - [PID]52
2020-12-20 17:35:38,172 [INFO ] W-9005-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Python runtime: 3.6.9
2020-12-20 17:35:38,171 [INFO ] W-9004-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - [PID]54
2020-12-20 17:35:38,171 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - [PID]49
2020-12-20 17:35:38,175 [INFO ] W-9002-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Torch worker started.
2020-12-20 17:35:38,176 [DEBUG] W-9005-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - W-9005-DocTag_1.0 State change null -> WORKER_STARTED
2020-12-20 17:35:38,178 [DEBUG] W-9002-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - W-9002-DocTag_1.0 State change null -> WORKER_STARTED
2020-12-20 17:35:38,178 [DEBUG] W-9000-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - W-9000-DocTag_1.0 State change null -> WORKER_STARTED
2020-12-20 17:35:38,178 [DEBUG] W-9004-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - W-9004-DocTag_1.0 State change null -> WORKER_STARTED
2020-12-20 17:35:38,176 [DEBUG] W-9006-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - W-9006-DocTag_1.0 State change null -> WORKER_STARTED
2020-12-20 17:35:38,178 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Torch worker started.
2020-12-20 17:35:38,179 [INFO ] W-9002-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Python runtime: 3.6.9
2020-12-20 17:35:38,182 [INFO ] W-9003-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Listening on port: /home/model-server/tmp/.ts.sock.9003
2020-12-20 17:35:38,182 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Python runtime: 3.6.9
2020-12-20 17:35:38,178 [INFO ] W-9004-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Torch worker started.
2020-12-20 17:35:38,182 [INFO ] W-9004-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Python runtime: 3.6.9
2020-12-20 17:35:38,184 [INFO ] W-9001-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Listening on port: /home/model-server/tmp/.ts.sock.9001
2020-12-20 17:35:38,185 [INFO ] W-9003-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - [PID]56
2020-12-20 17:35:38,185 [INFO ] W-9003-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Torch worker started.
2020-12-20 17:35:38,185 [INFO ] W-9003-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Python runtime: 3.6.9
2020-12-20 17:35:38,186 [DEBUG] W-9003-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - W-9003-DocTag_1.0 State change null -> WORKER_STARTED
2020-12-20 17:35:38,188 [INFO ] W-9001-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - [PID]51
2020-12-20 17:35:38,189 [INFO ] W-9001-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Torch worker started.
2020-12-20 17:35:38,189 [DEBUG] W-9001-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - W-9001-DocTag_1.0 State change null -> WORKER_STARTED
2020-12-20 17:35:38,189 [INFO ] W-9001-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Python runtime: 3.6.9
2020-12-20 17:35:38,191 [INFO ] W-9007-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Listening on port: /home/model-server/tmp/.ts.sock.9007
2020-12-20 17:35:38,193 [INFO ] W-9007-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - [PID]64
2020-12-20 17:35:38,194 [INFO ] W-9007-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Torch worker started.
2020-12-20 17:35:38,194 [DEBUG] W-9007-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - W-9007-DocTag_1.0 State change null -> WORKER_STARTED
2020-12-20 17:35:38,194 [INFO ] W-9007-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Python runtime: 3.6.9
2020-12-20 17:35:38,203 [INFO ] W-9000-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.ts.sock.9000
2020-12-20 17:35:38,204 [INFO ] W-9004-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.ts.sock.9004
2020-12-20 17:35:38,204 [INFO ] W-9007-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.ts.sock.9007
2020-12-20 17:35:38,204 [INFO ] W-9006-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.ts.sock.9006
2020-12-20 17:35:38,203 [INFO ] W-9001-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.ts.sock.9001
2020-12-20 17:35:38,203 [INFO ] W-9003-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.ts.sock.9003
2020-12-20 17:35:38,204 [INFO ] W-9005-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.ts.sock.9005
2020-12-20 17:35:38,204 [INFO ] W-9002-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.ts.sock.9002
2020-12-20 17:35:38,275 [INFO ] main org.pytorch.serve.ModelServer - Inference API bind to: http://127.0.0.1:8080
2020-12-20 17:35:38,275 [INFO ] main org.pytorch.serve.ModelServer - Initialize Management server with: EpollServerSocketChannel.
2020-12-20 17:35:38,278 [INFO ] main org.pytorch.serve.ModelServer - Management API bind to: http://127.0.0.1:8081
2020-12-20 17:35:38,278 [INFO ] main org.pytorch.serve.ModelServer - Initialize Metrics server with: EpollServerSocketChannel.
2020-12-20 17:35:38,283 [INFO ] main org.pytorch.serve.ModelServer - Metrics API bind to: http://127.0.0.1:8082
2020-12-20 17:35:38,288 [INFO ] W-9002-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.ts.sock.9002.
2020-12-20 17:35:38,288 [INFO ] W-9000-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.ts.sock.9000.
2020-12-20 17:35:38,288 [INFO ] W-9006-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.ts.sock.9006.
2020-12-20 17:35:38,288 [INFO ] W-9007-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.ts.sock.9007.
2020-12-20 17:35:38,290 [INFO ] W-9004-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.ts.sock.9004.
2020-12-20 17:35:38,292 [INFO ] W-9005-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.ts.sock.9005.
2020-12-20 17:35:38,297 [INFO ] W-9003-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.ts.sock.9003.
2020-12-20 17:35:38,300 [INFO ] W-9001-DocTag_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.ts.sock.9001.
Model server started.
2020-12-20 17:35:39,041 [INFO ] pool-2-thread-1 TS_METRICS - CPUUtilization.Percent:0.0|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485739
2020-12-20 17:35:39,043 [INFO ] pool-2-thread-1 TS_METRICS - DiskAvailable.Gigabytes:228.7512969970703|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485739
2020-12-20 17:35:39,044 [INFO ] pool-2-thread-1 TS_METRICS - DiskUsage.Gigabytes:9.415565490722656|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485739
2020-12-20 17:35:39,045 [INFO ] pool-2-thread-1 TS_METRICS - DiskUtilization.Percent:4.0|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485739
2020-12-20 17:35:39,046 [INFO ] pool-2-thread-1 TS_METRICS - MemoryAvailable.Megabytes:10829.07421875|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485739
2020-12-20 17:35:39,047 [INFO ] pool-2-thread-1 TS_METRICS - MemoryUsed.Megabytes:1318.953125|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485739
2020-12-20 17:35:39,048 [INFO ] pool-2-thread-1 TS_METRICS - MemoryUtilization.Percent:14.5|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485739
2020-12-20 17:35:39,775 [WARN ] W-9004-DocTag_1.0-stderr org.pytorch.serve.wlm.WorkerLifeCycle - /usr/local/lib/python3.6/dist-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:100.)
2020-12-20 17:35:39,775 [WARN ] W-9004-DocTag_1.0-stderr org.pytorch.serve.wlm.WorkerLifeCycle - return torch._C._cuda_getDeviceCount() > 0
2020-12-20 17:35:39,814 [WARN ] W-9002-DocTag_1.0-stderr org.pytorch.serve.wlm.WorkerLifeCycle - /usr/local/lib/python3.6/dist-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:100.)
2020-12-20 17:35:39,815 [WARN ] W-9002-DocTag_1.0-stderr org.pytorch.serve.wlm.WorkerLifeCycle - return torch._C._cuda_getDeviceCount() > 0
2020-12-20 17:35:39,824 [WARN ] W-9005-DocTag_1.0-stderr org.pytorch.serve.wlm.WorkerLifeCycle - /usr/local/lib/python3.6/dist-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:100.)
2020-12-20 17:35:39,824 [WARN ] W-9003-DocTag_1.0-stderr org.pytorch.serve.wlm.WorkerLifeCycle - /usr/local/lib/python3.6/dist-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:100.)
2020-12-20 17:35:39,824 [WARN ] W-9005-DocTag_1.0-stderr org.pytorch.serve.wlm.WorkerLifeCycle - return torch._C._cuda_getDeviceCount() > 0
2020-12-20 17:35:39,824 [WARN ] W-9003-DocTag_1.0-stderr org.pytorch.serve.wlm.WorkerLifeCycle - return torch._C._cuda_getDeviceCount() > 0
2020-12-20 17:35:39,825 [WARN ] W-9007-DocTag_1.0-stderr org.pytorch.serve.wlm.WorkerLifeCycle - /usr/local/lib/python3.6/dist-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:100.)
2020-12-20 17:35:39,825 [WARN ] W-9007-DocTag_1.0-stderr org.pytorch.serve.wlm.WorkerLifeCycle - return torch._C._cuda_getDeviceCount() > 0
2020-12-20 17:35:39,862 [WARN ] W-9006-DocTag_1.0-stderr org.pytorch.serve.wlm.WorkerLifeCycle - /usr/local/lib/python3.6/dist-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:100.)
2020-12-20 17:35:39,863 [WARN ] W-9006-DocTag_1.0-stderr org.pytorch.serve.wlm.WorkerLifeCycle - return torch._C._cuda_getDeviceCount() > 0
2020-12-20 17:35:39,872 [WARN ] W-9000-DocTag_1.0-stderr org.pytorch.serve.wlm.WorkerLifeCycle - /usr/local/lib/python3.6/dist-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:100.)
2020-12-20 17:35:39,872 [WARN ] W-9000-DocTag_1.0-stderr org.pytorch.serve.wlm.WorkerLifeCycle - return torch._C._cuda_getDeviceCount() > 0
2020-12-20 17:35:39,891 [WARN ] W-9001-DocTag_1.0-stderr org.pytorch.serve.wlm.WorkerLifeCycle - /usr/local/lib/python3.6/dist-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx (Triggered internally at /pytorch/c10/cuda/CUDAFunctions.cpp:100.)
2020-12-20 17:35:39,892 [WARN ] W-9001-DocTag_1.0-stderr org.pytorch.serve.wlm.WorkerLifeCycle - return torch._C._cuda_getDeviceCount() > 0
2020-12-20 17:36:07,312 [INFO ] W-9000-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - Backend response time: 28895
2020-12-20 17:36:07,312 [INFO ] W-9002-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - Backend response time: 28914
2020-12-20 17:36:07,312 [INFO ] W-9004-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - Backend response time: 28901
2020-12-20 17:36:07,312 [INFO ] W-9003-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - Backend response time: 28914
2020-12-20 17:36:07,312 [INFO ] W-9005-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - Backend response time: 28914
2020-12-20 17:36:07,312 [INFO ] W-9006-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - Backend response time: 28926
2020-12-20 17:36:07,312 [INFO ] W-9001-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - Backend response time: 28895
2020-12-20 17:36:07,312 [INFO ] W-9007-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - Backend response time: 28919
2020-12-20 17:36:07,338 [DEBUG] W-9004-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - W-9004-DocTag_1.0 State change WORKER_STARTED -> WORKER_MODEL_LOADED
2020-12-20 17:36:07,338 [DEBUG] W-9007-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - W-9007-DocTag_1.0 State change WORKER_STARTED -> WORKER_MODEL_LOADED
2020-12-20 17:36:07,338 [DEBUG] W-9001-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - W-9001-DocTag_1.0 State change WORKER_STARTED -> WORKER_MODEL_LOADED
2020-12-20 17:36:07,338 [DEBUG] W-9000-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - W-9000-DocTag_1.0 State change WORKER_STARTED -> WORKER_MODEL_LOADED
2020-12-20 17:36:07,338 [DEBUG] W-9003-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - W-9003-DocTag_1.0 State change WORKER_STARTED -> WORKER_MODEL_LOADED
2020-12-20 17:36:07,338 [DEBUG] W-9005-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - W-9005-DocTag_1.0 State change WORKER_STARTED -> WORKER_MODEL_LOADED
2020-12-20 17:36:07,339 [DEBUG] W-9002-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - W-9002-DocTag_1.0 State change WORKER_STARTED -> WORKER_MODEL_LOADED
2020-12-20 17:36:07,343 [DEBUG] W-9006-DocTag_1.0 org.pytorch.serve.wlm.WorkerThread - W-9006-DocTag_1.0 State change WORKER_STARTED -> WORKER_MODEL_LOADED
2020-12-20 17:36:07,341 [INFO ] W-9004-DocTag_1.0 TS_METRICS - W-9004-DocTag_1.0.ms:29370|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485767
2020-12-20 17:36:07,345 [INFO ] W-9006-DocTag_1.0 TS_METRICS - W-9006-DocTag_1.0.ms:29376|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485767
2020-12-20 17:36:07,341 [INFO ] W-9000-DocTag_1.0 TS_METRICS - W-9000-DocTag_1.0.ms:29376|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485767
2020-12-20 17:36:07,341 [INFO ] W-9007-DocTag_1.0 TS_METRICS - W-9007-DocTag_1.0.ms:29367|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485767
2020-12-20 17:36:07,341 [INFO ] W-9003-DocTag_1.0 TS_METRICS - W-9003-DocTag_1.0.ms:29372|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485767
2020-12-20 17:36:07,341 [INFO ] W-9005-DocTag_1.0 TS_METRICS - W-9005-DocTag_1.0.ms:29371|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485767
2020-12-20 17:36:07,341 [INFO ] W-9002-DocTag_1.0 TS_METRICS - W-9002-DocTag_1.0.ms:29372|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485767
2020-12-20 17:36:07,341 [INFO ] W-9001-DocTag_1.0 TS_METRICS - W-9001-DocTag_1.0.ms:29373|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485767
2020-12-20 17:36:07,351 [INFO ] W-9002-DocTag_1.0 TS_METRICS - WorkerThreadTime.ms:135|#Level:Host|#hostname:e53a17aeab56,timestamp:null
2020-12-20 17:36:07,351 [INFO ] W-9003-DocTag_1.0 TS_METRICS - WorkerThreadTime.ms:134|#Level:Host|#hostname:e53a17aeab56,timestamp:null
2020-12-20 17:36:07,351 [INFO ] W-9000-DocTag_1.0 TS_METRICS - WorkerThreadTime.ms:153|#Level:Host|#hostname:e53a17aeab56,timestamp:null
2020-12-20 17:36:07,351 [INFO ] W-9007-DocTag_1.0 TS_METRICS - WorkerThreadTime.ms:129|#Level:Host|#hostname:e53a17aeab56,timestamp:null
2020-12-20 17:36:07,351 [INFO ] W-9005-DocTag_1.0 TS_METRICS - WorkerThreadTime.ms:136|#Level:Host|#hostname:e53a17aeab56,timestamp:null
2020-12-20 17:36:07,351 [INFO ] W-9006-DocTag_1.0 TS_METRICS - WorkerThreadTime.ms:124|#Level:Host|#hostname:e53a17aeab56,timestamp:null
2020-12-20 17:36:07,351 [INFO ] W-9001-DocTag_1.0 TS_METRICS - WorkerThreadTime.ms:152|#Level:Host|#hostname:e53a17aeab56,timestamp:null
2020-12-20 17:36:07,351 [INFO ] W-9004-DocTag_1.0 TS_METRICS - WorkerThreadTime.ms:148|#Level:Host|#hostname:e53a17aeab56,timestamp:null
2020-12-20 17:36:38,827 [INFO ] pool-2-thread-1 TS_METRICS - CPUUtilization.Percent:0.0|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485798
2020-12-20 17:36:38,827 [INFO ] pool-2-thread-1 TS_METRICS - DiskAvailable.Gigabytes:228.75127029418945|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485798
2020-12-20 17:36:38,827 [INFO ] pool-2-thread-1 TS_METRICS - DiskUsage.Gigabytes:9.415592193603516|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485798
2020-12-20 17:36:38,828 [INFO ] pool-2-thread-1 TS_METRICS - DiskUtilization.Percent:4.0|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485798
2020-12-20 17:36:38,828 [INFO ] pool-2-thread-1 TS_METRICS - MemoryAvailable.Megabytes:7079.296875|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485798
2020-12-20 17:36:38,829 [INFO ] pool-2-thread-1 TS_METRICS - MemoryUsed.Megabytes:5050.75390625|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485798
2020-12-20 17:36:38,830 [INFO ] pool-2-thread-1 TS_METRICS - MemoryUtilization.Percent:44.1|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485798
2020-12-20 17:37:38,813 [INFO ] pool-2-thread-1 TS_METRICS - CPUUtilization.Percent:0.0|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485858
2020-12-20 17:37:38,813 [INFO ] pool-2-thread-1 TS_METRICS - DiskAvailable.Gigabytes:228.75127029418945|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485858
2020-12-20 17:37:38,814 [INFO ] pool-2-thread-1 TS_METRICS - DiskUsage.Gigabytes:9.415592193603516|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485858
2020-12-20 17:37:38,815 [INFO ] pool-2-thread-1 TS_METRICS - DiskUtilization.Percent:4.0|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485858
2020-12-20 17:37:38,815 [INFO ] pool-2-thread-1 TS_METRICS - MemoryAvailable.Megabytes:7079.921875|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485858
2020-12-20 17:37:38,815 [INFO ] pool-2-thread-1 TS_METRICS - MemoryUsed.Megabytes:5050.2734375|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485858
2020-12-20 17:37:38,815 [INFO ] pool-2-thread-1 TS_METRICS - MemoryUtilization.Percent:44.1|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485858
2020-12-20 17:38:38,805 [INFO ] pool-2-thread-1 TS_METRICS - CPUUtilization.Percent:0.0|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485918
2020-12-20 17:38:38,806 [INFO ] pool-2-thread-1 TS_METRICS - DiskAvailable.Gigabytes:228.75126266479492|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485918
2020-12-20 17:38:38,808 [INFO ] pool-2-thread-1 TS_METRICS - DiskUsage.Gigabytes:9.415599822998047|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485918
2020-12-20 17:38:38,809 [INFO ] pool-2-thread-1 TS_METRICS - DiskUtilization.Percent:4.0|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485918
2020-12-20 17:38:38,809 [INFO ] pool-2-thread-1 TS_METRICS - MemoryAvailable.Megabytes:7078.80078125|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485918
2020-12-20 17:38:38,809 [INFO ] pool-2-thread-1 TS_METRICS - MemoryUsed.Megabytes:5051.53125|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485918
2020-12-20 17:38:38,809 [INFO ] pool-2-thread-1 TS_METRICS - MemoryUtilization.Percent:44.1|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485918
2020-12-20 17:39:38,834 [INFO ] pool-2-thread-1 TS_METRICS - CPUUtilization.Percent:0.0|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485978
2020-12-20 17:39:38,834 [INFO ] pool-2-thread-1 TS_METRICS - DiskAvailable.Gigabytes:228.75126266479492|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485978
2020-12-20 17:39:38,835 [INFO ] pool-2-thread-1 TS_METRICS - DiskUsage.Gigabytes:9.415599822998047|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485978
2020-12-20 17:39:38,837 [INFO ] pool-2-thread-1 TS_METRICS - DiskUtilization.Percent:4.0|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485978
2020-12-20 17:39:38,837 [INFO ] pool-2-thread-1 TS_METRICS - MemoryAvailable.Megabytes:7078.46484375|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485978
2020-12-20 17:39:38,837 [INFO ] pool-2-thread-1 TS_METRICS - MemoryUsed.Megabytes:5051.8046875|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485978
2020-12-20 17:39:38,837 [INFO ] pool-2-thread-1 TS_METRICS - MemoryUtilization.Percent:44.1|#Level:Host|#hostname:e53a17aeab56,timestamp:1608485978
2020-12-20 17:40:38,809 [INFO ] pool-2-thread-2 TS_METRICS - CPUUtilization.Percent:0.0|#Level:Host|#hostname:e53a17aeab56,timestamp:1608486038
2020-12-20 17:40:38,810 [INFO ] pool-2-thread-2 TS_METRICS - DiskAvailable.Gigabytes:228.75126266479492|#Level:Host|#hostname:e53a17aeab56,timestamp:1608486038
It looks like everything is in order except for that CUDA error, which seems to appear because docker can't use my GPU. It's not a big deal at the moment, so I'll fix it later. The question I have is: will this error prevent the container from working? Because when I try to test the container by sending it text files, I'm getting empty responses:
PS C:\Users\עמרי\Desktop> curl.exe -X POST localhost:8080/predictions/DocTag -T example.txt
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 4432 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
curl: (52) Empty reply from server
This exact command previously worked to at least get a response from the server, but this time the response is empty. I also do not see any log on the container side indicating it has received an incoming request. Is this happening because of the CUDA error that prevents everything from working? Or is something in my container broken?
I built the image with the command `./build_image.sh -bt dev -g`, and since the server output looks normal the build seems to have worked, so I don't know why it isn't at least responding.
Thanks!
It looks like everything is in order except for that CUDA error, which seems to appear because docker can't use my GPU. It's not a big deal at the moment, so I'll fix it later.
You can use the `--gpus all` flag while starting the docker container to allow access to the GPUs.
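For example, adapting your earlier command (note that GPU access from Docker Desktop on Windows also requires the WSL2 backend with NVIDIA's GPU support):
docker run --rm -it --gpus all -p 8080:8080 -p 8081:8081 -v %cd%/model_store:/home/model-server/model-store -v %cd%/config.properties:/home/model-server/config.properties omripi/torchserve:0.3 torchserve --start --model-store model-store --models DocTag=DocTag.mar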
This exact command previously worked to at least get a response from the server, but this time the response is empty. I also do not see any log on the container side indicating it has received an incoming request. Is this happening because of the CUDA error that prevents everything from working? Or is something in my container broken?
You have built the container correctly, and TorchServe has started normally as well. The CUDA-related messages are just warnings, not errors, and starting the container with the above flag should resolve that problem too.
The empty-reply error generally means there is some proxy server or firewall running on your machine, and as a result the requests are timing out.
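One way to narrow it down is to hit TorchServe's ping endpoint from inside the container and compare it with the same request from the host:
docker exec -it <container-id> curl http://127.0.0.1:8080/ping
Also note that your startup log shows the APIs bound to http://127.0.0.1, which is only reachable from inside the container; for the published ports to work from the host, the addresses in config.properties generally need to bind all interfaces, e.g.:
inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081
metrics_address=http://0.0.0.0:8082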
Closing due to no activity. Please re-open if still facing any issue with docker and/or install from source.
After getting the model to work locally (i.e. running torchserve directly on my machine without using docker), I'm now facing an issue when trying to serve the model through the torchserve docker container. The command I use to run the docker container is:
docker run --rm -it -p 8080:8080 -p 8081:8081 -v %cd%/model_store:/home/model-server/model-store pytorch/torchserve:latest torchserve --start --model-store model-store --models DocTag=DocTag.mar
And I'm getting the following error log:
It seems that after the model is successfully loaded and the server has started, something kills the workers for some reason. This is the exact same .mar file that works when running torchserve locally. I'm using Docker Desktop for Windows with the official torchserve docker container. When using the CLI of the docker container I can see that the .mar file is properly shared (it appears in /model-store). I've also tried launching the torchserve docker container without registering any model and then using the management API to register the model: the model was successfully loaded, without any workers, but when I used the scale-workers command to add a worker, this error happened again. I don't know if that makes a difference, but it might help in pinpointing the source of the problem.
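The management-API calls I used were along these lines (model name as in my setup):
curl -X POST "http://localhost:8081/models?url=DocTag.mar"
curl -X PUT "http://localhost:8081/models/DocTag?min_worker=1"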
I've been trying to figure out the problem here for a few days now with no success, can anyone help me here?
Thanks!