jina-ai / clip-as-service

🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP
https://clip-as-service.jina.ai

worker never starts #238

Open Kup3a opened 5 years ago

Kup3a commented 5 years ago

System information

On my local machine everything works fine, but I can't run BERT inside Docker.


Description

I'm using this command to start the server:

docker run --rm -it -p 127.0.0.1:8125:8125 test_bert_local

test_bert_local was previously built from this Dockerfile:

FROM tensorflow/tensorflow:1.12.0-py3

WORKDIR /app
# ADD creates ./model if it does not exist, so no separate mkdir is needed
ADD ./model_data ./model
COPY ./ /app
RUN pip install -r requirements.txt
RUN chmod 777 entrypoint.sh
ENTRYPOINT ["/app/entrypoint.sh"]
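
The image was presumably built with something like (the tag matches the run command above; the exact build context is an assumption):

docker build -t test_bert_local .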

Inside entrypoint.sh:

bert-serving-start -cpu -model_dir /app/model -num_worker=1 -ckpt_name=new_model.ckpt -http_port 8125
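
For completeness, the whole script is essentially just that line; a minimal version with a shebang and exec (so the server runs in the foreground as PID 1 and receives signals from docker stop) would be:

#!/bin/bash
# exec replaces the shell with the server process (PID 1)
exec bert-serving-start -cpu -model_dir /app/model -num_worker=1 \
    -ckpt_name=new_model.ckpt -http_port 8125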

The server shows:

               ARG   VALUE
__________________________________________________
           ckpt_name = new_model.ckpt
         config_name = bert_config.json
                cors = *
                 cpu = True
          device_map = []
                fp16 = False
 gpu_memory_fraction = 0.5
       graph_tmp_dir = None
    http_max_connect = 10
           http_port = 8125
        mask_cls_sep = False
      max_batch_size = 256
         max_seq_len = 25
           model_dir = /app/model
          num_worker = 1
       pooling_layer = [-2]
    pooling_strategy = REDUCE_MEAN
                port = 5555
            port_out = 5556
       prefetch_size = 10
 priority_batch_size = 16
show_tokens_to_client = False
     tuned_model_dir = None
             verbose = False
                 xla = False

I:VENTILATOR:[__i:__i: 66]:freeze, optimize and export graph, could take a while...
I:GRAPHOPT:[gra:opt: 52]:model config: /app/model/bert_config.json
I:GRAPHOPT:[gra:opt: 55]:checkpoint: /app/model/new_model.ckpt
I:GRAPHOPT:[gra:opt: 59]:build graph...
I:GRAPHOPT:[gra:opt:128]:load parameters from checkpoint...
I:GRAPHOPT:[gra:opt:132]:optimize...
I:GRAPHOPT:[gra:opt:140]:freeze...
I:GRAPHOPT:[gra:opt:145]:write graph to a tmp file: /tmp/tmp9fx4s9b3
I:VENTILATOR:[__i:__i: 74]:optimized graph is stored at: /tmp/tmp9fx4s9b3
I:VENTILATOR:[__i:_ru:106]:bind all sockets
I:VENTILATOR:[__i:_ru:110]:open 8 ventilator-worker sockets
I:VENTILATOR:[__i:_ru:113]:start the sink
I:VENTILATOR:[__i:_ge:188]:get devices
I:VENTILATOR:[__i:_ge:221]:device map: 
    worker  0 -> cpu
I:SINK:[__i:_ru:265]:ready

I:VENTILATOR:[__i:_ru:129]:start http proxy
I:WORKER-0:[__i:_ru:456]:use device cpu, load graph from /tmp/tmp9fx4s9b3
I:VENTILATOR:[__i:_ru:148]:new config request  req id: 1  client: b'3b38c252-1da0-4428-820e-0cd4f9c1d650'
I:SINK:[__i:_ru:315]:send config  client b'3b38c252-1da0-4428-820e-0cd4f9c1d650'
I:VENTILATOR:[__i:_ru:148]:new config request  req id: 1  client: b'49db4755-88aa-4732-8999-89a3a9956600'
I:SINK:[__i:_ru:315]:send config  client b'49db4755-88aa-4732-8999-89a3a9956600'
I:VENTILATOR:[__i:_ru:148]:new config request  req id: 1  client: b'145aaec8-fdd9-4f82-80c9-271538ee8a8c'
I:SINK:[__i:_ru:315]:send config  client b'145aaec8-fdd9-4f82-80c9-271538ee8a8c'
I:VENTILATOR:[__i:_ru:148]:new config request  req id: 1  client: b'7a73d1de-9534-4efe-8969-f84e7d8180d5'
I:SINK:[__i:_ru:315]:send config  client b'7a73d1de-9534-4efe-8969-f84e7d8180d5'
I:VENTILATOR:[__i:_ru:148]:new config request  req id: 1  client: b'dd2a8536-8de8-4c74-857b-30c307d3674d'
I:SINK:[__i:_ru:315]:send config  client b'dd2a8536-8de8-4c74-857b-30c307d3674d'
I:VENTILATOR:[__i:_ru:148]:new config request  req id: 1  client: b'b3721d53-f23b-43f3-a67a-31abdedfad90'
I:SINK:[__i:_ru:315]:send config  client b'b3721d53-f23b-43f3-a67a-31abdedfad90'
I:VENTILATOR:[__i:_ru:148]:new config request  req id: 1  client: b'6e848a1d-0e66-4f85-94b3-1e7c5b882a6a'
I:SINK:[__i:_ru:315]:send config  client b'6e848a1d-0e66-4f85-94b3-1e7c5b882a6a'
I:VENTILATOR:[__i:_ru:148]:new config request  req id: 1  client: b'146ad7cf-e69b-479d-89de-2d20a74dbd3a'
I:SINK:[__i:_ru:315]:send config  client b'146ad7cf-e69b-479d-89de-2d20a74dbd3a'
I:VENTILATOR:[__i:_ru:148]:new config request  req id: 1  client: b'422d9d5f-472c-408b-9686-131032406da9'
I:SINK:[__i:_ru:315]:send config  client b'422d9d5f-472c-408b-9686-131032406da9'
I:VENTILATOR:[__i:_ru:148]:new config request  req id: 1  client: b'35d42882-34ce-448c-a97f-23f2286ec8af'
I:SINK:[__i:_ru:315]:send config  client b'35d42882-34ce-448c-a97f-23f2286ec8af'
 * Serving Flask app "bert_serving.server.http" (lazy loading)
 * Environment: production
   WARNING: Do not use the development server in a production environment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://0.0.0.0:8125/ (Press CTRL+C to quit)

The server can respond to /status/server (via curl).
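
For reference, the exact query (host and port per the docker run mapping above):

curl http://127.0.0.1:8125/status/server

It returns: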

{
    "ckpt_name": "new_model.ckpt",
    "client": "493b4ede-4533-4088-aaa0-25b97a4a2114",
    "config_name": "bert_config.json",
    "cors": "*",
    "cpu": true,
    "device_map": [],
    "fp16": false,
    "gpu_memory_fraction": 0.5,
    "graph_tmp_dir": null,
    "http_max_connect": 10,
    "http_port": 8125,
    "mask_cls_sep": false,
    "max_batch_size": 256,
    "max_seq_len": 25,
    "model_dir": "/app/model",
    "num_concurrent_socket": 8,
    "num_process": 3,
    "num_worker": 1,
    "pooling_layer": [
        -2
    ],
    "pooling_strategy": 2,
    "port": 5555,
    "port_out": 5556,
    "prefetch_size": 10,
    "priority_batch_size": 16,
    "python_version": "3.5.2 (default, Nov 23 2017, 16:37:01) \n[GCC 5.4.0 20160609]",
    "pyzmq_version": "17.1.2",
    "server_current_time": "2019-02-15 16:15:54.553378",
    "server_start_time": "2019-02-15 16:15:04.162752",
    "server_version": "1.8.1",
    "show_tokens_to_client": false,
    "statistic": {
        "avg_request_per_client": 1.1,
        "max_request_per_client": 2,
        "min_request_per_client": 1,
        "num_active_client": 0,
        "num_data_request": 0,
        "num_max_request_per_client": 1,
        "num_min_request_per_client": 9,
        "num_sys_request": 11,
        "num_total_client": 10,
        "num_total_request": 11,
        "num_total_seq": 0
    },
    "status": 200,
    "tensorflow_version": [
        "1",
        "11",
        "0"
    ],
    "tuned_model_dir": null,
    "ventilator -> worker": [
        "ipc://tmpO0TghK/socket",
        "ipc://tmpRyg76o/socket",
        "ipc://tmpE5RYW3/socket",
        "ipc://tmplOK2MI/socket",
        "ipc://tmpeqF7Cn/socket",
        "ipc://tmpdWAdt2/socket",
        "ipc://tmpyWrkjH/socket",
        "ipc://tmpdXes9l/socket"
    ],
    "ventilator <-> sink": "ipc://tmprSCsr5/socket",
    "verbose": false,
    "worker -> sink": "ipc://tmpyaufsj/socket",
    "xla": false,
    "zmq_version": "4.2.5"
}

But calling /encode (via curl) hangs.
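
The call is roughly this (payload shape per the bert-as-service HTTP API docs; the text is just a placeholder):

curl -X POST http://127.0.0.1:8125/encode \
  -H 'Content-Type: application/json' \
  -d '{"id": 1, "texts": ["hello world"], "is_tokenized": false}'

All I see in the server log is: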

I:PROXY:[htt:enc: 47]:new request from 172.17.0.1
I:VENTILATOR:[__i:_ru:164]:new encode request  req id: 2  size: 1  client: b'35d42882-34ce-448c-a97f-23f2286ec8af'
I:SINK:[__i:_ru:312]:job register  size: 1  job id: b'35d42882-34ce-448c-a97f-23f2286ec8af#2'

and the server never responds. Locally everything works fine: after about a minute I see 'ready' from WORKER-0, and /encode responds normally.

I also noticed that during a local run, BERT actively consumes one CPU core before it prints 'ready'. In the Docker case, htop shows zero CPU consumption.
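
The same is visible from the host with docker stats (a generic check, nothing specific to this image):

docker stats    # the container sits at ~0% CPU while the worker is stuck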

Thank you for reading. I'll add any extra info if needed.

Kup3a commented 5 years ago

UPD: -prefetch_size 16 didn't help.

shanebrunette commented 5 years ago

I can report a similar issue locally with the CPU-only version of TensorFlow: tensorflow 1.13.1, bert-as-service 1.8.3.

INFO:tensorflow:Calling model_fn.
I:VENTILATOR:[__i:_ru:148]:new config request   req id: 1   client: b'f01b84db-4dd7-423e-b1bc-c9ab46a470df'
I:SINK:[__i:_ru:320]:send config    client b'f01b84db-4dd7-423e-b1bc-c9ab46a470df'
I:VENTILATOR:[__i:_ru:164]:new encode request   req id: 2   size: 3 client: b'f01b84db-4dd7-423e-b1bc-c9ab46a470df'
I:SINK:[__i:_ru:317]:job register   size: 3 job id: b'f01b84db-4dd7-423e-b1bc-c9ab46a470df#2'

Nothing seems to happen after this point.

Setting -max_batch_size 16 did not help either.

hanxiao commented 5 years ago

How about -prefetch_size 0?
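
That is, appended to the launch command from the original post (a sketch):

bert-serving-start -cpu -model_dir /app/model -num_worker=1 \
    -ckpt_name=new_model.ckpt -http_port 8125 -prefetch_size 0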

hanxiao commented 5 years ago

Please refer to https://github.com/hanxiao/bert-as-service/issues/254#issuecomment-473482746

Birendra20743592 commented 5 years ago

@Kup3a Have you found a solution yet? I am facing the same issue: the server never seems to start in Docker with an HTTP request, though locally it works fine.

@hanxiao do you have a fix for the CPU-only version?

Birendra20743592 commented 5 years ago

Using kaggle/python as the base image fixed the problem and should work for everyone. I'm not sure what the root cause was; it seems ZeroMQ was unable to bind its sockets.
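
For anyone trying the same workaround, a minimal sketch (only the base image changes relative to the original Dockerfile; the kaggle/python tag and its bundled TensorFlow are assumptions):

FROM kaggle/python

WORKDIR /app
ADD ./model_data ./model
COPY ./ /app
RUN pip install -r requirements.txt
RUN chmod 777 entrypoint.sh
ENTRYPOINT ["/app/entrypoint.sh"]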