jina-ai / clip-as-service

🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP
https://clip-as-service.jina.ai

Docker image: show_tokens_to_client not being honored #495

Open michael-newsrx opened 4 years ago

michael-newsrx commented 4 years ago

[x] Are you running the latest bert-as-service?
[x] Did you follow the installation and the usage instructions in README.md?
[x] Did you check the FAQ list in README.md?
[x] Did you perform a cursory search on existing issues?

System information

Docker with a custom entrypoint:

#!/bin/sh
bert-serving-start -model_dir /model -max_seq_len NONE -show_tokens_to_client -http_port 8080 -num_worker=$1
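
As a sanity check (a sketch of mine, not from the issue, using the get_args_parser helper that bert-as-service documents for programmatic startup), the same flags can be run through the server's own argument parser to confirm that -show_tokens_to_client parses to True:

from bert_serving.server.helper import get_args_parser

# parse the same CLI flags the entrypoint passes; 'NONE' lifts the
# sequence-length cap, and num_worker mirrors the $1 entrypoint argument
args = get_args_parser().parse_args([
    '-model_dir', '/model',
    '-max_seq_len', 'NONE',
    '-show_tokens_to_client',
    '-http_port', '8080',
    '-num_worker', '1',
])
print(args.show_tokens_to_client)  # expected: True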

Description

The server is started with the entrypoint command above, then called via an HTTP JSON POST to a patched http.py:

curl -X POST http://192.168.1.129:8080/encode \
  -H 'content-type: application/json' \
  -d '{"id": 123, "texts": ["hello world"], "is_tokenized": false}'

http.py is patched as follows:

# return token info alongside the embeddings
return {'patched': True,
        'id': data['id'],
        'result': bc.encode(data['texts'],
                            show_tokens=True,
                            is_tokenized=bool(data['is_tokenized']) if 'is_tokenized' in data else False)}
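
To take the HTTP proxy out of the loop, a direct ZeroMQ call with the client library should behave the same way (a minimal sketch; the IP is the one from the issue, and the default client ports 5555/5556 are an assumption):

from bert_serving.client import BertClient

# connect straight to the ZMQ ports rather than going through the HTTP proxy
bc = BertClient(ip='192.168.1.129')
result = bc.encode(['hello world'], show_tokens=True, is_tokenized=False)
# if the server honors -show_tokens_to_client, `result` is an
# (embeddings, tokens) tuple; otherwise only the embeddings come back
print(result)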

The issue then shows up: no tokenization info in the response. The log file shows:

/usr/local/lib/python3.5/dist-packages/bert_serving/client/__init__.py:308: UserWarning: "show_tokens=True", but the server does not support showing tokenization info to clients.
here is what you can do:
- start a new server with "bert-serving-start -show_tokens_to_client ..."
- or, use "encode(show_tokens=False)"
  warnings.warn('"show_tokens=True", but the server does not support showing tokenization info to clients.\n'
192.168.1.49 - - [19/Dec/2019 19:38:22] "POST /encode HTTP/1.1" 200 -
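
The warning above is raised client-side, so it may be worth checking what the client itself reads back as the server configuration (a sketch, assuming BertClient's server_status property; the IP is the one from the issue):

from bert_serving.client import BertClient

bc = BertClient(ip='192.168.1.129')
# the client fetches this dict from the server over ZMQ; the warning
# fires when show_tokens_to_client is missing or false in its view
print(bc.server_status.get('show_tokens_to_client'))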

However, http://192.168.1.129:8080/status/server shows:

"show_tokens_to_client":true

...
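
For completeness, the same check can be scripted against the HTTP status endpoint (a sketch using the requests library; the URL is the one from the issue):

import requests

# query the proxy's status endpoint, same as opening it in a browser
status = requests.get('http://192.168.1.129:8080/status/server').json()
print(status['show_tokens_to_client'])  # prints: True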

michael-newsrx commented 4 years ago

Full output from: http://192.168.1.129:8080/status/server

{
  "ckpt_name": "bert_model.ckpt",
  "client": "cd350caa-a83e-46c5-82e4-cbe06f9f3a7a",
  "config_name": "bert_config.json",
  "cors": "*",
  "cpu": false,
  "device_map": [],
  "do_lower_case": true,
  "fixed_embed_length": false,
  "fp16": false,
  "gpu_memory_fraction": 0.5,
  "graph_tmp_dir": null,
  "http_max_connect": 10,
  "http_port": 8080,
  "mask_cls_sep": false,
  "max_batch_size": 256,
  "max_seq_len": null,
  "model_dir": "/model",
  "no_position_embeddings": false,
  "no_special_token": false,
  "num_concurrent_socket": 8,
  "num_process": 3,
  "num_worker": 1,
  "pooling_layer": [
    -2
  ],
  "pooling_strategy": 2,
  "port": 5555,
  "port_out": 5556,
  "prefetch_size": 10,
  "priority_batch_size": 16,
  "python_version": "3.5.2 (default, Nov 23 2017, 16:37:01) \n[GCC 5.4.0 20160609]",
  "pyzmq_version": "17.1.2",
  "server_current_time": "2019-12-19 19:40:43.457178",
  "server_start_time": "2019-12-19 19:37:52.060314",
  "server_version": "1.9.9",
  "show_tokens_to_client": true,
  "statistic": {
    "avg_last_two_interval": 16.154498574964236,
    "avg_request_per_client": 2,
    "avg_request_per_second": 0.06190226179782333,
    "avg_size_per_request": 1,
    "max_last_two_interval": 16.154498574964236,
    "max_request_per_client": 2,
    "max_request_per_second": 0.06190226179782333,
    "max_size_per_request": 1,
    "min_last_two_interval": 16.154498574964236,
    "min_request_per_client": 2,
    "min_request_per_second": 0.06190226179782333,
    "min_size_per_request": 1,
    "num_active_client": 1,
    "num_data_request": 1,
    "num_max_last_two_interval": 1,
    "num_max_request_per_client": 1,
    "num_max_request_per_second": 1,
    "num_max_size_per_request": 1,
    "num_min_last_two_interval": 1,
    "num_min_request_per_client": 1,
    "num_min_request_per_second": 1,
    "num_min_size_per_request": 1,
    "num_sys_request": 1,
    "num_total_client": 1,
    "num_total_request": 2,
    "num_total_seq": 1
  },
  "status": 200,
  "tensorflow_version": [
    "1",
    "12",
    "0"
  ],
  "tuned_model_dir": null,
  "ventilator -> worker": [
    "ipc://tmpQH1gxr/socket",
    "ipc://tmpI4oRMI/socket",
    "ipc://tmpCAYt2Z/socket",
    "ipc://tmpqZx8hh/socket",
    "ipc://tmpgp9Oxy/socket",
    "ipc://tmpqRQxNP/socket",
    "ipc://tmpuIEi36/socket",
    "ipc://tmpM4Kgjo/socket"
  ],
  "ventilator <-> sink": "ipc://tmp05TIha/socket",
  "verbose": false,
  "worker -> sink": "ipc://tmppA4sds/socket",
  "xla": false,
  "zmq_version": "4.2.5"
}
michael-newsrx commented 4 years ago

A ps listing from inside the container clearly shows "-show_tokens_to_client" as a CLI option:

/opt/docker$ docker exec -it bert-as-service /bin/bash
root@2c6ec6089cde:/app# ps axww      
  PID TTY      STAT   TIME COMMAND
    1 pts/0    Ss+    0:00 /bin/sh /app/entrypoint.sh 1
    6 pts/0    Sl+    0:05 /usr/bin/python3 /usr/local/bin/bert-serving-start -model_dir /model -max_seq_len NONE -show_tokens_to_client -http_port 8080 -num_worker=1