google-research / bert

TensorFlow code and pre-trained models for BERT
https://arxiv.org/abs/1810.04805
Apache License 2.0

Bert server connection problem #528

Open ali-nazir opened 5 years ago

ali-nazir commented 5 years ago

I am a newbie studying BERT. I tried to run the example from https://bert-as-service.readthedocs.io/en/latest/section/get-start.html, but whenever I run the command, the server starts, reports "ready and listening!", and then nothing else appears. I wanted to convert my text into vectors as shown in the demo (https://raw.githubusercontent.com/hanxiao/bert-as-service/master/.github/demo.gif), but I can type something and nothing happens.

I start the server for the pre-trained multilingual cased model on Ubuntu 18.04 with:

```
python3 /home/ali/.local/bin/bert-serving-start -model_dir ./multi_cased_L-12_H-768_A-12/ -num_worker=4
```

System: HP Omen 17, Intel Core i7-8750H, NVIDIA GTX 1050 (4 GB).

Whenever I run the command, this is what happens (screenshot from 2019-03-21 21-19-05, transcribed below).

At the end of the screenshot, I ran the example code from the demo GIF linked above:

```python
from bert_serving.client import BertClient
bc = BertClient()
bc.encode(['First do it', 'then do it right', 'then do it better'])
```

Rephrasing the question: I use this command to start the BERT server:

```
ali@Omen:~/Desktop/Work$ python3 /home/ali/.local/bin/bert-serving-start -model_dir ./multi_cased_L-12_H-768_A-12/ -num_worker=4
```

and I get this output (as shown in the screenshot):

```
usage: /home/ali/.local/bin/bert-serving-start -model_dir ./multi_cased_L-12_H-768_A-12/ -num_worker=4
                  ARG   VALUE
            ckpt_name = bert_model.ckpt
          config_name = bert_config.json
                 cors = *
                  cpu = False
           device_map = []
   fixed_embed_length = False
                 fp16 = False
  gpu_memory_fraction = 0.5
        graph_tmp_dir = None
     http_max_connect = 10
            http_port = None
         mask_cls_sep = False
       max_batch_size = 256
          max_seq_len = 25
            model_dir = ./multi_cased_L-12_H-768_A-12/
           num_worker = 4
        pooling_layer = [-2]
     pooling_strategy = REDUCE_MEAN
                 port = 5555
             port_out = 5556
        prefetch_size = 10
  priority_batch_size = 16
show_tokens_to_client = False
      tuned_model_dir = None
              verbose = False
                  xla = False
```

```
I:VENTILATOR:[__i:__i: 66]:freeze, optimize and export graph, could take a while...
I:GRAPHOPT:[gra:opt: 52]:model config: ./multi_cased_L-12_H-768_A-12/bert_config.json
I:GRAPHOPT:[gra:opt: 55]:checkpoint: ./multi_cased_L-12_H-768_A-12/bert_model.ckpt
I:GRAPHOPT:[gra:opt: 59]:build graph...
WARNING: The TensorFlow contrib module will not be included in TensorFlow 2.0. For more information, please see:
I:GRAPHOPT:[gra:opt:128]:load parameters from checkpoint...
I:GRAPHOPT:[gra:opt:132]:optimize...
I:GRAPHOPT:[gra:opt:140]:freeze...
I:GRAPHOPT:[gra:opt:145]:write graph to a tmp file: /tmp/tmp74_auga3
I:VENTILATOR:[__i:__i: 74]:optimized graph is stored at: /tmp/tmp74_auga3
I:VENTILATOR:[__i:_ru:118]:bind all sockets
I:VENTILATOR:[__i:_ru:122]:open 8 ventilator-worker sockets
I:VENTILATOR:[__i:_ru:125]:start the sink
I:SINK:[__i:_ru:284]:ready
I:VENTILATOR:[__i:_ge:202]:get devices
W:VENTILATOR:[__i:_ge:217]:only 1 out of 1 GPU(s) is available/free, but "-num_worker=4"
W:VENTILATOR:[__i:_ge:219]:multiple workers will be allocated to one GPU, may not scale well and may raise out-of-memory
I:VENTILATOR:[__i:_ge:235]:device map:
    worker 0 -> gpu 0
    worker 1 -> gpu 0
    worker 2 -> gpu 0
    worker 3 -> gpu 0
I:WORKER-3:[__i:_ru:492]:use device gpu: 0, load graph from /tmp/tmp74_auga3
I:WORKER-0:[__i:_ru:492]:use device gpu: 0, load graph from /tmp/tmp74_auga3
I:WORKER-2:[__i:_ru:492]:use device gpu: 0, load graph from /tmp/tmp74_auga3
I:WORKER-1:[__i:_ru:492]:use device gpu: 0, load graph from /tmp/tmp74_auga3
I:WORKER-2:[__i:gen:520]:ready and listening!
I:WORKER-1:[__i:gen:520]:ready and listening!
I:WORKER-3:[__i:gen:520]:ready and listening!
I:WORKER-0:[__i:gen:520]:ready and listening!
```

lucas0 commented 5 years ago

First, this is a different repo: bert-as-service is developed by someone else, so you should post this issue on its tracker instead.

Regardless of that, it seems that you are trying to make client requests on the server side.

You can use the same machine for both server and client, but of course, not the same process.

Open another terminal window and use the BERT client there to make the requests.
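For illustration, here is a minimal sketch of what that second terminal could run, assuming the server above is still up on its default ports (5555 in, 5556 out, as shown in the config dump); the script name is arbitrary:

```python
# client_demo.py (hypothetical name) -- run with `python3 client_demo.py`
# in a second terminal, NOT the one where bert-serving-start is running.
from bert_serving.client import BertClient

# Defaults written out explicitly; they match the server output above.
bc = BertClient(ip='localhost', port=5555, port_out=5556)

vecs = bc.encode(['First do it', 'then do it right', 'then do it better'])

# encode() returns a NumPy array; in a script, print it to see anything.
print(vecs.shape)  # (3, 768) for this L-12_H-768_A-12 model
```

Note that the demo GIF uses an interactive Python session, which echoes the returned array automatically; in a plain script nothing is shown unless the result is printed explicitly.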