jina-ai / clip-as-service

🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP
https://clip-as-service.jina.ai

Embedding is using only one worker and it is very slow #620

Open ntaherkhani opened 3 years ago

ntaherkhani commented 3 years ago

Hi, I am using bert-as-service 1.10. I start my BERT server as:

```python
from bert_serving.server import BertServer
from bert_serving.server.helper import get_args_parser

args = get_args_parser().parse_args([
    '-model_dir', str(model_path),
    '-num_worker', str(num_workers),
    '-port', str(in_port),
    '-port_out', str(out_port),
])
server = BertServer(args)
server.max_seq_len = 512
server.client_batch_size = 4096
server.num_client = 1
server.num_worker = 8
server.start()
```
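For what it is worth, attributes assigned to the `BertServer` object after construction may never reach the worker processes, since parts of the configuration can already be fixed from the parsed args when the server is built. A minimal sketch, assuming the standard server flags `-num_worker` and `-max_seq_len` (and `-max_batch_size`, which is an assumption), that passes everything through the argument list instead:

```python
from bert_serving.server import BertServer
from bert_serving.server.helper import get_args_parser

# All options go through the parsed args; -max_batch_size is an assumption here,
# check `get_args_parser().print_help()` for the exact flag names in your version.
args = get_args_parser().parse_args([
    '-model_dir', str(model_path),
    '-num_worker', '8',            # one worker process per GPU
    '-max_seq_len', '512',
    '-max_batch_size', '256',      # server-side batching per worker
    '-port', str(in_port),
    '-port_out', str(out_port),
])
server = BertServer(args)
server.start()
```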

I am sending a list of sentences to the BERT client for embedding:

```python
import nltk
import numpy as np
from bert_serving.client import BertClient

def create_bert_client_instance():
    return BertClient(check_length=False)

def embedding_sentences(doc, bert_client):
    # Default feature vector in case the document contains no sentences
    fv = np.zeros((1, 1024), dtype=float)
    sentences = nltk.tokenize.sent_tokenize(doc)
    if len(sentences) > 0:
        fv = bert_client.encode(sentences)
    return fv
```
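A single `BertClient` issues one synchronous request at a time, so looping over documents with one client can keep only one server worker busy. A minimal sketch, assuming one client per process (`NUM_CLIENTS` and the helper names are made up for illustration), that spreads documents over several processes so more server workers receive requests concurrently:

```python
from multiprocessing import Pool

import nltk
import numpy as np
from bert_serving.client import BertClient

NUM_CLIENTS = 8   # roughly one concurrent client per server worker (assumption)
_bc = None        # per-process BertClient, created once by the pool initializer

def _init_client():
    global _bc
    _bc = BertClient(check_length=False)

def embed_doc(doc):
    sentences = nltk.tokenize.sent_tokenize(doc)
    if not sentences:
        return np.zeros((1, 1024), dtype=float)
    return _bc.encode(sentences)

def embed_corpus(docs):
    # Each pool process holds its own client, so up to NUM_CLIENTS requests
    # are in flight at once instead of one.
    with Pool(NUM_CLIENTS, initializer=_init_client) as pool:
        return pool.map(embed_doc, docs)
```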

I am using 8 GPUs, but it is very slow, and when I check the Dask dashboard it shows that only one worker is active (see the attached screenshot).

What should I do to solve this issue, and what is wrong with my settings? Also, as you can see in the attached screenshot, each worker is using only about 1 GB of GPU memory, even though I have 94 GB of memory available on each GPU. How can I increase the memory usage of the workers?
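On the memory question: TensorFlow-based workers typically reserve only a fraction of each GPU by default, so low usage is not necessarily the bottleneck. A sketch, assuming your bert-as-service release exposes a `-gpu_memory_fraction` server flag (verify with `get_args_parser().print_help()`), that raises the per-worker allocation:

```python
from bert_serving.server import BertServer
from bert_serving.server.helper import get_args_parser

# -gpu_memory_fraction is an assumption; confirm the flag exists in your version.
args = get_args_parser().parse_args([
    '-model_dir', str(model_path),
    '-num_worker', '8',
    '-gpu_memory_fraction', '0.9',  # let each worker claim up to 90% of its GPU
    '-port', str(in_port),
    '-port_out', str(out_port),
])
BertServer(args).start()
```

Note that a larger memory fraction by itself does not speed up encoding; throughput mostly depends on batch size and on how many workers actually receive requests.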

Thanks

ntaherkhani commented 3 years ago

Any clue, @hanxiao @jacobdevlin-google @abhishekraok @GabrielBianconi @cbockman @Jhangsy @DmitryKey?

DmitryKey commented 3 years ago

In 1.10.0 there is no num_client option. On your dashboard screenshot I can see 8 processes, though?
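To see exactly which options the installed release accepts (and to confirm that num_client is not among them), the server's argument parser can print its own help:

```python
from bert_serving.server.helper import get_args_parser

# Lists every server-side flag supported by the installed bert-as-service version.
get_args_parser().print_help()
```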