import numpy as np

all_tokens = []
for msg in message:
    # clean each token; drop tokens that come back empty
    msg_tokens = []
    for t in msg.get("tokens", []):
        text = self._replace_number_blank(t.text)
        if text != '':
            msg_tokens.append(text)
    # flatten the cleaned tokens into one string, then split each
    # message into single characters before encoding
    all_tokens.append(list(''.join(msg_tokens)))
    # all_tokens.append(''.join(msg_tokens))
logger.info("bert vectors featurizer finished")
try:
    # one blocking round-trip to the bert-as-service server
    bert_embedding = self.bc.encode(all_tokens, is_tokenized=True)
    bert_embedding = np.squeeze(bert_embedding)
except Exception:
    # the except clause is cut off in the issue; log and re-raise as a placeholder
    logger.exception("bert encoding failed")
    raise
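For context, self.bc above is presumably a bert-as-service BertClient; its construction is not shown in the issue. A minimal stand-alone sketch of the same call, where the host and ports are placeholders rather than the reporter's real settings:

from bert_serving.client import BertClient

# hypothetical client setup; ip, port, and port_out are placeholders
bc = BertClient(ip='localhost', port=5555, port_out=5556)

# pre-tokenized input: one list of tokens (here, single characters) per message
vecs = bc.encode([['今', '天', '天', '气', '好']], is_tokenized=True)
# with the server's default REDUCE_MEAN pooling, vecs has shape (num_messages, 768)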
Prerequisites

- Are you running the latest bert-as-service?
- Did you follow the installation and the usage instructions in README.md?
- Did you check the FAQ list in README.md?

System information

- bert-as-service version: 1.9.1

Description
I'm starting the server with the bert-serving-start command and calling it from the featurizer code shown at the top of this issue.
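For reference, bert-as-service also documents starting the server from Python through its BertServer API; the model path and worker count below are placeholders, not the reporter's actual flags:

from bert_serving.server import BertServer
from bert_serving.server.helper import get_args_parser

# placeholder model path and worker count
args = get_args_parser().parse_args([
    '-model_dir', '/path/to/chinese_L-12_H-768_A-12',
    '-num_worker', '2',
])
server = BertServer(args)
server.start()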
Then this issue shows up: I want to improve concurrency in the production environment, where each user sends one sentence per request. I load-tested with JMeter: at about 10 requests per second the service keeps up, but when the load goes up to 20 per second, this call:

bert_embedding = self.bc.encode(all_tokens, is_tokenized=True)

blocks and takes a long time.
How could I raise the throughput to 50 requests per second? Should I set the -http_max_connect 50 parameter in the server-side config?

Thanks,
weizhen
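For what it's worth, the throughput levers that bert-as-service itself documents are the server's -num_worker flag and the client-side ConcurrentBertClient; -http_max_connect only raises the connection cap of the optional HTTP proxy, and since the featurizer above uses the native BertClient it likely would not help here. A minimal sketch, assuming a recent bert-serving-client release; the ports and pool size are placeholders:

from bert_serving.client import ConcurrentBertClient

# Server side, for reference (the worker count is a placeholder):
#   bert-serving-start -model_dir /path/to/model -num_worker=4
# Each worker handles one encode() request at a time, so more workers
# mean more requests served in parallel.

# ConcurrentBertClient keeps a pool of BertClient connections so that
# requests coming from many threads do not serialize on a single socket.
# max_concurrency and the ports are assumptions, not tested settings.
bc = ConcurrentBertClient(max_concurrency=10, port=5555, port_out=5556)

def embed(token_lists):
    # token_lists: one list of tokens per message, e.g. [['今', '天', '好']]
    return bc.encode(token_lists, is_tokenized=True)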
...