duongkstn opened this issue 1 year ago
Hi @duongkstn, your code calls the server sequentially, which means at most one container is processing a request at any time.
You need to create a thread or process pool to call `word_segment` concurrently, using `threading` or `multiprocessing`.
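As a minimal sketch of the thread-pool idea, using `concurrent.futures` from the standard library (the `annotate` function below is a hypothetical stand-in for the real client call, e.g. `annotator.word_segment(text)`):

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the real client call, e.g.
# annotator.word_segment(text); replace with your actual annotator.
def annotate(text):
    return text.split()  # placeholder so the sketch is runnable

texts = ['hôm nay tôi đi học'] * 100

# Each worker thread sends its requests independently, so several
# containers behind the load balancer can work at the same time.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(annotate, texts))

print(len(results))  # one result per input text
```

Since the annotator calls are network I/O, threads release the GIL while waiting, so a thread pool is usually enough for this kind of client.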
Also, you might want to monitor the CPU usage of each container with `docker stats`.
May I ask you a question? Why did you use Nginx here? What are its advantages?
Because you can only bind one container to a given host port. So if you removed Nginx and bound vncorenlp directly to port 8000, you would get an error when scaling, since that port is already allocated. (link)
With Nginx, you only need to bind it to port 8000, and it balances the load across all vncorenlp containers, which then don't have to bind to any host port.
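A minimal sketch of what that configuration might look like (the service name `vncorenlp`, the ports, and the upstream name are assumptions; inside the Compose network, Docker's DNS resolves the service name to every scaled replica):

```nginx
# Hypothetical nginx.conf fragment: nginx publishes the only host
# port and round-robins requests to the scaled containers.
upstream vncorenlp_backend {
    # "vncorenlp" is the Compose service name; none of the replicas
    # publishes a host port itself.
    server vncorenlp:8000;
}

server {
    listen 8000;
    location / {
        proxy_pass http://vncorenlp_backend;
    }
}
```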
@duydvu I tested again, but there are some problems. Here are my 4 methods:
1. `py_vncorenlp`:

```python
import py_vncorenlp
import time

annotator = py_vncorenlp.VnCoreNLP(annotators=["wseg"], save_dir="..../VnCoreNLP")
times = 1000
kq = []
start = time.time()
for i in range(times):
    kq.append(annotator.word_segment(texts[i]))
end = time.time()
```
2. Sequentially call your docker service (`sudo docker compose up --scale vncorenlp=4`):
```python
from vncorenlp import VnCoreNLP
import time

annotator = VnCoreNLP('http://localhost', 8000)
times = 1000
kq = []
start = time.time()
for i in range(times):
    kq.append(annotator.tokenize(texts[i]))
end = time.time()
```
3. Concurrently call your docker service (`sudo docker compose up --scale vncorenlp=4`):
```python
from vncorenlp import VnCoreNLP
from threading import Thread
import time

annotator = VnCoreNLP('http://localhost', 8000)
times = 1000
threads = [None] * 4  # since your --scale param is 4, I choose 4 threads
kq = [None] * times

def call_docker_service(list_texts, result, indices):
    for j, index in enumerate(indices):
        result[index] = annotator.tokenize(list_texts[j])

batch_size = 1000 // 4  # 250 samples per thread
start = time.time()
for i in range(4):
    _start = i * batch_size
    _end = (i + 1) * batch_size
    threads[i] = Thread(target=call_docker_service,
                        args=(texts[_start:_end], kq, list(range(_start, _end))))
    threads[i].start()
for i in range(len(threads)):
    threads[i].join()
end = time.time()
```
4. `VnCoreNLP` with `threading` (instead of using your docker service, I started the server with `vncorenlp -Xmx2g .../VnCoreNLP -p 8012 -a "wseg"`; 8012 is just a random port number). The client code is the same as in method 3, but with `annotator = VnCoreNLP('http://localhost', 8012)`.
And here are my results (`end - start`):
1. solution 1 took 1.5912044048309326 seconds
2. solution 2 took 9.347352981567383 seconds
3. solution 3 took 3.405658006668091 seconds
4. solution 4 took 2.7315216064453125 seconds
- Like you said, "concurrently" is better than "sequentially" (3 beats 2). I agree!
- 1 is always the best solution (even though it is sequential).
- Sometimes 3 is better than 4, and sometimes 4 is better than 3, so I don't know exactly what improvement your code (method 3) offers over concurrently calling `VnCoreNLP` directly (method 4).

Please let me know your thoughts; maybe I am doing something wrong. Are there any faster solutions? Maybe my `threading` code is wrong?
Thanks
It seems that `multiprocessing` is better than `threading`. I tested it myself and saw that using many threads causes a bottleneck, since Python has the GIL.
Here is my code:
```python
from vncorenlp import VnCoreNLP
from multiprocessing import Pool
import time

annotator = VnCoreNLP('http://localhost', 8000)

n = 10
batch_size = 10000 // n

def call_docker_service(i):
    for _ in range(batch_size):
        annotator.tokenize('hôm nay tôi đi học')

start = time.time()
with Pool(16) as pool:
    pool.map(call_docker_service, range(n))
end = time.time()
print(end - start)
```
In my setup, using `multiprocessing` is 3 times faster than using `threading`.
Hi, your code above ran successfully, but with `batch_size = 100000 // n` or `batch_size = 1000000 // n` (more zeros), I got the following error: `('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))`. Is this a bug in nginx or in VnCoreNLP?
Have you ever met this error? Please let me know how to fix it. Thanks!
@duongkstn This is expected when you send too many requests to the server. You will encounter this error more often as the number of processes increases.
Simply try/except this error and retry. But if the error occurs too many times, it indicates that you have reached the limit of the server, so try increasing the number of containers or decreasing the number of processes.
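A simple retry wrapper could be sketched like this (the retry count, backoff delay, and the `flaky` demo function are assumptions; with the real client you would catch whatever connection exception your HTTP library raises, which may differ from the builtin `ConnectionError`):

```python
import time

def with_retry(func, *args, retries=3, delay=0.1):
    """Call func(*args); on a connection error, back off and retry.

    After `retries` failed attempts, re-raise the last error so the
    caller knows the server has likely reached its limit.
    """
    last_error = None
    for attempt in range(retries):
        try:
            return func(*args)
        except ConnectionError as e:  # e.g. 'Connection reset by peer'
            last_error = e
            time.sleep(delay * (attempt + 1))  # simple linear backoff
    raise last_error

# Hypothetical flaky call standing in for annotator.tokenize: it
# fails twice with a connection reset, then succeeds.
calls = {'n': 0}
def flaky(text):
    calls['n'] += 1
    if calls['n'] < 3:
        raise ConnectionError('Connection reset by peer')
    return text.split()

result = with_retry(flaky, 'hôm nay tôi đi học')
print(result)
```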
Hi, I ran your code with 1000 samples from a list of texts (`texts`) and compared the total time with the `py_vncorenlp` code. The result I got is that your code is much slower than the `py_vncorenlp` version. Am I doing something wrong, or is my testing approach flawed? Please let me know your solution.