Open kf89 opened 6 years ago
Do you know why my code runs faster? Let me explain. Here's my code:
threads = []
for i in start_stop_line_list:
    # t = StoppableThread(target=worker_import_to_es_for_threading, args=(data, i['start'], i['stop']))
    t = threading.Thread(
        target=worker_import_to_es_for_threading,
        args=(data, i['start'], i['stop'], Elasticsearch([bulk], verify_certs=True), index, doc_type),
    )
    threads.append(t)
    t.start()
    t.join()
As you can see, I create a new threading.Thread object in every iteration of the loop. These t objects are not tied to one another, so in each iteration t.start() tells that single thread it is ready to run while the program moves on to t.join(). That single t thread therefore starts regardless of whether any other threads have been created or started. The thread then runs in memory, and on the next iteration the name t is bound to a new thread, which is spawned without being affected by the previous t thread.
I had already considered your assumption, so the code was written like this. By the way, I used import threading, not import thread.
Hope this helps you understand it better.
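To make the behaviour of calling start() and join() in the same iteration easier to see, here is a minimal sketch. It is not the import code from this thread: `worker` is a hypothetical stand-in that only sleeps, and the `log` list and `lock` are assumptions added to record the order of events. Since join() blocks until the thread it is called on has finished, each thread in this sketch should end before the next one starts.

```python
import threading
import time


def worker(name, log, lock):
    # Hypothetical stand-in for worker_import_to_es_for_threading:
    # it records when it starts and ends, and sleeps in between.
    with lock:
        log.append(("start", name))
    time.sleep(0.05)
    with lock:
        log.append(("end", name))


def run_join_inside_loop(n):
    log, lock = [], threading.Lock()
    for i in range(n):
        t = threading.Thread(target=worker, args=(i, log, lock))
        t.start()
        t.join()  # blocks here until this thread has finished
    return log


log = run_join_inside_loop(3)
print(log)
```

In this sketch every "start" entry is followed directly by the matching "end" entry, so only one worker thread is alive at any time.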
From your code I can see that you call thread.join() right after thread.start(), inside the same loop. But from here you can see that the idiomatic way is to call thread.join() in a separate loop. Starting all the threads first and then running
for t in ts: t.join()
is generally the idiomatic way to start a small number of threads. Calling .join() means that your main thread waits until the given thread finishes before continuing. You generally do this after you've started all of the threads.
I have tried both ways, and strangely the program that uses the idiomatic thread.join() takes more execution time than yours. It may be better in memory consumption, though, which I haven't tested yet.
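For comparison, the two patterns discussed above can be timed side by side. This is only a rough sketch under stated assumptions: `worker` is a hypothetical function that sleeps to imitate I/O-bound work, and the function names `join_inside_loop` and `join_after_loop` are made up for illustration. It does not talk to Elasticsearch, so it cannot reproduce the slowdown reported above, and real bulk imports may behave differently.

```python
import threading
import time


def worker(delay):
    # Hypothetical stand-in for I/O-bound work such as a bulk request.
    time.sleep(delay)


def join_inside_loop(n, delay):
    # start() and join() in the same iteration: one thread at a time.
    begin = time.perf_counter()
    for _ in range(n):
        t = threading.Thread(target=worker, args=(delay,))
        t.start()
        t.join()
    return time.perf_counter() - begin


def join_after_loop(n, delay):
    # Start every thread first, then wait for all of them.
    begin = time.perf_counter()
    threads = [threading.Thread(target=worker, args=(delay,)) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - begin


sequential = join_inside_loop(4, 0.1)
concurrent = join_after_loop(4, 0.1)
print(f"join inside loop: {sequential:.2f}s")
print(f"join after loop:  {concurrent:.2f}s")
```

With sleep-only workers, the second pattern should finish in roughly the time of one sleep, while the first takes about four times as long. If a real workload shows the opposite, the cause would have to be measured in that workload; this sketch does not explain it.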