Closed zanjabil2502 closed 1 year ago
Hi @zanjabil2502, could you provide more information about your setup?
How are you running your single-process and multi-process experiments? Is it `diart.benchmark`, or have you implemented your own script?
The slowdown could be coming from sharing the GPU between the two processes.
I found the solution. Before, I was using a single `OnlineSpeakerDiarization` pipeline shared across two or more processes. Now I load `OnlineSpeakerDiarization` inside every thread/process instead, and with this schema the processing no longer slows down.
```python
from threading import Thread

from diart import OnlineSpeakerDiarization
from diart.inference import RealTimeInference
from diart.sources import FileAudioSource

def diarization(pathaudio):
    # Each thread builds its own pipeline instead of sharing one instance
    model_pipeline = OnlineSpeakerDiarization()
    audio = FileAudioSource(pathaudio, model_pipeline.config.sample_rate)
    inference = RealTimeInference(model_pipeline, audio, show_progress=False, do_plot=False)
    prediction = inference()

for c in client:
    process = Thread(target=diarization, args=(pathaudio,))
    process.start()
```
This is the new schema I'm using.
Sorry, I have a problem again. When I use the new schema above with 5 threads it's safe and the processing stays fast, but with 10 threads the processing becomes much slower. The results look like this:
When 5 threads:

```
Took 0.667 (+/-0.186) seconds/chunk -- ran 60 times
Took 0.671 (+/-0.181) seconds/chunk -- ran 60 times
Took 0.653 (+/-0.179) seconds/chunk -- ran 60 times
Took 0.623 (+/-0.221) seconds/chunk -- ran 60 times
Took 0.646 (+/-0.264) seconds/chunk -- ran 60 times
```
When 10 threads:

```
Took 1.174 (+/-0.439) seconds/chunk -- ran 60 times
Took 1.173 (+/-0.380) seconds/chunk -- ran 60 times
Took 1.216 (+/-0.397) seconds/chunk -- ran 60 times
Took 1.197 (+/-0.322) seconds/chunk -- ran 60 times
Took 1.245 (+/-0.306) seconds/chunk -- ran 60 times
Took 1.217 (+/-0.316) seconds/chunk -- ran 60 times
Took 1.206 (+/-0.318) seconds/chunk -- ran 60 times
Took 1.177 (+/-0.363) seconds/chunk -- ran 60 times
Took 1.257 (+/-0.399) seconds/chunk -- ran 60 times
Took 1.191 (+/-0.395) seconds/chunk -- ran 60 times
```
I use a 5-second step and 300 seconds of audio for testing. What is the solution in this case? Should I load the models on more than 2 GPUs?
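What I mean by spreading the models across GPUs is something like the sketch below: pin each worker process to one device via `CUDA_VISIBLE_DEVICES` before any CUDA context is created, assigning devices round-robin. The file names, `num_gpus` value, and the `assign_gpu`/`diarize_on_gpu` helpers are hypothetical; the real worker body would build `OnlineSpeakerDiarization` and run `RealTimeInference` (omitted here so the sketch stands alone without diart installed).

```python
import os
from multiprocessing import Process, Queue

def assign_gpu(worker_index, num_gpus):
    # Round-robin device assignment (assumes the GPUs are interchangeable)
    return worker_index % num_gpus

def diarize_on_gpu(gpu_id, pathaudio, results):
    # Pin this worker to a single GPU before any CUDA library initializes,
    # so the process sees exactly one device as cuda:0.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    # ... build OnlineSpeakerDiarization and run RealTimeInference here ...
    results.put((pathaudio, gpu_id))

if __name__ == "__main__":
    files = ["call1.wav", "call2.wav", "call3.wav", "call4.wav"]  # hypothetical paths
    num_gpus = 2  # adjust to the machine
    results = Queue()
    procs = [
        Process(target=diarize_on_gpu, args=(assign_gpu(i, num_gpus), f, results))
        for i, f in enumerate(files)
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(sorted(results.get() for _ in files))
```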
@zanjabil2502 when you run more systems in multithreading, your threads may start competing for the same resources (e.g. RAM/VRAM, because you instantiate new models in each thread). Python's GIL may also be interfering there.
I suggest you try multiprocessing instead. This is what I found to be most effective, and it's the way I implemented the parallelization of `Benchmark`. This may remove the issues with the GIL, but not with competing resources, though.
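The multiprocessing suggestion can be sketched as follows. The `diarize_file` body here is a CPU-bound stand-in for building the pipeline and running `RealTimeInference` inside each worker (diart itself is not assumed installed); the point is that each `Pool` worker is a separate interpreter, so the GIL no longer serializes the Python-side work.

```python
from multiprocessing import Pool

def diarize_file(pathaudio):
    # Stand-in for: pipeline = OnlineSpeakerDiarization(); RealTimeInference(...)()
    # Each worker process would load its own copy of the models.
    total = sum(i * i for i in range(100_000))  # CPU-bound placeholder work
    return pathaudio, total

if __name__ == "__main__":
    files = [f"client_{i}.wav" for i in range(5)]  # hypothetical inputs
    with Pool(processes=len(files)) as pool:
        for path, _ in pool.map(diarize_file, files):
            print("done:", path)
```

Note that each process still loads its own models, so VRAM contention remains; only the GIL bottleneck goes away.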
I have a 300-second audio file. When I use a single process, the result is:

```
Took 0.183 (+/-0.023) seconds/chunk -- ran 59 times
```

with a step of 5 seconds. But with two concurrent processes, the result is:

```
Took 3.367 (+/-2.031) seconds/chunk -- ran 59 times
Took 3.627 (+/-1.946) seconds/chunk -- ran 59 times
```

with the same configuration. The program runs on a GPU. Why does the processing slow down?