Closed zjd1988 closed 10 months ago
Hi @zjd1988 ,
You are right, there is indeed a TODO comment a couple of lines above that one, precisely about improving it. Ideally stages should be as small as possible, so the CPU can be taken by a different one.
That said, while writing that part I considered using rayon for that loop. However, what I still need to figure out is that CPUs are handed to other threads when the thread holding one blocks, but in this case there probably won't be a block, just a task that takes long. Also, the FPS will still be limited by the slowest stage, since there is a frame there that must go out. So, correct me if I am wrong, but I think even if you use threads there, if stages take too long the only solution is to improve those stages themselves or add more resources to the device.
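To make the throughput argument concrete, here is a minimal sketch of the arithmetic, assuming fixed per-stage durations. The function names are hypothetical and are not part of Pipeless.

```rust
use std::time::Duration;

/// FPS when all stages of a frame run back to back in one task:
/// each frame costs the sum of the stage durations.
/// (An empty or all-zero stage list yields infinity.)
fn sequential_fps(stages: &[Duration]) -> f64 {
    let total: Duration = stages.iter().sum();
    1.0 / total.as_secs_f64()
}

/// Upper bound on FPS when each stage runs on its own thread:
/// the slowest stage decides how often a frame can leave the pipeline.
fn pipelined_fps(stages: &[Duration]) -> f64 {
    let slowest = stages.iter().max().copied().unwrap_or(Duration::ZERO);
    1.0 / slowest.as_secs_f64()
}
```

With stages of 10 ms, 40 ms and 50 ms, the sequential figure is about 10 FPS and the pipelined bound is about 20 FPS. Beyond that bound, only a faster slowest stage or more hardware helps.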
@miguelaeh Yes, you are right. I hope pipeless can become an equivalent replacement for deepstream.
@zjd1988 I am curious about how you found that out. Are you having FPS problems with some code that you had in deepstream? It would be really helpful if you could elaborate, so we can see whether we have to review the current approach.
@miguelaeh I'm afraid there are no actual examples. I am interested in video processing pipelines and heard about this repository from a blog a few days ago. I am also a beginner in Rust, so this repository is a good start.
aaah ok. Then I hope you liked it! jejej
I am closing this for now. As mentioned, stages are intended to be as small as possible to fully leverage the threading, as noted in the getting started guide. If I manage to find an alternative that allows us to release the CPU between hooks I will re-open this. However, it does not seem like a bottleneck when the provided stages are properly separated.
@zjd1988 I found more problems with the code you mentioned the other day, and the above is the PR I am working on to fix the bottleneck in that part, just in case you are interested.
@miguelaeh OK, I will pull your latest code and take a look.
@miguelaeh Hi miguelaeh, I have checked out the latest code and found you had added async to execute_path. I also have a question: is it possible for the (n+1)th frame to finish processing ahead of the nth frame in certain scenarios?
Hi @zjd1988, yes, it is. I have set a maximum of 2 * cpu cores events that can be processed at the same time. See this: https://github.com/pipeless-ai/pipeless/blob/523be41b5127ff21087ee4ed1e0e8d38fb8a856b/pipeless/src/pipeline.rs#L173
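As an illustration of why a cap on concurrent events still allows frames to finish out of order, here is a minimal sketch using only the standard library. It is not the code behind the link above: the worker pool, the function names and the simulated per-frame cost are assumptions made for the example.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;
use std::time::Duration;

/// Concurrency cap following the "2 * CPU cores" rule mentioned above.
fn max_in_flight() -> usize {
    2 * thread::available_parallelism().map(|n| n.get()).unwrap_or(1)
}

/// Processes (frame_id, simulated_cost) pairs on at most `limit` worker
/// threads and returns the frame ids in the order they finished.
fn completion_order(frames: Vec<(u64, Duration)>, limit: usize) -> Vec<u64> {
    let (job_tx, job_rx) = mpsc::channel::<(u64, Duration)>();
    let job_rx = Arc::new(Mutex::new(job_rx));
    let (done_tx, done_rx) = mpsc::channel::<u64>();

    let workers: Vec<_> = (0..limit.max(1))
        .map(|_| {
            let job_rx = Arc::clone(&job_rx);
            let done_tx = done_tx.clone();
            thread::spawn(move || loop {
                // The lock is held only while fetching the next job.
                let job = job_rx.lock().unwrap().recv();
                match job {
                    Ok((id, cost)) => {
                        thread::sleep(cost); // stand-in for running the hooks
                        done_tx.send(id).unwrap();
                    }
                    Err(_) => break, // queue closed and drained
                }
            })
        })
        .collect();
    drop(done_tx); // only the workers keep senders alive now

    for frame in frames {
        job_tx.send(frame).unwrap();
    }
    drop(job_tx); // lets the workers exit once the queue is empty

    let finished: Vec<u64> = done_rx.iter().collect();
    for worker in workers {
        worker.join().unwrap();
    }
    finished
}
```

If frame 0 is given a long cost and frame 1 a short one, with a limit of at least 2, frame 1 will typically appear first in the returned order. That is the reordering discussed in this thread.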
@miguelaeh If so, out-of-order frame detection results may affect subsequent video structuring processing (tracking, intrusion, etc.).
Exactly @zjd1988. The changes contemplate the case of stateless and stateful hooks. Now, I think we should implement a sequential vs parallel executor.
Stateless hooks are designed to not maintain any internal state, making them easy to parallelize. Several instances of the same hook process frames at the same time across different processes/threads. Stateful hooks, on the other hand, always use the same instance of the hook, allowing it to maintain state. For example, an object counter needs to keep the number of objects as its "state". Tracking models may also need to maintain an internal state.
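The distinction can be sketched in a few lines of Rust. This is only an illustration under assumed names (`Frame`, `count_in_frame`, `ObjectCounter`, `run_stateful`); it does not reproduce how Pipeless loads or runs hooks.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical frame that only carries the number of detected objects.
struct Frame {
    objects: u32,
}

/// Stateless hook: a pure function of the frame, so any number of
/// copies can run at the same time without coordination.
fn count_in_frame(frame: &Frame) -> u32 {
    frame.objects
}

/// Stateful hook: one instance that accumulates across frames.
#[derive(Default)]
struct ObjectCounter {
    total: u32,
}

impl ObjectCounter {
    fn process(&mut self, frame: &Frame) -> u32 {
        self.total += frame.objects;
        self.total
    }
}

/// Every thread goes through the same shared instance, guarded by a mutex.
fn run_stateful(frames: Vec<Frame>) -> u32 {
    let counter = Arc::new(Mutex::new(ObjectCounter::default()));
    let handles: Vec<_> = frames
        .into_iter()
        .map(|frame| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                counter.lock().unwrap().process(&frame);
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = counter.lock().unwrap().total;
    total
}
```

The final total does not depend on the order in which threads take the lock, but the running totals returned by `process` do, which is why order matters for hooks such as trackers.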
Now, the next step, as you mention, should be to differentiate between a parallel and a sequential executor.
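One possible way to get sequential output from parallel execution is a reorder buffer that holds finished frames until every earlier frame has been released. The sketch below is an assumption about how this could be done, not a description of the Pipeless executor. `ReorderBuffer` and its sequence numbers starting at 0 are hypothetical.

```rust
use std::collections::BTreeMap;

/// Releases items strictly in sequence order, whatever order they arrive in.
struct ReorderBuffer<T> {
    next: u64,                 // sequence number expected next
    pending: BTreeMap<u64, T>, // finished items that arrived early
}

impl<T> ReorderBuffer<T> {
    fn new() -> Self {
        Self { next: 0, pending: BTreeMap::new() }
    }

    /// Stores a finished item and returns every item that is now
    /// ready to be emitted in order (possibly none).
    fn push(&mut self, seq: u64, item: T) -> Vec<T> {
        self.pending.insert(seq, item);
        let mut ready = Vec::new();
        while let Some(item) = self.pending.remove(&self.next) {
            ready.push(item);
            self.next += 1;
        }
        ready
    }
}
```

Pushing frame 1 before frame 0 returns nothing at first. Pushing frame 0 then returns both, in order, so a downstream tracker always sees frames in sequence.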
BTW, regarding stateful and stateless hooks: both types are fully implemented, but parsing of stateful hooks is still pending. I think we will opt for something like a comment at the top of the file when using code files, and a special key in the JSON file when using JSON, so the user can specify something like # make stateful at the top of the hook file.
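A check for such a marker could look like the sketch below. It assumes the marker is exactly `# make stateful` on the first non-empty line. The function name is hypothetical, and the actual parsing in Pipeless may differ.

```rust
/// Returns true when the first non-empty line of a hook file is the
/// marker comment. The marker text is taken from the comment above;
/// treating only the first non-empty line as significant is an assumption.
fn is_marked_stateful(source: &str) -> bool {
    source
        .lines()
        .map(str::trim)
        .find(|line| !line.is_empty())
        .map_or(false, |line| line == "# make stateful")
}
```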
Update: I just added the mentioned missing parse step and the respective docs section. See https://www.pipeless.ai/docs/docs/v1/getting-started#stateless-vs-stateful-hooks
Hi @miguelaeh, wonderful job. I read the source code and found that all stages are processed in execute_path, so the FPS will be very low when stages take a lot of time. Maybe each stage could have its own thread; then the FPS would be limited by the most time-consuming stage, not by all stages combined.
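The thread-per-stage idea can be sketched with standard library channels. This is a minimal illustration, not Pipeless code: `Stage`, `run_pipeline` and the use of `u64` values as stand-ins for frames are assumptions.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stage: takes a frame (here just a number) and returns it transformed.
type Stage = Box<dyn Fn(u64) -> u64 + Send>;

/// Runs each stage on its own thread, connected by channels, so stage k can
/// work on frame n while stage k+1 works on frame n-1.
fn run_pipeline(stages: Vec<Stage>, frames: Vec<u64>) -> Vec<u64> {
    let (input_tx, mut rx) = mpsc::channel::<u64>();
    let mut handles = Vec::new();

    for stage in stages {
        let (tx, next_rx) = mpsc::channel::<u64>();
        // This stage reads from the previous receiver; `rx` now points at its output.
        let stage_rx = std::mem::replace(&mut rx, next_rx);
        handles.push(thread::spawn(move || {
            for frame in stage_rx {
                if tx.send(stage(frame)).is_err() {
                    break; // downstream is gone
                }
            }
        }));
    }

    for frame in frames {
        input_tx.send(frame).unwrap();
    }
    drop(input_tx); // closing the input lets every stage drain and exit

    let output: Vec<u64> = rx.iter().collect();
    for handle in handles {
        handle.join().unwrap();
    }
    output
}
```

Because each stage is a single thread reading from a FIFO channel, frames leave in the order they entered, and the steady-state rate is set by the slowest stage rather than by the sum of all stages.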