Intel® Data Mover Library (Intel® DML)
https://intel.github.io/DML/
MIT License

Question about async mode of DML #34

Closed Sean58238 closed 4 months ago

Sean58238 commented 1 year ago

We have some questions about the async mode of DML and would appreciate your comments: (1) If one device is configured with two or more SWQs, does DML async mode submit jobs to every WQ? How are jobs selected and allocated to these WQs, and is the load balanced across them? (2) If a WQ has two or more engines, do jobs complete out of order or sequentially? In other words, is there any flag that makes dml_wait_job() preserve the job execution sequence?

mzhukova commented 12 months ago

Hi @Sean58238, as for the first question: in short, DML iterates over all available DSA instances on a NUMA node (without crossing the node boundary), and within each instance it iterates over all the work queues until the job is submitted successfully. For instance, if everything is free, we submit to device 0, queue 0; if that queue is busy, we go to device 0, queue 1, and so on up to queue N until submission happens. The position is then stored, and the next submission starts over from that place. Note that the balancing mechanism is not exposed to the user and cannot be altered. As for the second question, I didn't quite get what you mean by "disordered or sequential" in the case of multiple engines. Could you please clarify?

Hope this clarifies things.
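
For readers who want to picture the submission loop described above, here is a minimal sketch. It is not DML's source code: `Dispatcher`, `Slot`, `busy`, and `next` are hypothetical names, the model covers a single NUMA node, and restarting the search exactly at the last successful queue (rather than the one after it) is an assumption based on the wording "started over from this place".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Result of one submission attempt (hypothetical type, for illustration only).
struct Slot {
    bool accepted;       // false if every queue was busy
    std::size_t device;  // device index that took the job
    std::size_t queue;   // queue index within that device
};

// Hypothetical model of the dispatch loop on one NUMA node.
struct Dispatcher {
    std::vector<std::vector<bool>> busy;  // busy[device][queue]
    std::size_t next = 0;                 // flat index where the next search starts

    // Number of work queues summed over all devices.
    std::size_t total_queues() const {
        std::size_t n = 0;
        for (const auto& dev : busy) n += dev.size();
        return n;
    }

    // Map a flat queue index back to a (device, queue) pair.
    Slot locate(std::size_t flat) const {
        assert(flat < total_queues());
        std::size_t d = 0;
        while (flat >= busy[d].size()) {
            flat -= busy[d].size();
            ++d;
        }
        return Slot{true, d, flat};
    }

    // Visit each queue at most once, starting from the stored position.
    // On success, store the position so the next call starts from there.
    Slot submit() {
        const std::size_t total = total_queues();
        for (std::size_t i = 0; i < total; ++i) {
            const std::size_t flat = (next + i) % total;
            const Slot s = locate(flat);
            if (!busy[s.device][s.queue]) {
                next = flat;
                return s;
            }
        }
        return Slot{false, 0, 0};  // nothing could accept the job
    }
};
```

In this model, a dispatcher with device 0 holding one busy and one free queue returns device 0, queue 1 from `submit()`. Once that queue is marked busy, the next call moves on to device 1, queue 0.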

mzhukova commented 11 months ago

Hi @Sean58238, does this answer your questions?

Sean58238 commented 11 months ago

For the second question, I think it means this: if we want to move a large amount of data (e.g. 100 MB), it can be split into smaller chunks (possibly of different sizes, such as 4 KB or 8 KB) that are submitted separately. The question is whether waiting on the jobs ensures that the transferred results are still in the original order.
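
To make the scenario concrete, here is a small sketch of the chunked transfer being asked about. It does not use DML and says nothing about how DML orders completions: `Chunk`, `split`, `complete`, and `copy_in_order` are hypothetical helpers, and `std::memcpy` stands in for a completed copy job. It only illustrates that when each chunk carries its own offset into disjoint regions, the final buffer layout does not depend on the order in which the chunks finish.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// One piece of a larger transfer (hypothetical type, for illustration only).
struct Chunk {
    std::size_t offset;  // byte offset, same in source and destination
    std::size_t length;  // number of bytes in this chunk
};

// Split `total` bytes into chunks of at most `chunk_size` bytes.
std::vector<Chunk> split(std::size_t total, std::size_t chunk_size) {
    assert(chunk_size > 0);
    std::vector<Chunk> chunks;
    for (std::size_t off = 0; off < total; off += chunk_size) {
        chunks.push_back(Chunk{off, std::min(chunk_size, total - off)});
    }
    return chunks;
}

// Stand-in for one finished copy job; std::memcpy replaces the accelerator.
void complete(const Chunk& c, const std::uint8_t* src, std::uint8_t* dst) {
    std::memcpy(dst + c.offset, src + c.offset, c.length);
}

// Finish the chunks in the order given by `order` (indices into `chunks`)
// and return the resulting destination buffer.
std::vector<std::uint8_t> copy_in_order(const std::vector<std::uint8_t>& src,
                                        const std::vector<Chunk>& chunks,
                                        const std::vector<std::size_t>& order) {
    std::vector<std::uint8_t> dst(src.size(), 0);
    for (std::size_t idx : order) {
        complete(chunks[idx], src.data(), dst.data());
    }
    return dst;
}
```

Calling `copy_in_order` with the chunk indices reversed produces the same destination buffer as calling it with the indices in ascending order, because no two chunks write to the same bytes.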