wkcn / MobulaOP

A Simple & Flexible Cross Framework Operators Toolkit
MIT License

Do these operators respect asynchronous execution in MXNet? #13

Closed eric-haibin-lin closed 4 years ago

eric-haibin-lin commented 5 years ago

This is great work. Has it been tested with multiple GPUs, and can the operators be executed in parallel?

wkcn commented 5 years ago

Yes. MobulaOP calls 'wait_to_read' on inputs and 'wait_to_write' on outputs, so it can execute in parallel on multiple GPUs.

https://github.com/wkcn/MobulaOP/blob/master/mobula/func.py#L92

wkcn commented 5 years ago

@eric-haibin-lin Sorry, I misunderstood the question.

MobulaOP calls two MXNet C APIs, MXNDArrayWaitToRead and MXNDArrayWaitToWrite. Both block the calling thread, which breaks asynchronous execution in MXNet and hurts performance.
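The difference between the two styles can be sketched in plain Python. This is only an illustration under stated assumptions: it does not use MXNet or MobulaOP, and `ToyEngine`, `ToyArray`, `push`, and `call_blocking` are hypothetical names. A single worker thread stands in for the engine, so pushed operations run in the order they were submitted.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyArray:
    """Toy array (hypothetical): holds a value and the Future of its last write."""

    def __init__(self, value=None):
        self.value = value
        self.pending = None  # Future of the most recent pushed write, if any

    def wait_to_read(self):
        # Block the caller until the last pushed write has finished.
        if self.pending is not None:
            self.pending.result()
        return self.value

    def wait_to_write(self):
        # Simplification: this toy tracks only writes, so waiting before a
        # write is the same as waiting for the previous write to finish.
        self.wait_to_read()


class ToyEngine:
    """Toy engine (hypothetical): one worker thread runs ops in FIFO order."""

    def __init__(self):
        self._pool = ThreadPoolExecutor(max_workers=1)

    def push(self, fn, inputs, output):
        # Asynchronous style: enqueue the op and return immediately.
        # Inputs are read inside the task, after earlier tasks have run.
        def task():
            output.value = fn(*[x.value for x in inputs])

        output.pending = self._pool.submit(task)


def call_blocking(fn, inputs, output):
    # Synchronous style: the caller waits on every array, then runs the op
    # itself, so it cannot queue further work in the meantime.
    for x in inputs:
        x.wait_to_read()
    output.wait_to_write()
    output.value = fn(*[x.value for x in inputs])


def double(x):
    return x * 2


def run_async(engine, start):
    a, b, c = ToyArray(start), ToyArray(), ToyArray()
    engine.push(double, [a], b)  # returns immediately
    engine.push(double, [b], c)  # queued behind the first op
    return c                     # the caller synchronizes only when reading


def run_blocking(start):
    a, b, c = ToyArray(start), ToyArray(), ToyArray()
    call_blocking(double, [a], b)
    call_blocking(double, [b], c)
    return c
```

Both `run_async(ToyEngine(), 3).wait_to_read()` and `run_blocking(3).wait_to_read()` produce the same value. The difference is where the waiting happens: in the blocking style the caller stalls before every operator, while in the pushed style it stalls only when a result is actually read.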

I plan to switch from synchronous to asynchronous execution by using MXTVMBridge. However, there is an ABI compatibility problem. [issue], [discussion]

Will MXNet provide an API that exposes engine->PushSync, so that an external library can implement asynchronous functions?

Thanks!

wkcn commented 5 years ago

Hi! MobulaOP currently uses 'MXTVMBridge' to support asynchronous execution.

However, there is an ABI compatibility problem, because the class WrappedFunc contains a 'std::function', whose implementation differs between compilers.

I use the GCC 4 header file 'functional' in MobulaOP to work around the problem. However, this creates a license problem, because that header file is under the GPL.

I will remove the header file later.

eric-haibin-lin commented 5 years ago

Sorry, I've been busy with a few other things. I haven't read the TVM discussion thread yet but will do so later today. Does the ABI issue mean that MobulaOP and MXNet must be compiled with the same version of GCC?

wkcn commented 5 years ago

@eric-haibin-lin Thank you! MobulaOP and MXNet do not need to be compiled with the same version of GCC. They only need to use the same implementation of 'std::function'.

The situation will improve once CPackedFunc is provided. We are discussing the problem at https://discuss.tvm.ai/t/the-abi-compatibility-of-packedfunc/1601/15

wkcn commented 5 years ago

Hi @eric-haibin-lin, I have added the TVM bridge to MobulaOP, and the ABI compatibility problem has been resolved. MobulaOP now enables asynchronous execution for MXNet by default : )

wkcn commented 4 years ago

Closing this : ) MobulaOP supports asynchronous execution for MXNet (nightly build) on Windows and Linux.