Open michalmuskala opened 6 years ago
That's pretty fair. I think the simplest approach would be to mark all 'heavy' NIFs as dirty. I tried to mark some of them dirty before, got a slight decrease in performance and switched back. Of course, this is a bit irresponsible, if we aim for production applications of the lib.
Can we run into problems with OTP builds that were compiled without dirty NIF support? Or is that almost impossible nowadays?
OTP 20 made dirty NIFs obligatory because it uses them internally, so I think it's perfectly fine to require them.
Elixir itself will most probably require OTP 20 in the 1.8 release scheduled for January next year.
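For reference, a minimal sketch of what requiring dirty NIFs could look like on the C side. The module name `dirty_demo` and the function `heavy_op` are hypothetical placeholders, not names from the library; the only real pieces are the `ErlNifFunc` table, the `ERL_NIF_DIRTY_JOB_CPU_BOUND` flag, and `ERL_NIF_INIT` from `erl_nif.h`.

```c
#include <erl_nif.h>

/* Hypothetical long-running NIF. The body is a placeholder; a real
 * implementation would do the heavy computation here. */
static ERL_NIF_TERM heavy_op(ErlNifEnv *env, int argc, const ERL_NIF_TERM argv[])
{
    (void)argc;
    (void)argv;
    return enif_make_atom(env, "ok");
}

static ErlNifFunc nif_funcs[] = {
    /* name, arity, C function, flags.
     * A flags value of 0 would keep it on a regular scheduler;
     * ERL_NIF_DIRTY_JOB_CPU_BOUND runs it on a dirty CPU scheduler. */
    {"heavy_op", 1, heavy_op, ERL_NIF_DIRTY_JOB_CPU_BOUND}
};

/* "dirty_demo" is an assumed module name for illustration only. */
ERL_NIF_INIT(dirty_demo, nif_funcs, NULL, NULL, NULL, NULL)
```

Switching a function between regular and dirty is then a one-word change in the table, which should make it cheap to benchmark both variants.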
Right now the library uses "regular" NIFs. In general, to play nicely with the soft-realtime guarantees of the VM, regular NIFs should execute in under 1 ms. When they run longer, they can throw off the load balancing of the schedulers (the OS threads actually executing Erlang code) and lead to what is known as "scheduler collapse".
Fortunately, there are tools to prevent that. If the job can be easily chunked, the best solution is to periodically call `enif_consume_timeslice` to check whether the code should yield and, if necessary, call `enif_schedule_nif` to yield and allow the VM to execute other tasks. If chunking is not an option, the NIFs can be marked to execute on "dirty" schedulers: separate OS threads where execution can take as long as it needs without affecting how regular schedulers work. The downside is that it incurs a context switch to a different OS thread, but if the computation is complex enough, that overhead should not be noticeable.
After briefly looking at the code, it seems to me it would be good for the library to mark most functions as dirty (especially the element-wise operations or dot-product). A possible approach would also be to provide two implementations, dirty and regular, and call one of them depending on the size of the matrix. This might bring in more complexity than desired, though. An example of a library with more intricate scheduler handling is enacl, which depending on the complexity runs functions on regular or dirty schedulers.
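To make the chunking idea concrete, here is a rough sketch of a yielding NIF. It is not taken from the library: the function `sum_chunked`, the module name `chunk_demo`, the chunk size, and the 10% timeslice estimate are all assumptions for illustration. It sums a binary of native-endian doubles, carrying the current index and accumulator as extra arguments so the work can resume after a yield.

```c
#include <erl_nif.h>
#include <string.h>

/* Elements processed between timeslice checks (assumed value, should be tuned). */
#define CHUNK 1024

/* Hypothetical NIF: sum_chunked(Binary, Index, Acc).
 * Initial call from Erlang/Elixir would be sum_chunked(Bin, 0, 0.0);
 * Acc must be a float, since enif_get_double rejects integers. */
static ERL_NIF_TERM sum_chunked(ErlNifEnv *env, int argc, const ERL_NIF_TERM argv[])
{
    ErlNifBinary bin;
    unsigned long i;
    double acc;

    if (argc != 3 ||
        !enif_inspect_binary(env, argv[0], &bin) ||
        !enif_get_ulong(env, argv[1], &i) ||
        !enif_get_double(env, argv[2], &acc))
        return enif_make_badarg(env);

    unsigned long n = (unsigned long)(bin.size / sizeof(double));

    while (i < n) {
        unsigned long end = (n - i > CHUNK) ? i + CHUNK : n;

        for (; i < end; i++) {
            double v;
            /* memcpy avoids unaligned reads from the binary's data. */
            memcpy(&v, bin.data + i * sizeof(double), sizeof v);
            acc += v;
        }

        /* Tell the VM we used roughly 10% of a timeslice (a guess that
         * should be measured). A nonzero return means the timeslice is
         * exhausted and we should yield. */
        if (i < n && enif_consume_timeslice(env, 10)) {
            ERL_NIF_TERM next[3] = {
                argv[0],
                enif_make_ulong(env, i),
                enif_make_double(env, acc)
            };
            /* flags = 0 reschedules the continuation on a regular scheduler. */
            return enif_schedule_nif(env, "sum_chunked", 0, sum_chunked, 3, next);
        }
    }

    return enif_make_double(env, acc);
}

static ErlNifFunc nif_funcs[] = {
    {"sum_chunked", 3, sum_chunked, 0}
};

/* "chunk_demo" is an assumed module name for illustration only. */
ERL_NIF_INIT(chunk_demo, nif_funcs, NULL, NULL, NULL, NULL)
```

The same `enif_schedule_nif` call accepts `ERL_NIF_DIRTY_JOB_CPU_BOUND` as its flags argument, so a regular NIF could inspect the matrix size and hand large inputs over to a dirty scheduler while handling small ones inline. That would be one way to get the size-based dispatch without exposing two separate functions.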