ruby-concurrency / concurrent-ruby

Modern concurrency tools including agents, futures, promises, thread pools, supervisors, and more. Inspired by Erlang, Clojure, Scala, Go, Java, JavaScript, and classic concurrency patterns.
https://ruby-concurrency.github.io/concurrent-ruby/
Other
5.7k stars 420 forks source link

2.0 Some Specific Technical Changes #542

Closed jdantonio closed 2 years ago

jdantonio commented 8 years ago

Concurrent Ruby 1.0 has been very successful, but we made some mistakes along the way. We had a few ideas that didn't work out as well as we had hoped. We discovered that some features were more trouble than they were worth. And we gained a better understanding of what our users want. Based on this learning we will make several specific changes while designing and building 2.0.

pitr-ch commented 8 years ago

I would add following points:

Thanks Jerry, I appreciate the discussion.

jdantonio commented 8 years ago
pitr-ch commented 8 years ago
  • We can remove the Obligation module but not the API contract. This speaks directly to our Consistency principle. I created the Obligation module to ensure that all "future" classes presented a common API. It was successful at that. Now that I've created numerous shared specs we can enforce that consistency in other ways. We also have better documentation and a better common understanding, which also help.

Just to clarify, I meant to suggest to remove Obligation in light of new Promises, Future and Event are targeting to replace all other classes using Obligation. If there are some classes using it left then Obligation should be kept. It's good to have it to represent the interface.

  • Explicit require: This is non-negotiable. If every file explicitly requires the things it needs, then each file only needs to require the top-level needs. It also keeps the require statements closest to the place they are used. This is significantly easier than maintaining a separate set of files solely to manage requires. I've enforced this rule on teams I've managed at places I've worked and it's never been difficult for professionals. It just requires a little discipline and diligence on the part of the programmer. The vast majority of the files in the project already meet this requirement. I know this because I added those require statements myself in preparation for the 1.0 release.

I am sorry but I don't believe we can close this one just yet.

Let me repeat what I've mentioned in #395 issue for clarity: I like this approach it's clear and simple. Unfortunately it's also repetitive and quite error-prone when files are updated over time. I see only one obstacle which prevents this approach: we did not formulate how do we test or otherwise ensure that the require statements are correct in every file of our gem. If we formulate the testing strategy then this will be this best option.

In the approach suggested above (having a list of requirable files), each requirable file represents a logical unit of the gem, e.g. a data structure or a concurrent abstraction. Then all tests for a given logical unit can require just the unit then run the tests. If the tests are executed in separation it ensures that the given unit works when required separately. I have a hard time to formulate something similar for every file. (I am assuming class per file policy, not all classes for a given logical unit in one file.) I would like to have a solution for this problem before we commit to make every file requirable by users.

  • One global thread pool: We will not retain the global 'fast' pool. Any user with enough concurrency knowledge to understand its use will be capable of using the factory method.

After thinking about this more, I think we should keep both pools for non-blocking and blocking tasks.

Simple usage does not require both. An user with few async tasks will just post them to blocking pool and everything will be fine. Since non-blocking tasks can be posted to blocking pool, blocking tasks to non-blocking must not (it would be good to have at least some best afford detection of this).

However problems arise when an user has many tasks to process. If all of them are just fed to blocking pool it will have bad performance (context switching) as it will try to create thread for every task. It can also deadlock when all tasks which were able to obtain a thread are blocked on tasks in queue.

If nonblocking tasks dominate then it's good approach to post [non-]blocking tasks to their particular pools, then it's naturally throttled by the nonblocking pool (which has a thread per cpu core and executes as fast as possible). Remaining tasks are on blocking pool which can keep up since the assumption for this case is that there is not many of them.

If there is more than little of blocking tasks then they need to be throttled with other mechanism (which is why we should provide one). Currently users can create extra pools, which is far from ideal since it creates another threads. Or they can create pools of actors to represent e.g. a pool of 10 DB connections, which can be somewhat heavy weight.

For the problems described above I think we should keep both pools and do a much better job on education of our users how to use them properly, and we should prototype some throttling tools.

We currently have nice shortcuts using Symbols executor: :io and executor: :fast which would be lost if nonblocking pool is removed. For non-blocking pool it would have to be always a reference.

Fast/io, Non-/blocking names don't feel good, would you have other ideas?

  • Cancellation: This is not a high priority. It can be added after 2.0 is released.
  • Throttling: This is not a high priority. It can be added after 2.0 is released.

Yeah we do not have to have stable solution, but a prototype should be in place for 2.0, users have asked for both. I already have working prototypes.

jdantonio commented 8 years ago

I meant to suggest to remove Obligation in light of new Promises, Future and Event are targeting to replace all other classes using Obligation. If there are some classes using it

Other classes are using it but I'd rather enforce the contract through common tests. The module itself has become unwieldy.

we did not formulate how do we test or otherwise ensure that the require statements are correct

We don't need to test it. So long as every developer makes an concerted effort to require its direct dependencies we'll be fine. We'll probably miss something occasionally, and that's OK. When we receive PRs like #551 we'll merge them. Eventually we'll get there. Please do not push on this issue any more. I have made my decision.

If nonblocking tasks dominate then it's good approach to post [non-]blocking tasks to their particular pools

Any user experiencing this will be able to create their own thread pool and use the :executor option to post to that pool. We don't have to create a global thread pool for them.

do a much better job on education of our users how to use them properly

Users with the knowledge to understand the difference will also have the knowledge to create their own thread pool and use the :executor option. If we want to create a convenience method that creates a thread pool with one thread per processor I'm OK with it. But I don't want a second global thread pool. I've made my decision.

[Cancellation/Throttling] but a prototype should be in place for 2.0

We can prototype in Edge. If you get this done in time for 2.0 then we can consider releasing them, but if they aren't ready when 2.0 is ready I don't want to delay 2.0 until they are done. Which I hope you would agree with--you are the one who wants to move fast on 2.0.

chrisseaton commented 2 years ago

A lot of time's passed. We'll re-evaluate what we want from 2.0 in the future.