2.0 Some Specific Technical Changes

Concurrent Ruby 1.0 has been very successful, but we made some mistakes along the way. We had a few ideas that didn't work out as well as we had hoped. We discovered that some features were more trouble than they were worth. And we gained a better understanding of what our users want. Based on this learning we will make several specific changes while designing and building 2.0.

Remove the Dereferenceable module: This was an experiment in mimicking the immutability of functional languages. It added complexity and didn't solve the problem. We're simply going to remove it and deprecate its features.
No autoload: We adopted autoload from two separate gems that we merged in. It worked much better in those smaller, more focussed gems. We won't use autoload in 2.0.
Explicit require: Creating one file with a bunch of require statements for loading the entire gem was problematic in 1.0. Because we eat our own dog food we ran into issues with the load order. In 2.0 every file will explicitly require every file it depends on. This is one case where will be non-idiomatic and instead mimic Java and C#.
Require only what is necessary: Many of our biggest users only need a small portion of concurrent-ruby. Being forced to require the entire gem imposes an unnecessary and unwanted memory and load-time dependency. Users should be able to load only the parts that they need.
No global config: Allowing users to change the global configuration was a mistake. In a large project where multiple gems all depend on concurrent-ruby, changing the global config can cause serious side effects. In 2.0 we will remove the ability to change the global configuration. We will support dependency injection and inversion of control instead when necessary.
One global thread pool: Creating a "fast" thread pool and an "io" thread pool was a mistake. The "fast" thread pool is of no value on MRI ruby. It's inclusion has caused confusion with our users while providing no benefit. For convenience we will still provide a factory method to create a thread pool with these characteristics, but that's it. We'll have one and only one global thread pool that is optimized for Concurrent I/O operations and it will be the default thread pool for everything. We will remove all convenience features for the "fast" vs. "io" thread pools, remove all methods which explicitly use the "fast" thread pool, and rename the global "io" pool to simply the "global thread pool" or "global executor".

Remove the Dereferenceable module: 👍 Obligation and Observable could be removed over time as well.
No autoload: 👍 We should test requiring of c-r parts which are intended for it.
Explicit require 👎 This will be quite hard to maintain for each file. There are no tools to maintain or to test it. Could we rather have a list of files which can be required standalone for each abstraction, internal files of a given abstraction could not be required by users and they would not require all dependencies (that would be done in the top file of the given abstraction).
Require only what is necessary: 👍 We definitely need that for our users.
- I'll have look if we can prevent/warn users form requiring files which are not meant for it.
No global config: 👍
One global thread pool: 👍 for the io pool becoming the global pool, its name, and it being the default one. Yeah the names are not explanatory, it's suppose to be pool for nonblocking tasks and for blocking tasks, which is hard to pick names for. I would just shift it to the advanced part of the API, not removing it completely though. :fast could be renamed to :noio though.

I would add following points:

Namespace:
- Paths are matching constant names
- Concurrent space is clean as possible
- Abstractions should have its methods in its namespace not leaking to Concurrent namespace
Documentation:
- All constants not part of the API are private
- Whole public API is documented.
- For transition, method part of the API without documentation will be clearly marked as public but with pending documentation.
- non-API methods are private/protected if possible.
- all documents in doc and examples are reachable from documentation
- We should start use http://gnuu.org/2010/06/26/yard-object-oriented-diffing/ for changelogs and to see that nothing was removed on minor releases.
- We could experiment with https://github.com/dkubb/yardstick
Align APIs: atomics API, check thread safe that it matches rest of the gem
Extra files: clean up/organise doc/examples/benchmarks
Synchronisation layer: Migrate rest of the abstractions, remove (or improve) internal LockableObject
Cancellation: We need a good story for cancellation, the future PR has a good candidate for cooperative cancelation which can be used with any abstraction.
Throttling: We need mechanisms for throttling non-blocking concurrency or to provide backpressure for blocking.
Development: in master, no branches, we keep just adding features and deprecating old until major release.

Thanks Jerry, I appreciate the discussion.

I agree that Observable can be removed.
We can remove the Obligation module but not the API contract. This speaks directly to our Consistency principle. I created the Obligation module to ensure that all "future" classes presented a common API. It was successful at that. Now that I've created numerous shared specs we can enforce that consistency in other ways. We also have better documentation and a better common understanding, which also help.
Explicit require: This is non-negotiable. If every file explicitly requires the things it needs, then each file only needs to require the top-level needs. It also keeps the require statements closest to the place they are used. This is significantly easier than maintaining a separate set of files solely to manage requires. I've enforced this rule on teams I've managed at places I've worked and it's never been difficult for professionals. It just requires a little discipline and diligence on the part of the programmer. The vast majority of the files in the project already meet this requirement. I know this because I added those require statements myself in preparation for the 1.0 release.
Require only what is necessary: This is handled automatically by having all files explicitly require their dependencies.
One global thread pool: We will not retain the global 'fast' pool. Any user with enough concurrency knowledge to understand its use will be capable of using the factory method.
Namespace: Good suggestion. I will update my post with my thoughts on this later.
Documentation: Good suggestion. I will update my post with my thoughts on this later.
Align APIs: That's been covered by Consistency in the guiding principals post. I will post specific API documents as I write them.
Extra files: Good suggestion. I will update my post with my thoughts on this later.
Synchronisation layer: Good suggestion. I will update my post with my thoughts on this later.
Cancellation: This is not a high priority. It can be added after 2.0 is released.
Throttling: This is not a high priority. It can be added after 2.0 is released.
Branching, deprecation, merging, etc.: Agree 100% that we need a set of guiding rules. I will post more on this topic later. We need to be in agreement on other topics first. We can worry about source control issues later.

We can remove the Obligation module but not the API contract. This speaks directly to our Consistency principle. I created the Obligation module to ensure that all "future" classes presented a common API. It was successful at that. Now that I've created numerous shared specs we can enforce that consistency in other ways. We also have better documentation and a better common understanding, which also help.

Just to clarify, I meant to suggest to remove Obligation in light of new Promises, Future and Event are targeting to replace all other classes using Obligation. If there are some classes using it left then Obligation should be kept. It's good to have it to represent the interface.

Explicit require: This is non-negotiable. If every file explicitly requires the things it needs, then each file only needs to require the top-level needs. It also keeps the require statements closest to the place they are used. This is significantly easier than maintaining a separate set of files solely to manage requires. I've enforced this rule on teams I've managed at places I've worked and it's never been difficult for professionals. It just requires a little discipline and diligence on the part of the programmer. The vast majority of the files in the project already meet this requirement. I know this because I added those require statements myself in preparation for the 1.0 release.

I am sorry but I don't believe we can close this one just yet.

Let me repeat what I've mentioned in #395 issue for clarity: I like this approach it's clear and simple. Unfortunately it's also repetitive and quite error-prone when files are updated over time. I see only one obstacle which prevents this approach: we did not formulate how do we test or otherwise ensure that the require statements are correct in every file of our gem. If we formulate the testing strategy then this will be this best option.

In the approach suggested above (having a list of requirable files), each requirable file represents a logical unit of the gem, e.g. a data structure or a concurrent abstraction. Then all tests for a given logical unit can require just the unit then run the tests. If the tests are executed in separation it ensures that the given unit works when required separately. I have a hard time to formulate something similar for every file. (I am assuming class per file policy, not all classes for a given logical unit in one file.) I would like to have a solution for this problem before we commit to make every file requirable by users.

One global thread pool: We will not retain the global 'fast' pool. Any user with enough concurrency knowledge to understand its use will be capable of using the factory method.

After thinking about this more, I think we should keep both pools for non-blocking and blocking tasks.

Simple usage does not require both. An user with few async tasks will just post them to blocking pool and everything will be fine. Since non-blocking tasks can be posted to blocking pool, blocking tasks to non-blocking must not (it would be good to have at least some best afford detection of this).

However problems arise when an user has many tasks to process. If all of them are just fed to blocking pool it will have bad performance (context switching) as it will try to create thread for every task. It can also deadlock when all tasks which were able to obtain a thread are blocked on tasks in queue.

If nonblocking tasks dominate then it's good approach to post [non-]blocking tasks to their particular pools, then it's naturally throttled by the nonblocking pool (which has a thread per cpu core and executes as fast as possible). Remaining tasks are on blocking pool which can keep up since the assumption for this case is that there is not many of them.

If there is more than little of blocking tasks then they need to be throttled with other mechanism (which is why we should provide one). Currently users can create extra pools, which is far from ideal since it creates another threads. Or they can create pools of actors to represent e.g. a pool of 10 DB connections, which can be somewhat heavy weight.

For the problems described above I think we should keep both pools and do a much better job on education of our users how to use them properly, and we should prototype some throttling tools.

We currently have nice shortcuts using Symbols executor: :io and executor: :fast which would be lost if nonblocking pool is removed. For non-blocking pool it would have to be always a reference.

Fast/io, Non-/blocking names don't feel good, would you have other ideas?

Cancellation: This is not a high priority. It can be added after 2.0 is released.

Throttling: This is not a high priority. It can be added after 2.0 is released.

Yeah we do not have to have stable solution, but a prototype should be in place for 2.0, users have asked for both. I already have working prototypes.

I meant to suggest to remove Obligation in light of new Promises, Future and Event are targeting to replace all other classes using Obligation. If there are some classes using it

Other classes are using it but I'd rather enforce the contract through common tests. The module itself has become unwieldy.

we did not formulate how do we test or otherwise ensure that the require statements are correct

We don't need to test it. So long as every developer makes an concerted effort to require its direct dependencies we'll be fine. We'll probably miss something occasionally, and that's OK. When we receive PRs like #551 we'll merge them. Eventually we'll get there. Please do not push on this issue any more. I have made my decision.

If nonblocking tasks dominate then it's good approach to post [non-]blocking tasks to their particular pools

Any user experiencing this will be able to create their own thread pool and use the :executor option to post to that pool. We don't have to create a global thread pool for them.

do a much better job on education of our users how to use them properly

Users with the knowledge to understand the difference will also have the knowledge to create their own thread pool and use the :executor option. If we want to create a convenience method that creates a thread pool with one thread per processor I'm OK with it. But I don't want a second global thread pool. I've made my decision.

[Cancellation/Throttling] but a prototype should be in place for 2.0

We can prototype in Edge. If you get this done in time for 2.0 then we can consider releasing them, but if they aren't ready when 2.0 is ready I don't want to delay 2.0 until they are done. Which I hope you would agree with--you are the one who wants to move fast on 2.0.

A lot of time's passed. We'll re-evaluate what we want from 2.0 in the future.

ruby-concurrency / concurrent-ruby

2.0 Some Specific Technical Changes #542