Closed layanto closed 5 years ago
It isn't, no. The message connector prior to 3.0 was very crude, which didn't really matter in smaller deployments. However, once we built deepstreamHub we needed a different approach that lets deepstream servers communicate a lot more effectively, since broadcasting becomes a significant bottleneck once you have a lot of messages, topics or servers.
Because of this we ended up with a peer-to-peer network that lets deepstream servers communicate directly with each other in a smart manner. Due to the large amount of work required and its usage within deepstreamHub, this has become part of our enterprise offering.
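To make the bottleneck concrete, here is a rough back-of-the-envelope sketch. It is not deepstream code: the function names and the workload numbers are assumptions chosen purely for illustration. It compares how many server-to-server deliveries a naive broadcast needs with how many are needed when each message is sent only to servers that actually hold subscribers.

```python
from __future__ import annotations


def broadcast_deliveries(servers: int, messages: int) -> int:
    """Every message is copied to every other server, subscribed or not."""
    return messages * (servers - 1)


def directed_deliveries(subscribed_servers_per_message: list[int]) -> int:
    """Each message goes only to the servers holding a subscriber for its topic."""
    return sum(subscribed_servers_per_message)


# Hypothetical workload: 10 servers, 1,000 messages, and each message's topic
# has subscribers on just 2 of the other servers.
servers, messages = 10, 1_000
broadcast = broadcast_deliveries(servers, messages)
directed = directed_deliveries([2] * messages)
print(broadcast, directed)  # 9000 2000
```

Under these made-up numbers the broadcast cost grows with the server count even when interest in a topic stays flat, which is the effect described above.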
You can, however, run hot/cold server deployments for high availability using the open source server, so that you have no downtime during rolling upgrades and unexpected node failures!
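The thread does not say how a hot/cold setup is wired together, so the following is only a minimal sketch of the selection logic such a setup needs, assuming a priority-ordered endpoint list and a health probe. The endpoint URLs, `pick_active` and the probe are all hypothetical; in practice this decision often lives in a load balancer or in client reconnect logic.

```python
from typing import Callable, Optional, Sequence


def pick_active(
    endpoints: Sequence[str], is_healthy: Callable[[str], bool]
) -> Optional[str]:
    """Return the first healthy endpoint in priority order (hot first, then cold)."""
    for endpoint in endpoints:
        if is_healthy(endpoint):
            return endpoint
    return None  # nothing reachable: caller should back off and retry


# Hypothetical endpoints: the hot node is listed first, the cold standby second.
HOT = "ws://hot.example.internal:6020"
COLD = "ws://cold.example.internal:6020"

# Stand-in for a real health check (e.g. a TCP connect or HTTP ping).
down = {HOT}
active = pick_active([HOT, COLD], lambda endpoint: endpoint not in down)
print(active)  # ws://cold.example.internal:6020
```

Because the two nodes do not share state through a message bus in this arrangement, anything that must survive a failover has to come from the shared cache or storage layer.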
I wish there were still an open source clustering solution available for smaller deployments. Could you consider either reinstating the crude message connector, which as you said is fine for smaller deployments, or releasing the peer-to-peer network with a limit on the number of nodes in the open source deepstream server? That way an open source solution is available for small deployments, and when scale becomes an issue users can migrate to either deepstreamHub or enterprise.
I'm disappointed that clustering has been removed from the open source project.
I understand that a lot of work went into the p2p solution, however it appears you're removing features from the community build to push the enterprise offering. This makes me less inclined to use Deepstream for future projects because I don't know what other features might be removed.
I appreciate all of the work that's gone into Deepstream, and I hope you'll reconsider reintroducing this core functionality.
I agree with John. IMO this move will hurt Deepstream's Enterprise push, rather than encourage it.
Deepstream is definitely a great product and I'm currently using it to power one of my fledgling projects. I settled on it based on its core features, clustering being one. Removing it means I either need to do custom patchwork (which is daunting at this point) or look for another solution, which may mean abandoning Deepstream.
Any users of Deepstream who have scaled up sufficiently to pay for enterprise accounts will outsource their server management to you anyway, not just for the technology but also for the expert team behind it and the fact that they no longer need to maintain a dedicated team for server management.
Removing clustering prevents users from even scaling up to enterprise levels to begin with, and will instead cause them to abandon Deepstream in favour of another solution.
Hope you all reconsider.
Thank you for your opinions, I certainly understand where you are coming from. It took us a while to come to this decision and we went through various different iterations of plans and components before getting here.
We've lately been dealing with larger and larger enterprises as well as ensuring deepstreamHub scales well, both of which require us to dedicate the majority of our engineering team to their particular requirements. For us, this meant two things:
As a result, as you might have noticed, new features in deepstream have slowed down significantly and our response time to error reports has risen. We love deepstream as much as you do and would love to see it grow!
It is also worth mentioning that we put a lot of effort, prior to this release and going forward, into making a single node as performant as possible, and will continue to do so. This means we are still on par with, or far exceed, technologies like MeteorJS, which have zero concept of clustering and yet are still used extensively in production. Deepstream can and will easily ensure your product can accommodate large numbers of users and traffic with a single node, and we will continuously be pushing those boundaries going forward.
For those with non-commercial products or open source offerings we would be more than happy to provide a free license; please get in contact with us by sending an email to info@deepstreamhub.com. And if you are using deepstream as part of a larger commercial company and are happy with it, please consider supporting us in developing it by talking to your managers about getting the clustering plugin.
@wehriam In regards to us pulling out more features, we can guarantee that we won't be doing this again. We have an exciting roadmap to enhance core features in deepstream such as list deltas, schema/schema validation and other awesome improvements going forward, as well as multiple open source client libraries. We have also just added HTTP support and laid the foundation for other endpoints, meaning things will only get more awesome going forward!
@DhavalW Finally, in regards to enterprise level scaling, it is also worth mentioning that we already have multiple tiers on our deepstream platform which allow millions of messages and provide a cache/database, autoscaling and other features out of the box. They start totally free, with paid tiers from $20 a month (less than the price of a small EC2 instance). So we already provide outsourcing of server management with a dedicated team at a rate that is cheaper than running a single node (not to mention the other costs). It all uses the exact same clients and functionality as deepstream, which means moving between the two is as easy as exporting data from one and importing it into the other, or vice versa.
For use cases where a single performant deepstream node is sufficient, is there any documentation or tutorial on how to implement hot standby for HA?
What is the cost of the clustering plugin?
Also, can third-parties create plugins?
In regards to third-party plugins, that should be fine. We don't have explicit internal APIs for generic plugins yet though, only for cache, storage, permissioning, connection endpoint and authentication. As such we can't guarantee supporting them across releases at this time, and if the plugin modifies the behaviour of the server itself it would also fall under the AGPL license.
Please let me know if that makes sense!
It would be good to have a plugin API so the community can create an open source cluster messaging plugin. There are use cases where load or scaling is not the issue but redundancy and on-premise deployment are required. Having a cluster for the on-premise use case provides both redundancy and better performance when there is no failure. I assume the deepstreamHub Enterprise on-premise offering is intended for large deployments.
I would also like to know what the enterprise clustering plugin costs. We just decided to replace our Firebase setup and started migrating to deepstream. We need clustering because we have a very complex server setup in 3 different regions around the world.
@wehriam Hi John,
could you please send me an email to juliet.matsai@deepstreamhub.com with a few more details about your project and which company we should provide the pricing for?
Thanks, Juliet
@marcelgoya Marcel, could you please send me an email at juliet.matsai@deepstreamhub.com to discuss the enterprise clustering plugin costs? Please include a brief description of your project and the company details (ideally a website) so we can better understand your needs. Thanks, Juliet
@julietmatsai Just done that :)
Related note: I see in the product comparison page that there are limits for the number of connections, messages, throughput, etc for the 'open source' product https://deepstreamhub.com/compare-products/
Are these limits defined during the build step for installers you distribute on your site, or in the actual source code? Are these just practical real world limits, or artificially enforced handicaps on the open source version?
@mattknoxca Don't bother looking into their open source version, unless you only want to use one server in production. Their open source version is basically a trial edition and they want you to use their enterprise products, which are btw very expensive because they're aimed at large companies.
@marcelgoya Ya, disappointing. Looks like it is time to 'rethink' RethinkDB, which seems to be picking up momentum again under the Linux Foundation.
@mattknoxca the limits are just practical real world limits. The open source version is under AGPL, which basically means that even if we did try to add 'limitations' you would be able to remove them anyway.
Edit: I initially said in the actual source code, but meant that as a limitation of source code which in hindsight makes no sense since those limitations are based on real world limits!
Hi all, I've released a preview of community supported clustering at deepstream.io-cluster. Please note that it builds upon changes made in 711ba0c8161bb49c5808d38419151822a8acc707 and should be considered an alpha until a 1.0.0 release.
In recognition of deepstreamHub GmbH's contributions and business I have attempted to direct users looking for a commercially supported product to https://deepstreamhub.com/enterprise/ wherever possible.
Deepstream.io-cluster extends the base Deepstream class, and uses two sets of ports. To broadcast messages to the entire cluster, nodes use Nanomsg's pubsub protocol. To direct messages to a specific node, they use the pipeline protocol.
Clusters are easy to set up as nodes bootstrap off of each other. While each node has to connect to every other node, network usage should be reasonably efficient as most messages are routed directly.
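As a rough illustration of the behaviour described above, here is an in-memory sketch of nodes bootstrapping off each other into a full mesh, with one path for cluster-wide broadcasts and one for messages aimed at a single node. It does not use Nanomsg and is not taken from deepstream.io-cluster; the `Node` class and its methods are hypothetical stand-ins for the pubsub and pipeline sockets.

```python
from __future__ import annotations


class Node:
    def __init__(self, name: str) -> None:
        self.name = name
        self.peers: dict[str, Node] = {}
        self.inbox: list[tuple[str, str]] = []  # (sender, message)

    def join(self, seed: Node) -> None:
        """Bootstrap off one known node: learn its peers, then connect to all."""
        for peer in [seed, *seed.peers.values()]:
            if peer is not self:
                self.peers[peer.name] = peer
                peer.peers[self.name] = self

    def broadcast(self, message: str) -> None:
        """Stand-in for the pubsub path: every peer receives the message."""
        for peer in self.peers.values():
            peer.inbox.append((self.name, message))

    def send(self, target: str, message: str) -> None:
        """Stand-in for the pipeline path: only the named peer receives it."""
        self.peers[target].inbox.append((self.name, message))


a, b, c = Node("a"), Node("b"), Node("c")
b.join(a)
c.join(a)  # c only knows a, but learns about b through it
c.broadcast("hello")
a.send("b", "only-for-b")
print(sorted(c.peers), b.inbox)
```

The directed `send` touches a single inbox, which is why routing most traffic that way keeps network usage down even though every node holds a connection to every other node.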
I hope the Deepstream community finds this to be a useful intermediate clustering solution, and look forward to building it into a robust component suitable for production deployments.
@wehriam How is the messaging different from pre-3.x deepstream? Is this more performant? I currently use the deepstream binary rather than running it in Node.js. Will you be releasing binaries too like deepstreamio? Instead of linking to a specific commit of deepstreamio, I wonder if it is possible to have deepstreamio-cluster as a plugin and to propose a plugin API to deepstreamio, since deepstreamio currently does not have a plugin API.
Hello @wehriam, thanks for putting in the effort in extending DSv3 and building a (free) clustering solution.
Could you elaborate a little more on the architecture of your solution? Is it similar in concept to Memcached clustering, where each Memcached server operates independently while the client does the majority of the work? As a follow-up, for those of us less familiar, how does this solution differ from DSv2/v3's "official" cluster solution?
Also, have you run any performance benchmarks on your solution? Do you have any preliminary ideas on how it scales?
Sorry for asking so many questions; I understand the software is still in its infancy. Perhaps adding this information to the README would be useful.
@layanto I like your idea of proposing a plugin API to allow extensions to DS. Have you given further thoughts on what this would look like?
Is this more performant?
Will you be releasing binaries too like deepstreamio?
how does this solution differ from DSv2/v3's "official" cluster solution?
Also, have you run any performance benchmarks on your solution?
Do you have any preliminary ideas on how it scales?
@layanto and @charleswhchan - I would appreciate any assistance you can provide to produce tangible information on questions like these. The bulk of the implementation is only about three hundred lines of code, so reviewing it will probably be more helpful than my abstract answers.
I am not familiar with the commercial cluster solution and cannot comment on how it is implemented.
Deepstream.io-cluster implements a peer-to-peer network, and nodes communicate with each other directly. Unlike a structured overlay peer-to-peer network, each node is always connected to all of the other nodes. This will be very performant, and the tests include a 16-node cluster.
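For a sense of what "always connected to all of the other nodes" means as a cluster grows, the number of node-to-node links in a full mesh is n(n-1)/2. The helper below is just that formula, not something taken from the project:

```python
def full_mesh_links(nodes: int) -> int:
    """Number of distinct node-to-node links when every node connects to every other."""
    return nodes * (nodes - 1) // 2


for n in (2, 4, 16, 64):
    print(n, full_mesh_links(n))
# 2 1
# 4 6
# 16 120
# 64 2016
```

So the 16-node test cluster involves 120 links, and the count grows quadratically from there, which is worth keeping in mind when sizing larger clusters.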
Instead of linking to a specific commit of deepstreamio, I wonder if it is possible to have deepstreamio-cluster as a plugin and to propose a plugin API to deepstreamio, since deepstreamio currently does not have a plugin API.
Deepstream.io will become a deepstream.io-cluster peer dependency once the changes in 711ba0c8161bb49c5808d38419151822a8acc707 are part of a versioned release. I am not aware of a way to package it as a plugin.
@wehriam Interesting work! We don't support the clusterNode as an actual API, but I looked through your code and I don't think any changes we make would have any ripple effects. We won't be exposing it as an actual plugin or providing concrete APIs for it though, because we are constantly refactoring things to reduce the number of messages that the cluster needs to send.
Since you are using nanomsg, it would be quite hard to bundle into an executable since we actually had to add our C++ bindings into the nodeJS build itself.
If you have any questions on why an API exists / is the way it is, feel free to just ping me
Thanks @yasserf! I sincerely appreciate all of your work that made it possible. After you published the ClusterNode class it was largely trivial and took about a day to implement. A few asks, in order of precedence:
A 3.0.2 release incorporating changes from 711ba0c8161bb49c5808d38419151822a8acc707 so I can make deepstream.io a peerDependency rather than using a commit as a dependency.
Any cluster-focused tests or benchmarks you could share that might be adapted for use with deepstream.io-cluster
Your thoughts on my approach to featuring deepstreamHub GmbH's enterprise offerings. My goals here are to 1) promote deepstreamHub to fund your work on deepstream.io and 2) emphasize that deepstream.io-cluster is community supported. I'll let you get paid to handle the issues, I only want pull requests!
Your thoughts on how deepstream.io-cluster could be implemented without extending the Deepstream base class. Am I correct in guessing that the commercial solution takes a similar approach?
Since you are using nanomsg, it would be quite hard to bundle into an executable since we actually had to add our C++ bindings into the nodeJS build itself.
In the past I've used pkg to create installers for Windows, OSX, and Debian that distribute .node binding packages with the main binary. Naturally it's not as clean a solution, but it could be a viable path if there was demand.
I noticed the config in deepstream 3.1.0 includes configuration for the message plugin again (it was removed in 3.0). Does this mean deepstream 3.1.0 supports clustering like deepstream 2.x?
@layanto I've checked and can see this was done by accident. We will be releasing v4 of deepstream soon, in which we'll make sure to remove this option.
I can see that this has been added to the package-conf.yml file, meaning only users who download the binary will see this option in their config.
@yasserf These are not only my own thoughts (as I am an immature developer) but a few facts about licensing.
If you look at the history of CouchDB, you will find that Apache CouchDB (merged with CouchBase) is progressing upward now, while only enterprise people know where CouchOne is. Another example is the StrongLoop solution Strong Arc, which was available only through subscription. They (StrongLoop, by IBM) have stopped that and converted it to API Microgateway, which is now open source. In other words, both solutions tried a paid version but have now reverted to open source.
I think the hosted and enterprise solutions should be paid, and the source code should be open.
A few advantages of open source:
Hey, I agree, which is why all of our core value proposition is totally open source. We have also spent a couple of months now purely concentrating on revamping deepstream, the client and the protocol, all of which are open source. A single server can easily provide the same scale as most other open source software, and there are also open source cluster alternatives if required!
Cheers, Yasser
@yasserf thanks for the update.
there are also open source cluster alternatives if required!
You mean based on core deepstream.io?
Has any comparative study been done with other realtime application frameworks like gun.js?
Regards, Gagan
Thanks @wehriam I'll check deepstream.io-cluster.
V4 now supports message plugins; we just don't provide any by default. However, all clustering state distribution logic is back in the open source version, so a message bus should only need a couple hundred lines of boilerplate to work!
Closing this issue as V3 is not really supported anymore other than critical fixes (things changed so much since then).
With message connector removed in deepstream 3.0, is it still possible to configure deepstream for clustering and HA?