groupcache / groupcache-go

A high performance in memory distributed cache
Apache License 2.0
29 stars 1 forks source link

V3 Discussion #1

Closed thrawn01 closed 5 months ago

thrawn01 commented 8 months ago

I have a proposal for a new interface here https://gist.github.com/thrawn01/45131e4c225a13774de6c80618ddd5a8

I find it useful to provide example usage and documentation before beginning any major refactors to allow time for feed back and avoid needless rewrites while we flesh out the new interface.

This is a continuation of the previous conversation here https://github.com/mailgun/groupcache/issues/68

CC: @udhos @Tochemey

Tochemey commented 8 months ago

@thrawn01 I have added few comments to the gist. Overall great. These are the features I am looking at:

udhos commented 8 months ago

@thrawn01

Regarding removal of global state, the repo below is a clone of mailgun groupcache with the sole change of removing global state. It modifies several funcs to add a required "workspace" argument to pass the explicit state.

https://github.com/modernprogram/groupcache/blob/master/workspace.go

Workspace creation:

workspace := groupcache.NewWorkspace()

Perhaps it would help us to evolve from that.

Tochemey commented 8 months ago

@udhos and @thrawn01 please what is the plan?

udhos commented 8 months ago

I think we should put a short design doc with some decisions to clarify things about groupcache v3 design proposal.

For instance I have some thoughts:

1 - Should we rewrite everything from scratch or would we incrementally make changes from https://github.com/groupcache/groupcache-go towards V3? I am guessing the later?

2 - V3 interface seems to imply that we have 2 options for peers: manual or discovery. Perhaps it should have only discovery, hence "manual" should be one static discovery provider (among several others). That said, I have mixed feelings about this, since the interface for manually setting peers already provides a pluggable option for discovery libs.

3 - Can we merge back explicit state changes from https://github.com/modernprogram/groupcache into https://github.com/groupcache/groupcache-go, or is there a better idea?

4 - It would be nice to extend the API to allow for easy distributed rate limiting with group cache. An API like:

err := group.Inc(key ...)

that would run an "incrementer" function in the key owner peer in order to atomically increment a counter.

5 - Currently groupcache only keeps stats that one can optionally use to expose metrics with third-party libs, like prometheus client. I have mixed feelings about this. It seems positive not to tie groupcache to specific metrics libraries. But maybe it should provide optional "contrib" integrations to some metrics libs, much like it should support optional integration to some peer discovery plugins.

What do you guys think?

thrawn01 commented 8 months ago

I'm traveling this week so I won't have time to dig in until next week. But I'm excited about this. Agree, a clear design should be agreeable.

I have no interest in gRPC, it's slower than regular HTTP on golang, so I don't see the point.

See benchmarks here. https://github.com/duh-rpc/duh-go-benchmarks

But I'm very interested in collaboration with you all. Looking forward to it!

gedw99 commented 8 months ago

Is supporting NATS possible ?

I came here from https://github.com/Tochemey/groupcache?tab=readme-ov-file#nats-discovery-provider-setup

Tochemey commented 8 months ago

I think we should put a short design doc with some decisions to clarify things about groupcache v3 design proposal.

For instance I have some thoughts:

1 - Should we rewrite everything from scratch or would we incrementally make changes from https://github.com/groupcache/groupcache-go towards V3? I am guessing the later?

2 - V3 interface seems to imply that we have 2 options for peers: manual or discovery. Perhaps it should have only discovery, hence "manual" should be one static discovery provider (among several others). That said, I have mixed feelings about this, since the interface for manually setting peers already provides a pluggable option for discovery libs.

3 - Can we merge back explicit state changes from https://github.com/modernprogram/groupcache into https://github.com/groupcache/groupcache-go, or is there a better idea?

4 - It would be nice to extend the API to allow for easy distributed rate limiting with group cache. An API like:

err := group.Inc(key ...)

that would run an "incrementer" function in the key owner peer in order to atomically increment a counter.

5 - Currently groupcache only keeps stats that one can optionally use to expose metrics with third-party libs, like prometheus client. I have mixed feelings about this. It seems positive not to tie groupcache to specific metrics libraries. But maybe it should provide optional "contrib" integrations to some metrics libs, much like it should support optional integration to some peer discovery plugins.

What do you guys think?

AFAIK we should not rewrite everything from scratch. The existing library does the job well. Like I did in my fork we just need to add a discovery provider and allow developers to add their custom providers. Yeah we can provide some built-in providers in a separate repo where we can support:

Regarding metrics the core library should expose some of Stats that is not tied to any library.

udhos commented 8 months ago

I am curious about supporting NATS, what does it mean? Can anyone elaborate?

Tochemey commented 8 months ago

I am curious about supporting NATS, what does it mean? Can anyone elaborate?

The support of NATs is basically using NATs to discover nodes in cluster environment. You can take a look at my implementation here

To know more about NATs please check the doc out: https://nats.io/

Tochemey commented 8 months ago

I am curious about supporting NATS, what does it mean? Can anyone elaborate?

The support of NATs is basically using NATs to discover nodes in cluster environment. You can take a look at my implementation here

To know more about NATs please check the doc out: https://nats.io/

@udhos I hope this helps.

thrawn01 commented 8 months ago

I've updated the example gist https://gist.github.com/thrawn01/45131e4c225a13774de6c80618ddd5a8 to include some of the feed back received.

1 - Should we rewrite everything from scratch or would we incrementally make changes from https://github.com/groupcache/groupcache-go towards V3? I am guessing the later?

I prefer incremental changes

2 - V3 interface seems to imply that we have 2 options for peers: manual or discovery. Perhaps it should have only discovery, hence "manual" should be one static discovery provider (among several others). That said, I have mixed feelings about this, since the interface for manually setting peers already provides a pluggable option for discovery libs.

I also have mixed feelings, SetPeer() could go away and it wouldn't hurt my feelings. We could accomplish a similar thing using the peer discovery interface used in the example gist. I've updated the gist to reflect this.

3 - Can we merge back explicit state changes from https://github.com/modernprogram/groupcache into https://github.com/groupcache/groupcache-go, or is there a better idea?

That would be nice! I'm going to attempt to pull in some of those commits today and see what happens.

4 - It would be nice to extend the API to allow for easy distributed rate limiting with group cache. An API like:

err := group.Inc(key ...)

I started the gubernator project for this use case, based off my experience with groupcache. https://github.com/mailgun/gubernator

5 - Currently groupcache only keeps stats that one can optionally use to expose metrics with third-party libs, like prometheus client. I have mixed feelings about this. It seems positive not to tie groupcache to specific metrics libraries. But maybe it should provide optional "contrib" integrations to some metrics libs, much like it should support optional integration to some peer discovery plugins.

I like the idea of supporting other stats collection systems separately. It shouldn't be difficult.

thrawn01 commented 8 months ago

@udhos I've decided to go with SetPeers() I think this is the most intuitive for users and makes registering the groupcache instance with discovery as simple as

    // Create a new groupcache instance
    instance := groupcache.New(groupcache.Config{
        InstanceID: "instance-01",
    })
        // Register with discovery
    d := discovery.SpawnK8s(discovery.K8sConfig{
        OnUpdate: instance.SetPeers,
    })
    defer d.Shutdown(context.Background())
Tochemey commented 8 months ago

@udhos I've decided to go with SetPeers() I think this is the most intuitive for users and makes registering the groupcache instance with discovery as simple as

  // Create a new groupcache instance
  instance := groupcache.New(groupcache.Config{
      InstanceID: "instance-01",
  })
        // Register with discovery
  d := discovery.SpawnK8s(discovery.K8sConfig{
      OnUpdate: instance.SetPeers,
  })
  defer d.Shutdown(context.Background())

@thrawn01 I still believe we should not couple the discovery engine with the core of the library. Yes in the new version we want to add a discovery API. However someone may decide not to use it and add its own discovery engine. However for the majority we will support outbox a discovery API and built-in providers maybe in a separate repository. So the SetPeers is ok to leave it as it is.

Tochemey commented 8 months ago

The reason I am saying that is because at it is now it is really easy to implement a discovery engine.

thrawn01 commented 7 months ago

Pre-Release https://github.com/groupcache/groupcache-go/releases/tag/v3.0.0-rc.1 is now merged. Please test the v3 release and report any bugs.

Before a full release, I want to make the cache implementation replaceable during initialization.

This article noticed a significant degradation in group-cache performance in high concurrency situations due to the mutex that the default cache implementation uses. Providing a way to switch out cache implementations would allow for flexibility for those users who encounter this problem. https://dgraph.io/blog/post/introducing-ristretto-high-perf-go-cache/

Screenshot 2024-04-11 at 9 24 13 AM

thrawn01 commented 7 months ago

I'm considering renaming groupcache.SpawnDeamon() to groupcache.ListenAndServe() and Transport.SpawnTransport() to Transport.ListenAndServe() thoughts?

this would bring it into line with the naming of the standard http package. https://pkg.go.dev/net/http

udhos commented 7 months ago

Pre-Release https://github.com/groupcache/groupcache-go/releases/tag/v3.0.0-rc.1 is now merged. Please test the v3 release and report any bugs.

This is awesome. I have a toy project with classic groupcache. Will try to spare some time tonight to attempt to convert it to v3 and report findings here.

Tochemey commented 7 months ago

I'm considering renaming groupcache.SpawnDeamon() to groupcache.ListenAndServe() and Transport.SpawnTransport() to Transport.ListenAndServe() thoughts?

this would bring it into line with the naming of the standard http package. https://pkg.go.dev/net/http

This suggestion makes sense. 👍🏿

thrawn01 commented 6 months ago

https://github.com/groupcache/groupcache-go/pull/4 is ready for review. I've included an implementation of otter which is located in the contrib package.

Let me know what you think!

thrawn01 commented 6 months ago

I've merged #4 and created a second release candidate. I'm reluctant to publish a full v3 release as I currently don't have a production environment in which to test v3 groupcache. If someone is interested in testing it and reporting back, I would feel better about bumping it to a full release.

https://github.com/groupcache/groupcache-go/releases

thrawn01 commented 5 months ago

V3 is released, thank you all for your feed back and assistance!

https://github.com/groupcache/groupcache-go/releases/tag/v3.0.0