Closed wkloucek closed 7 months ago
Was discussed at the Hack-Week by:
@tbsbdr please schedule for next sprint since this is blocking further growth including the other NATS related issues mentioned by @wkloucek above
Already worked on, see status.
> Already worked on, see status.
true. thanks for the spotlight. but there is more to do, right?
Is this really fulfilled?
We now have a `nats-js` registry. But what about the cache?
@wkloucek isn't the cache already using the `nats-js` store? (The `nats-js` store was already using the key-value store interface of JetStream. Only the registry implementation was not.)
> @wkloucek isn't the cache already using the `nats-js` store? (The `nats-js` store was already using the key-value store interface of JetStream. Only the registry implementation was not.)
My last info is that the cache does not work. See also https://github.com/owncloud/ocis/issues/7049
But there is also more than just a working cache / store / registry implementation when looking at all the linked tickets. We need to clarify all operational questions, please. Can I use a memory-backed stream? Who is responsible for creating streams? Who is responsible for configuring stream replicas? Are we clean when it comes to retention? Are we using the KV store / cache in a performant way?
E.g. the registry could also be a memory-backed stream if that has advantages.
I see. I wasn't aware of https://github.com/owncloud/ocis/issues/7049. Seems like a standard panic. I'll take a look.
Regarding the other questions. I have no clue :) Should we have another meeting where we discuss where we stand and what needs to be done?
> Regarding the other questions. I have no clue :) Should we have another meeting where we discuss where we stand and what needs to be done?
To be honest, since https://github.com/owncloud/ocis/issues/7272#issuecomment-1715775681 nothing really changed. Those questions still need an answer (and modified code if needed). For that it might be helpful to read the NATS (JetStream) documentation. I already read parts of it and can be there as a sparring partner. But in general it makes sense to have a NATS "expert" in the oCIS development team since it's a really crucial part of oCIS.
Not so much a fan of the "expert" pattern. I would prefer everybody in the team to know about NATS (JetStream) as it is the backbone of the system.
But I am still uncertain what needs to be done and where the biggest pain points are. Your questions in https://github.com/owncloud/ocis/issues/7272#issuecomment-1821453735 sound more like "how do we want to do it" than "how do we have to do it" questions.
I'm happy to drive natsjs improvements. I just don't know where to start.
> Not so much a fan of the "expert" pattern. I would prefer everybody in the team to know about NATS (JetStream) as it is the backbone of the system.
Also fine for me. But probably one person needs to go ahead since we can't dedicate the full team to reading documentation for 2 days, right?
> But I am still uncertain what needs to be done and where the biggest pain points are. Your questions in #7272 (comment) sound more like "how do we want to do it" than "how do we have to do it" questions.
>
> I'm happy to drive natsjs improvements. I just don't know where to start.
A first question would be e.g. https://github.com/owncloud/ocis/issues/7119: Am I allowed to use memory streams? If so, how can I configure them? The ticket already talks about the benefits of memory streams (see benchmark) but also about the problem when currently trying to use memory streams (immutable).
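To make the "immutable" problem concrete, here is a minimal Go sketch of the decision whoever owns stream creation would face when a bucket already exists. It is not oCIS or NATS client code: `BucketSpec`, `Action` and `Plan` are hypothetical names, the bucket name is only borrowed from this thread, and the sketch assumes that replicas and max age can be edited in place while the storage type (file vs. memory) cannot.

```go
package main

import (
	"fmt"
	"time"
)

// Storage names the two JetStream storage backends discussed in this thread.
type Storage string

const (
	File   Storage = "file"
	Memory Storage = "memory"
)

// BucketSpec is a hypothetical description of one KV bucket
// (registry, cache or store). It is not a NATS client type.
type BucketSpec struct {
	Name     string
	Storage  Storage
	Replicas int
	MaxAge   time.Duration // retention; 0 means "keep forever"
}

// Action is what the (hypothetical) owner of the bucket has to do.
type Action string

const (
	Create   Action = "create"
	Keep     Action = "keep"
	Update   Action = "update"   // an in-place edit is enough
	Recreate Action = "recreate" // assumption: storage type cannot be edited in place
)

// Plan compares the bucket that exists (nil if there is none)
// with the desired spec and returns the required action.
func Plan(existing *BucketSpec, desired BucketSpec) Action {
	switch {
	case existing == nil:
		return Create
	case existing.Storage != desired.Storage:
		return Recreate
	case existing.Replicas != desired.Replicas || existing.MaxAge != desired.MaxAge:
		return Update
	default:
		return Keep
	}
}

func main() {
	onDisk := &BucketSpec{Name: "cache-userinfo", Storage: File, Replicas: 1, MaxAge: time.Hour}
	want := BucketSpec{Name: "cache-userinfo", Storage: Memory, Replicas: 3, MaxAge: time.Hour}

	fmt.Println(Plan(nil, want))    // prints "create"
	fmt.Println(Plan(onDisk, want)) // prints "recreate"
}
```

The point of the sketch is only that "switch an existing bucket to memory" ends in the `recreate` branch, so somebody has to own deleting and re-creating the stream, which is exactly the open responsibility question.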
Next question: is the new registry implementation actually distributing load? The `nats` registry didn't do that from what I know (see https://github.com/owncloud/ocis/issues/7188).
Oki.
I added another NATS topic which could really help for our SaaS: https://github.com/owncloud/ocis/issues/7801
Seems like the `natsjs` registry triggers some excessive logging on the NATS side: https://github.com/owncloud/ocis/issues/7948
@kobergj @wkloucek We need to check the status of the NATS implementation please.
@kobergj closable?
What we identified during that status meeting:
https://github.com/owncloud/ocis/issues/7231#issuecomment-1905861835
https://github.com/owncloud/ocis/issues/7245#issuecomment-1905855227
https://github.com/owncloud/ocis/issues/7023 -> not yet implemented but also not pressing
and one cache was still on file storage instead of memory storage :thinking:
> https://github.com/owncloud/ocis/issues/7231#issuecomment-1905861835
Will look into that today
> https://github.com/owncloud/ocis/issues/7245#issuecomment-1905855227
This is just changing default values. Should we do that for the single binary too?
This needs to be tackled with a followup ticket
> and one cache was still on file storage instead of memory storage 🤔
No, not a cache. It was the registry. This is already fixed with https://github.com/owncloud/ocis/pull/8236
> > https://github.com/owncloud/ocis/issues/7245#issuecomment-1905855227
>
> This is just changing default values. Should we do that for the single binary too?
Please do so, yes.
> No, not a cache. It was the registry. This is already fixed with #8236
Thanks for keeping that information safe! I already forgot about it.
Please don't forget https://github.com/owncloud/enterprise/issues/6354
Discovered during another review:
KV_cache-userinfo maxAge could be higher, but that needs invalidation / extra validation -> @kobergj will create an extra ticket
Guess we tackled all tickets here. I'll close this one for now.
User Story
Acceptance Criteria
Is your feature request related to a problem? Please describe.
As a user I want to have as few components as possible. I would love to use NATS as registry / cache / store. Currently I have to use different components.
Describe the solution you'd like
Have a performant NATS registry / cache / store implementation for the KV feature based on NATS JetStream.
Have it load-tested: it should distribute load, have sufficient speed, be stable / highly available, and delete unneeded data (retention).
We should also think about dropping official support for the other registry (etcd, consul, memory, mdns, kubernetes) and cache / store (redis, redis-sentinel, noop, memory, ocmem) implementations, since many of them are only usable in a limited deployment range and / or are not battle-tested. Currently the official documentation lists them all, so I understand them as officially supported.
Describe alternatives you've considered
Additional context
Other known NATS topics: