FlowFuse / flowfuse

Connect, collect, transform, visualise, and interact with your Industrial Data in a single platform. Use FlowFuse to manage, scale and secure your Node-RED solutions.
https://flowfuse.com

Track topics used by a Team on Team Broker #4748

Closed hardillb closed 1 week ago

hardillb commented 2 weeks ago

part of #4726

Description

Tracks all topics published to on the Team Broker.

Adds 1 new HTTP endpoint:

- Requires the team Member or Owner role to access the list of topics.

Needs:

- Currently caches topics for 1 hour and checks for topics to remove every 30 seconds; both values should probably be configurable (see the sketch after this list).
- Will benefit from using Redis for this so the cache is preserved over restarts and when multiple instances of the forge app are running.
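A minimal sketch of the kind of in-memory topic cache described above (the actual implementation lives in forge/ee/lib/teamBroker/index.js and may differ; the class, constants and method names here are illustrative only):

```javascript
// Illustrative sketch, not the FlowFuse implementation.
const TOPIC_TTL_MS = 60 * 60 * 1000   // 1 hour before an unseen topic expires
const SWEEP_INTERVAL_MS = 30 * 1000   // how often expired topics are purged

class TopicCache {
    constructor () {
        this.topics = new Map()       // topic -> last-seen timestamp
        this.sweeper = setInterval(() => this.purge(), SWEEP_INTERVAL_MS)
        // Don't keep the process alive just for the sweep timer
        if (typeof this.sweeper.unref === 'function') { this.sweeper.unref() }
    }

    // Record that a topic was published to (refreshes its last-seen time)
    seen (topic) {
        this.topics.set(topic, Date.now())
    }

    // Drop any topic that has not been seen within the TTL
    purge () {
        const cutoff = Date.now() - TOPIC_TTL_MS
        for (const [topic, lastSeen] of this.topics) {
            if (lastSeen < cutoff) {
                this.topics.delete(topic)
            }
        }
    }

    // Topic list as it might be exposed via the new HTTP endpoint
    list () {
        return [...this.topics.keys()]
    }

    stop () {
        clearInterval(this.sweeper)
    }
}

module.exports = { TopicCache }
```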

Related Issue(s)

#4726


codecov[bot] commented 2 weeks ago

Codecov Report

Attention: Patch coverage is 59.09091% with 18 lines in your changes missing coverage. Please review.

Project coverage is 78.73%. Comparing base (9c1babb) to head (3e5816d). Report is 38 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| forge/ee/lib/teamBroker/index.js | 55.88% | 15 Missing :warning: |
| forge/housekeeper/tasks/teamBroker.js | 25.00% | 3 Missing :warning: |
Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main    #4748      +/-   ##
==========================================
- Coverage   78.79%   78.73%   -0.06%
==========================================
  Files         311      312       +1
  Lines       14787    14831      +44
  Branches     3387     3395       +8
==========================================
+ Hits        11651    11677      +26
- Misses       3136     3154      +18
```

| [Flag](https://app.codecov.io/gh/FlowFuse/flowfuse/pull/4748/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=FlowFuse) | Coverage Δ | |
|---|---|---|
| [backend](https://app.codecov.io/gh/FlowFuse/flowfuse/pull/4748/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=FlowFuse) | `78.73% <59.09%> (-0.06%)` | :arrow_down: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=FlowFuse#carryforward-flags-in-the-pull-request-comment) to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.



hardillb commented 2 weeks ago

I did it this way to make it easier to include the project nodes changes later (when they will publish without the mount point, so we will need to include the full topic prefix).

Might need to discuss/think about it some more.

hardillb commented 2 weeks ago

Not sure how to write a test for those last 5 lines, as that would mean adding a 30 second pause to ensure the interval runs (and even then it won't remove any entries for 30 minutes).
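One common way around a real pause is a fake clock. A rough sketch only, assuming a sinon-style fake timer is available to the test suite and reusing the hypothetical `TopicCache` from the earlier sketch (the real code under test may be shaped quite differently):

```javascript
// Illustrative test only: `TopicCache` is the hypothetical sketch from the
// PR description above, not the actual forge/ee/lib/teamBroker code.
const assert = require('node:assert')
const sinon = require('sinon')
const { TopicCache } = require('./topicCache')

describe('team broker topic cache', function () {
    let clock
    beforeEach(function () { clock = sinon.useFakeTimers() })
    afterEach(function () { clock.restore() })

    it('purges topics not seen within the TTL', function () {
        const cache = new TopicCache()
        cache.seen('team/abc/sensor/1')
        // Advance the fake clock past the 1 hour TTL plus one 30s sweep -
        // no real 30 second pause needed
        clock.tick((60 * 60 * 1000) + (30 * 1000))
        assert.strictEqual(cache.list().length, 0)
        cache.stop()
    })
})
```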

joepavitt commented 2 weeks ago

> Currently caches topics for 1 hour and checks for topics to remove every 30 seconds; both values should probably be configurable.

Is there an opportunity for a longer term solution here? It may be valid that some data only gets published every 24 hours, for example.

hardillb commented 2 weeks ago

> Is there an opportunity for a longer term solution here? It may be valid that some data only gets published every 24 hours, for example.

These are just starting values, but a longer caching period is unlikely to be useful in its current form, as the cache will get cleared every time the forge app restarts (e.g. every time we do a CI deployment....)

There is a comment in the code about using Redis as a better cache, but that is a MUCH bigger solution.

joepavitt commented 2 weeks ago

Why is the URL `/api/v1/teams/:teamId/broker/topicList` and not `/api/v1/teams/:teamId/broker/topics`?

The latter seems more RESTful.

knolleary commented 2 weeks ago

We need to think about the longevity of the cache - and about dealing with CI reloading the app.

Some options that come to mind:

  1. Introduce Redis (more likely Valkey). This would also mean we aren't storing state in memory that won't horizontally scale. However, it does add the overhead of adding another component to the architecture. It is inevitable... the question is whether this is the feature that pushes us into action on it.
  2. Store the topic list in the existing DB. We don't really want to add this load to the DB in real time, but the list could be cached in memory and periodically written to the DB in batches (including in the shutdown hook) - see the sketch below. This doesn't help with scaling (as there is still state in memory), but it will be persistent across restarts.
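A rough sketch of what option 2 could look like. The `app.settings.get`/`set` calls are an assumption based on how they are referenced later in this thread, the `onClose` shutdown hook is the usual Fastify-style hook, and the settings key and flush interval are purely illustrative:

```javascript
// Illustrative sketch only, not the actual FlowFuse implementation.
const FLUSH_INTERVAL_MS = 5 * 60 * 1000
const SETTINGS_KEY = 'team-broker:topics'

async function startTopicPersistence (app, cache) {
    // Restore whatever was persisted before the last restart
    const saved = await app.settings.get(SETTINGS_KEY)
    if (Array.isArray(saved)) {
        saved.forEach(topic => cache.seen(topic))
    }

    // Periodically write the current topic list back to the DB in one batch,
    // rather than hitting the DB on every publish
    const flush = async () => {
        await app.settings.set(SETTINGS_KEY, cache.list())
    }
    const timer = setInterval(flush, FLUSH_INTERVAL_MS)
    if (typeof timer.unref === 'function') { timer.unref() }

    // Final flush in the shutdown hook so a clean restart loses nothing
    app.addHook('onClose', async () => {
        clearInterval(timer)
        await flush()
    })
}
```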

In terms of the existing caching TTL values, I'd suggest we default to longer intervals - assuming we (eventually) solve the persistence of state. I'd suggest a 25 hr TTL and once-a-day purging - I don't think the level of usage will make in-memory caching a problem in the short term, so we can be quite generous.

joepavitt commented 2 weeks ago

Leaving the technical review here to @knolleary - conceptually though, love the work, thanks for finding a solution Ben!

hardillb commented 2 weeks ago

Increasing the cache to 25 hours is an easy change, but moving to clearing the cache once a day will mean moving that code to the housekeeping tasks to get around the restarts.

Will look at that shortly
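For illustration, a housekeeping task along those lines might look roughly like this. The exact task format used by forge/housekeeper isn't shown in this thread, so the field names, schedule and the `purgeStaleTopics` method are all assumptions:

```javascript
// Hypothetical sketch only - forge/housekeeper/tasks/teamBroker.js may use a
// different task shape. The idea is a once-a-day scheduled purge that does not
// depend on an in-process interval surviving restarts.
module.exports = {
    name: 'teamBrokerTopicPurge',   // hypothetical task name
    startup: false,                 // don't run on startup, only on schedule
    schedule: '0 2 * * *',          // cron: once a day at 02:00
    run: async function (app) {
        // Delegate to whatever owns the topic cache; method name is hypothetical
        await app.teamBroker.purgeStaleTopics()
    }
}
```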

hardillb commented 2 weeks ago

Stashed the topic cache in the database so it is preserved across restarts.

hardillb commented 2 weeks ago

Need to check that this DB cache will actually work on K8s, as I think it starts the new instance before telling the old one to shut down.

This means the cache will load from the DB before it's populated. Might need to add a delay (and find a way to force the app.settings.get to re-read from the database).
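One hedged idea for coping with that overlap is to merge the persisted list with anything the new instance has already seen, rather than replacing one with the other, and to always re-read the persisted copy instead of trusting a cached settings value. `readTopicsFromDb` and `writeTopicsToDb` below are placeholders for however the persisted copy is actually accessed, not real FlowFuse APIs:

```javascript
// Hypothetical merge-on-load sketch: a new K8s pod starting before the old one
// shuts down should not clobber the persisted topic list.
async function mergePersistedTopics (cache, readTopicsFromDb, writeTopicsToDb) {
    // Re-read directly from the database so a stale in-memory settings value
    // (read before the old pod's final flush) is not used
    const persisted = await readTopicsFromDb()   // -> array of topic strings
    const merged = new Set([...(persisted || []), ...cache.list()])
    merged.forEach(topic => cache.seen(topic))
    // Write the union back so both generations of the app agree
    await writeTopicsToDb([...merged])
}
```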

knolleary commented 2 weeks ago

@hardillb let's discuss before you throw commits at the PR

joepavitt commented 1 week ago

Is this good to merge? I've not got too many opinions here as it's more server-side work, so if Nick says go, I vote we go.