Open roidelapluie opened 4 years ago
Should be simple enough to display some of the common metrics for each queue on a page.
Detail on the sharding calculations would be interesting; right now that information is only available in debug logs.
@roidelapluie this feels like it could be a hacktoberfest task?
Yes, indeed :)
Would this be achievable for a new contributor? I'd like to take a shot at it once there are enough requirements.
@Pokom I would think so. I haven't done any work on the UI pages, but it looks like the source is here, and you'd either be pulling existing data from remote write or filing issues/opening additional PRs to make unavailable info available to the UI page.
Like this @cstyan @csmarchbanks ?
I like the state of each shard, that could have some good information, especially the pending samples per shard. I am curious, what do last scrape and scrape duration mean?
Generally, I think a section for information about the queue is also necessary: the whole queue is either resharding or running, not individual shards. Each shard would be running, stopping, or stopped, I think.
Might be nice to have a collapsible section for each overall queue that shows the queue config and shard information in a table?
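For illustration, the queue-level versus shard-level split described above could be sketched in Go roughly as follows. This is only a sketch: every type and constant name here is hypothetical, and nothing in it is taken from the existing remote write code.

```go
package main

import "fmt"

// QueueState is a hypothetical status for a whole remote-write queue.
type QueueState int

const (
	QueueRunning QueueState = iota
	QueueResharding
)

// String returns a label suitable for display on a status page.
func (s QueueState) String() string {
	switch s {
	case QueueRunning:
		return "running"
	case QueueResharding:
		return "resharding"
	default:
		return "unknown"
	}
}

// ShardState is a hypothetical status for a single shard of a queue.
type ShardState int

const (
	ShardRunning ShardState = iota
	ShardStopping
	ShardStopped
)

// String returns a label suitable for display on a status page.
func (s ShardState) String() string {
	switch s {
	case ShardRunning:
		return "running"
	case ShardStopping:
		return "stopping"
	case ShardStopped:
		return "stopped"
	default:
		return "unknown"
	}
}

// QueueView is a hypothetical shape for what the page would render per queue:
// one queue-level state plus one state per shard.
type QueueView struct {
	Name   string
	State  QueueState
	Shards []ShardState
}

func main() {
	q := QueueView{
		Name:   "example",
		State:  QueueResharding,
		Shards: []ShardState{ShardRunning, ShardStopping},
	}
	fmt.Printf("%s: %s %v\n", q.Name, q.State, q.Shards)
}
```

Keeping the two state types separate mirrors the point that resharding applies to the queue as a whole, while running/stopping/stopped applies to each shard.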
Yeah, those should be "last sent timestamp" and "last sent duration".
I would say collapse the shard information by default, but if the queue information is reasonably compact and high level then that could remain prominently displayed for a quick scan?
Edit: I think I misread, you definitely say "queue config" which would be nice to have collapsed by default as well. :+1:
Has this task been picked up by someone?
No, it is free to pick :)
I've picked this issue up, hoping I will have time to finish it in the next few days for review.
If you want to share a mock before doing the hard work, that would be welcome. The example I provided will need improvements.
And thanks for picking this :)
Just to confirm, I'm needing to add a new endpoint for this data to be accessible via the API. There's no issue introducing that with the same PR that implements the web UI?
It'd be unusual not to.
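As a rough idea of what such an endpoint might look like, here is a minimal Go sketch using only the standard library. The path, the `QueueStatus` fields, and the handler wiring are all assumptions made for illustration and are not the actual Prometheus API code; only the `{"status": ..., "data": ...}` envelope follows the shape the existing HTTP API uses for its responses.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// hypotheticalPath is an invented route, not an existing Prometheus endpoint.
const hypotheticalPath = "/api/v1/remote_write_queues"

// QueueStatus is a hypothetical per-queue payload for the UI page.
type QueueStatus struct {
	Name       string `json:"name"`
	Endpoint   string `json:"endpoint"`
	Shards     int    `json:"shards"`
	Resharding bool   `json:"resharding"`
}

// response wraps the payload in a status/data envelope.
type response struct {
	Status string        `json:"status"`
	Data   []QueueStatus `json:"data"`
}

// renderQueues encodes the queue list as JSON.
func renderQueues(queues []QueueStatus) ([]byte, error) {
	return json.Marshal(response{Status: "success", Data: queues})
}

// queuesHandler serves whatever list() returns at request time.
func queuesHandler(list func() []QueueStatus) http.HandlerFunc {
	return func(w http.ResponseWriter, _ *http.Request) {
		body, err := renderQueues(list())
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		w.Write(body)
	}
}

func main() {
	// Stand-in data source; a real one would read from the queue managers.
	list := func() []QueueStatus {
		return []QueueStatus{{Name: "primary", Endpoint: "http://example.invalid/write", Shards: 4}}
	}

	mux := http.NewServeMux()
	mux.HandleFunc(hypotheticalPath, queuesHandler(list))

	// Print the payload instead of starting a server, to keep the sketch short.
	body, err := renderQueues(list())
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

Passing the data source in as a function keeps the handler independent of how the queue information is actually collected.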
Here is a first rough mock of the UI, thrown together from some React components. It raises a few questions, and any feedback on it would be appreciated.
For the values used as part of the sharding calculations we have two choices: we can show the values as they are currently, or as they were the last time the sharding calculation was run to calculate a desired value. Since this runs every ten seconds, it doesn't seem to make a huge difference, but we obviously need to make a choice here.
Within the QueueManager there's currently no concept of a "state" for the shards and queue. I can't tell whether it is worth introducing one just to service this UI. The only state I think would add significant value is a "Resharding" status shown somewhere for the queue.
That looks nice. I feel the error message could be in red like in the targets page. Also, I would expect the name of the remote read as title and the URL as part of the config displayed.
Note: my expectation is that 99% of remote write users would have just one remote, right? That might help in the design to think about that use case first (when it comes to titles etc).
Nice work, thanks for putting this together!
First, to answer your questions:
- I think showing the last calculation run is nice, along with the time at which it was run, but I don't have strong opinions either way.
- I agree with roidelapluie as to state, we just need some indicator as to whether the last request was an error or not.
A couple of additional comments/thoughts:
- I would like the name of the remote write and the URL in the title, otherwise if it is an unnamed queue the hash by itself is not very useful. You might already have this if "Primary Remote" is the name, but I want to make sure :)
- The shard info could be moved down into the calculation area, I think a more interesting headline would be the delay and status.
Yeah, the title is currently Name and then the Endpoint. I observed the hash behaviour myself last week :)
I think I agree with your suggestion regarding the shard info since most of those figures aren't immediately eye-catching, delay + status makes a bit more sense here.
I've started with pulling the data through from the QueueManager to the API, but feel like I may need to chat with you on IRC sometime in the next few days to confirm some bits, since I'm not too sure about the house preferences.
I'm in the middle of preparing to transition to a new role with a new employer, so I may need to take a short break from this task.
https://github.com/prometheus/prometheus/pull/8218
Perhaps move discussion here.
Is this issue still open? I'd like to take it up.
Yes. Might be a good idea to take a look at the previously opened PR. https://github.com/prometheus/prometheus/pull/8218
Hi @roidelapluie if this issue is still open, I can take this up!
AFAIK no one has worked on this, so feel free to. The previous attempt at an implementation plus the above comments should help get you started.
Hey guys, I saw there is no PR for this issue and I would like to work on it.
Hi @kushalShukla-web, sounds good! One note: I'm currently working on a new Prometheus UI (still based on React, but using https://mantine.dev/ as a UI framework instead of Bootstrap). It might make more sense to already add this remote-write page there instead of to the old web UI.
Currently the new web UI lives only in the mantine-ui branch (https://github.com/prometheus/prometheus/tree/mantine-ui), and you can find the code for it at https://github.com/prometheus/prometheus/tree/mantine-ui/web/ui/mantine-ui. In that branch, both the old and the new UIs are built into the Prometheus binary, and you can enable the new UI by providing a feature flag:
./prometheus --enable-feature=new-ui
The new UI is still very early (for example, there is no graphing yet, only a table view), but the idea is to get it ready this year so it can become the new default UI. Then we can eventually retire the old one.
The overall goal with this new UI is to make both the code base and its dependencies more modern, and to make the UI look way less ugly and cluttered.
Hi @juliusv, I would love to work on this new Prometheus UI. Where can I find the issues for the new UI?
@kushalShukla-web Sorry for the late reply! We don't have specific issues for the new UI yet, but I would just consider any non-urgent UI feature request an issue for the new UI as of now :)
Proposal
Use case. Why is this important?
I am starting with remote_write, and from the Prometheus UI, all I can see is the URL.
I'd like new remote write users like me to easily find out in the UI:
It appears that we could also have the status of the queues: starting, resharding, ...
Let's brainstorm what would be the added value for this, and gather ideas.
Note: This is not a meta-issue; the end goal is to have a remote-write page in the UI.
cc @cstyan @csmarchbanks @bwplotka @juliusv
Note: This must be implemented in the new UI