Closed mrcnski closed 8 months ago
Thanks for the issue @mrcnski; yeah, I agree that that's less than ideal! Possibly the string cleaning bit can be done purely in the UI. Re the CPU and memory etc, I'd have to remember what substrate provides; maybe we need to tweak something upstream, or maybe we can just add eg the 64gb bucket here :)
To provide more insightful information, I've created a PR that handles the parsing in the backend entirely:
5.10.0-8-02354235 -> 5.10.0

to avoid grouping by the patch level and the generic string / commit of the kernel.

One downside of this is that the full kernel version is no longer visible to the end user; however, the upside is that we group kernels more uniformly by their version. In other words, we'll display more unique kernel versions by loosening the grouping criteria.
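As a rough sketch of the parsing described above (the function name is illustrative, not the actual PR code), the backend could keep only the major.minor.patch prefix and drop everything after the first dash:

```rust
/// Hypothetical helper: extract the base kernel version from a raw
/// release string, e.g. "5.10.0-8-02354235" -> "5.10.0".
fn base_kernel_version(raw: &str) -> &str {
    // `split` always yields at least one item, so `next()` is safe here.
    raw.split('-').next().unwrap_or(raw)
}

fn main() {
    assert_eq!(base_kernel_version("5.10.0-8-02354235"), "5.10.0");
    // A string with no suffix passes through unchanged.
    assert_eq!(base_kernel_version("6.1.0"), "6.1.0");
    println!("{}", base_kernel_version("5.10.0-8-02354235"));
}
```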
Alternatively, we could send the list of all kernel versions and CPU models to the frontend and let it handle all the parsing. However, we'd want a bounded limit on the lists, say 100 kernel versions, so as not to overload the frontend.
In this case, it might be possible to have ~20 entries with the same kernel version (5.10.0) but different patch versions or commits; and by that heuristic, we'll probably end up displaying only ~5 unique kernel versions.
For the CPU vendor, we could update the substrate binary to also provide this string, which would probably lead to more vendors being discovered. However, I believe we can build on the proposed solution and expand it in the future if we decide it's important enough.
For the memory, we could add more buckets within the [64, 128) GiB range if we decide that doesn't offer enough granularity. For more details check the Steam hardware survey.
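A minimal sketch of what such bucketing could look like with an explicit bucket starting at 64 GiB; the thresholds and labels here are illustrative assumptions, not the telemetry backend's actual ones:

```rust
/// Hypothetical memory bucketing: map a node's reported memory (in GiB)
/// to a display bucket, including a dedicated [64, 128) GiB bucket.
fn memory_bucket(gib: u64) -> &'static str {
    match gib {
        0..=7 => "Less than 8 GiB",
        8..=15 => "8-16 GiB",
        16..=31 => "16-32 GiB",
        32..=63 => "32-64 GiB",
        64..=127 => "64-128 GiB",
        _ => "At least 128 GiB",
    }
}

fn main() {
    // A 64 GiB validator now lands in its own bucket instead of
    // a catch-all "At least 64 GB" bucket.
    assert_eq!(memory_bucket(64), "64-128 GiB");
    println!("{}", memory_bucket(64));
}
```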
@mrcnski would love to hear your thoughts on this 🙏
Thanks @lexnv! I'm just wondering, what are the pros and cons of sending all the data to the frontend? It might be useful for the frontend to have all the stats. I don't see why 100 versions would overload it, that doesn't seem like a lot - how many nodes are sending the data?
That aside, the proposal sounds really good to me!
The downside of sending all the data to the frontend is that we'll probably have a very large payload, potentially up to the number of nodes in the network, presuming the nodes are targeting the substrate telemetry endpoint. That could lead to quite a large number of entries. However, to make an informed decision we'd probably have to inspect the telemetry core entries / deploy a new telemetry core in beta / rococo.
The upside would be that we have access to all the data directly in the frontend. That would let the frontend perform the parsing and optionally display extra information (possibly all the information) about kernel versions / CPU names etc.
Thanks for taking a look at this!
That makes sense! Just to make sure I understand the architecture correctly: the nodes send their data to the telemetry endpoint (the backend), which then sends the aggregated payload to the frontend?
That's right, yup :)
ISSUE
Overview
telemetry.polkadot.io is not always very helpful:
76% of validators are on kernel version "Other", which doesn't say much. :P
Proposal
Clean up the string (i.e. remove the -148 part) before displaying, but allow users to view the full data somehow.

Other buckets that could be improved:
60.63% | 1206 | Other
22.42% | 447 | Other
39.23% | 781 | At least 64 GB
(64 GB bucket is missing and I figure that's the most common one...)