hazelcast / hazelcast

Metric descriptor limit of 255 is too small #17901

Open lprimak opened 3 years ago

lprimak commented 3 years ago

Describe the bug

While testing, I ran across the Hazelcast log below. My object name, while long, is only 134 characters, well under the 255-character limit.

Nov 28, 2020 7:49:01 PM com.hazelcast.internal.metrics.managementcenter.ManagementCenterPublisher
WARNING: [192.168.99.1]:5701 [dev] [4.1] Too long value in the metric descriptor found, maximum is 255: Payara/ejb/singleton/66493e0a-ae18-42cc-8b41-e8b2633c09f0_/66493e0a-ae18-42cc-8b41-e8b2633c09f0/ClusteredSingletonInterceptedEJB/count@Payara/ejb/singleton/66493e0a-ae18-42cc-8b41-e8b2633c09f0_/66493e0a-ae18-42cc-8b41-e8b2633c09f0/ClusteredSingletonInterceptedEJB/count

What Hazelcast seems to be doing is concatenating the names of CP subsystem variables with @ in between, i.e. MyAtomicLong becomes MyAtomicLong@MyAtomicLong for metrics purposes. As a result, long names more than double in size, so even if the original name isn't close to 255 characters, the resulting 'internal' name can exceed that limit.

Either the limit needs to increase or the names shouldn't be doubled.
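To make the arithmetic concrete, here is a minimal sketch that reproduces the doubling seen in the log. The class and the `doubled` helper are hypothetical and only mirror the `name@name` shape from the warning; they are not Hazelcast code.

```java
public class DoubledMetricName {
    // Limit quoted in the warning message ("maximum is 255").
    static final int MAX_DESCRIPTOR_VALUE_LENGTH = 255;

    // Hypothetical helper: mimics the "name@name" value shown in the log.
    static String doubled(String name) {
        return name + "@" + name;
    }

    public static void main(String[] args) {
        String name = "Payara/ejb/singleton/66493e0a-ae18-42cc-8b41-e8b2633c09f0_"
                + "/66493e0a-ae18-42cc-8b41-e8b2633c09f0/ClusteredSingletonInterceptedEJB/count";
        String id = doubled(name);
        // The doubled value is 2 * name.length() + 1 characters long.
        System.out.println("name length: " + name.length());
        System.out.println("doubled length: " + id.length());
        System.out.println("exceeds limit: " + (id.length() > MAX_DESCRIPTOR_VALUE_LENGTH));
    }
}
```

With the 134-character name from this report, the doubled value comes to 269 characters, which is over the limit even though the name itself is not.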

Expected behavior

No warning should appear.

To Reproduce

Additional context

[INFO] Running com.hazelcast.cp.internal.datastructures.atomiclong.MetricTooLongTest
*** Name length: 134, name = Payara/ejb/singleton/66493e0a-ae18-42cc-8b41-e8b2633c09f0_/66493e0a-ae18-42cc-8b41-e8b2633c09f0/ClusteredSingletonInterceptedEJB/count
Nov 28, 2020 7:49:01 PM com.hazelcast.internal.metrics.managementcenter.ManagementCenterPublisher
WARNING: [192.168.99.1]:5701 [dev] [4.1] Too long value in the metric descriptor found, maximum is 255: Payara/ejb/singleton/66493e0a-ae18-42cc-8b41-e8b2633c09f0_/66493e0a-ae18-42cc-8b41-e8b2633c09f0/ClusteredSingletonInterceptedEJB/count@Payara/ejb/singleton/66493e0a-ae18-42cc-8b41-e8b2633c09f0_/66493e0a-ae18-42cc-8b41-e8b2633c09f0/ClusteredSingletonInterceptedEJB/count
mmedenjak commented 3 years ago

Thank you for the report. @blazember @emre-aydin, can you take a look?

blazember commented 3 years ago

Related: https://github.com/hazelcast/hazelcast/pull/16869

The issue report is right. We'll need to move from the single byte used for encoding the word size to 2 or 4 bytes, or use a smarter variable-length encoding (I'd go that way, we'll see). This means a binary compatibility change (client->member, member->MC). I'll take a look at this when I'm back in January.

Thanks for reporting this @lprimak :+1:
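For readers unfamiliar with the trade-off, the sketch below shows why a one-byte length prefix caps a value at 255 and how a variable-length prefix avoids that cap. It uses a generic LEB128-style varint purely as an illustration; the class and method names are hypothetical, and this is not Hazelcast's actual wire format or the encoding that was eventually chosen.

```java
import java.io.ByteArrayOutputStream;

public class LengthPrefixSketch {
    // A single unsigned byte can represent lengths 0..255 and nothing larger.
    static final int MAX_SINGLE_BYTE_LENGTH = 0xFF;

    // Encodes a non-negative length using 7 payload bits per byte;
    // the high bit of each byte signals that another byte follows.
    static byte[] encodeVarInt(int value) {
        if (value < 0) {
            throw new IllegalArgumentException("length must be non-negative");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
        return out.toByteArray();
    }

    // Decodes a length produced by encodeVarInt.
    static int decodeVarInt(byte[] bytes) {
        int result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
            shift += 7;
        }
        throw new IllegalArgumentException("truncated varint");
    }

    public static void main(String[] args) {
        int doubledLength = 269; // length of the name@name value in this report
        System.out.println("fits in one byte: " + (doubledLength <= MAX_SINGLE_BYTE_LENGTH));
        System.out.println("varint bytes needed: " + encodeVarInt(doubledLength).length);
    }
}
```

In this scheme, lengths up to 127 still cost one byte and longer values cost an extra byte or more. Any change to the prefix changes the binary format, which is the compatibility concern raised above.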

mmedenjak commented 3 years ago

@blazember can this issue still make it into 4.2? How much effort do you estimate?

lprimak commented 1 year ago

Please bump

lprimak commented 1 year ago

@vbekiaris @JamesHazelcast for your reference :)

qzagarese commented 5 months ago

Bumping this up. Any suggestions for avoiding or working around this? Thanks in advance.

JamesHazelcast commented 5 months ago

Although not a fix for the underlying issue, your particular issue involving the CP subsystem can be somewhat alleviated:

As you've pointed out @lprimak, the metric ID is being doubled in length in a name@name manner for your CP object here. This actually only takes place when the CP subsystem is operating in unsafe mode with no explicit group ID provided, which falls back to using partition-based Raft groups; this in turn uses the proxy name (in your case Payara/ejb/singleton/66493e0a-ae18-42cc-8b41-e8b2633c09f0_/66493e0a-ae18-42cc-8b41-e8b2633c09f0/ClusteredSingletonInterceptedEJB/count) as the group name, resulting in group@id being id@id in this scenario. The code for this can be seen here: https://github.com/hazelcast/hazelcast/blob/master/hazelcast/src/main/java/com/hazelcast/cp/internal/RaftService.java#L1062-L1091

This can be alleviated by explicitly providing a group ID when you create the proxy, removing the need for the Raft service to generate a group name (operating in safe mode with the CP subsystem fully engaged also avoids this, since the default group ID is then used). Creating your proxy as follows results in an id value that metrics can handle:

String group = "my/group/id";
String name = "Payara/ejb/singleton/66493e0a-ae18-42cc-8b41-e8b2633c09f0_/66493e0a-ae18-42cc-8b41-e8b2633c09f0/ClusteredSingletonInterceptedEJB/count";
Hazelcast.newHazelcastInstance(new Config().setProperty("hazelcast.logging.type", "jdk")).getCPSubsystem().getAtomicLong(group + "@" + name).set(5);

If your issue also relates to this quirk of the CP subsystem, this may be able to help you @qzagarese.
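As a rough check of why the workaround helps, the sketch below compares the combined length with and without a short explicit group. The `combined` helper is hypothetical and only joins two parts with `@`; it assumes the metric value is built from the group and the object name, as described above, and does not call any Hazelcast API.

```java
public class ExplicitGroupLength {
    // Limit quoted in the warning message ("maximum is 255").
    static final int MAX_DESCRIPTOR_VALUE_LENGTH = 255;

    // Hypothetical helper: joins the two parts with '@'. The resulting length
    // is the same regardless of which part comes first.
    static String combined(String first, String second) {
        return first + "@" + second;
    }

    static boolean fits(String value) {
        return value.length() <= MAX_DESCRIPTOR_VALUE_LENGTH;
    }

    public static void main(String[] args) {
        String group = "my/group/id";
        String name = "Payara/ejb/singleton/66493e0a-ae18-42cc-8b41-e8b2633c09f0_"
                + "/66493e0a-ae18-42cc-8b41-e8b2633c09f0/ClusteredSingletonInterceptedEJB/count";

        String withFallback = combined(name, name);       // proxy name reused as group name
        String withExplicitGroup = combined(group, name); // short explicit group

        System.out.println("fallback: " + withFallback.length() + ", fits: " + fits(withFallback));
        System.out.println("explicit: " + withExplicitGroup.length() + ", fits: " + fits(withExplicitGroup));
    }
}
```

The explicit group only removes the doubling. A name that is itself close to 255 characters would still hit the underlying limit.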

qzagarese commented 5 months ago

Thank you @JamesHazelcast, much appreciated!