open-telemetry / opentelemetry-dotnet-contrib

This repository contains a set of components extending the functionality of the OpenTelemetry .NET SDK. Instrumentation libraries, exporters, and other components can find their home here.
https://opentelemetry.io
Apache License 2.0

I want to log bigger strings than 16 KiB to Geneva #873

Open sandersaares opened 1 year ago

sandersaares commented 1 year ago

I want to log bigger string fields than 16 KiB. However, the MsgPack serializer used in the Geneva exporter appears to limit all strings to 16 KiB: https://github.com/open-telemetry/opentelemetry-dotnet-contrib/blob/main/src/OpenTelemetry.Exporter.Geneva/MsgPackExporter/MessagePackSerializer.cs#L66

The MsgPack spec says that 4 GiB is the limit, so I do not understand the reason for such a small limit. I would like to log as much data as I choose (or as much as the Geneva backend is capable of accepting, for which the hearsay is that it is considerably greater than 16 KiB).
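For reference, the 4 GiB figure comes from the MessagePack `str 32` format, whose length prefix is a 32-bit unsigned integer (so the exact maximum is 2^32 - 1 bytes). The following is a minimal sketch of how the spec picks a string header by byte length; the function name `msgpack_str_header` is made up for illustration and is not part of the Geneva exporter.

```python
import struct


def msgpack_str_header(byte_len: int) -> bytes:
    """Return the MessagePack header bytes for a UTF-8 string of byte_len bytes.

    Hypothetical helper for illustration; the format codes follow the
    MessagePack specification (fixstr, str 8, str 16, str 32).
    """
    if byte_len < 0:
        raise ValueError("length must be non-negative")
    if byte_len <= 31:
        # fixstr: 101XXXXX, length stored in the low 5 bits
        return bytes([0xA0 | byte_len])
    if byte_len <= 0xFF:
        # str 8: 0xd9 followed by a 1-byte length
        return b"\xd9" + struct.pack(">B", byte_len)
    if byte_len <= 0xFFFF:
        # str 16: 0xda followed by a 2-byte big-endian length
        return b"\xda" + struct.pack(">H", byte_len)
    if byte_len <= 0xFFFFFFFF:
        # str 32: 0xdb followed by a 4-byte big-endian length
        return b"\xdb" + struct.pack(">I", byte_len)
    raise ValueError("MessagePack strings cannot exceed 2**32 - 1 bytes")
```

A 16383-byte string still fits in `str 16`, so the 16 KiB cap is not something the wire format itself requires.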

utpilla commented 1 year ago

@sandersaares Could you please talk more about this logging requirement of yours? GenevaExporter allows up to 16383 UTF-16 characters for a string field which is pretty generous for most use cases. We would like to better understand your use-case and the reason behind logging such long strings.

sandersaares commented 1 year ago

The idea is to log logical events instead of treating the logging system as a file that we might dump lines into. By modeling our logs as events, we have found that it becomes significantly easier to analyze, especially when trying to get an aggregate picture of many such events.

One "event" in the context of our app might include, for example, connecting to 1000 different URLs to identify which ones are reachable and which are not - we would like to have one field in our event with a list of successful URLs, another with a list of failed URLs. However, with 16 KB, not many of those URLs will fit in there before we run out of room.

In practice, of course, we would have both - some log tables for individual lines, other log tables for logical events. Today, the field size limits rather constrain what we can do with the latter.
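To put rough numbers on the URL-list scenario, here is a small sketch that counts how many URLs fit into one string field under the 16383-character limit mentioned earlier in the thread. The separator, the example URLs, and the function name `urls_that_fit` are assumptions made for illustration only.

```python
def urls_that_fit(urls, limit=16383, sep=";"):
    """Count how many leading URLs fit in one field when joined with sep.

    Hypothetical helper: `limit` is the per-field character limit quoted in
    this thread, and `sep` is an assumed delimiter.
    """
    used = 0
    count = 0
    for url in urls:
        # Every URL after the first also costs one separator.
        extra = len(url) + (len(sep) if count else 0)
        if used + extra > limit:
            break
        used += extra
        count += 1
    return count


# 1000 made-up URLs of 35 characters each.
example_urls = [f"https://host{i:04d}.example.com/health" for i in range(1000)]
print(urls_that_fit(example_urls))  # prints 455
```

With URLs of this (fairly short) length, fewer than half of the 1000 entries fit in a single field.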

sandersaares commented 1 year ago

@utpilla what do you think about my scenario?

cijothomas commented 1 year ago

The 16 KiB limit on string values is more artificial, so perhaps it could be fixed.

However, there is still the limit of 65360 bytes per event in total, which is not something easy to fix. (It's originally due to the ETW event size limitations.)

Would removing the restriction on the individual string field cause your total event to exceed the size limit (or increase the probability of that), leading to full event loss?

sandersaares commented 1 year ago

Not sure about the probability of exceeding the 65360 bytes. It is already an uncomfortably low number as we use multicolumn events. Most columns are short, granted, but I think the 65360 limit is already problematic for us on robustness grounds alone.

Is the 65360 limit only because of ETW? Because my code is all running on Linux, so no ETW is even involved - if this limit only applies to ETW and we can get rid of this limit on Linux, it would be a massive step forward in usability for our scenarios (and decrease the desire to switch our logging to e.g. ADX that has no such limits).

When I asked in Geneva channel some months ago, I was told Geneva itself can accept much bigger events (megabytes).

chrishdmicrosoftcom commented 1 year ago

We are also sometimes hit by this limit.

Our scenario is that we emit full SQL queries to telemetry, which we then later use to look up in SQL DMVs in order to troubleshoot performance problems. Most SQL queries are much shorter than 16K, but some of them are larger, especially many auto-generated queries.

In our case, I would not expect the 64K total limit to become a problem.

cijothomas commented 1 year ago

"Is the 65360 limit only because of ETW? Because my code is all running on Linux, so no ETW is even involved - if this limit only applies to ETW and we can get rid of this limit on Linux,"

Linux user_events (which will eventually be used by GenevaExporter on Linux) has the same limit as ETW.

neilgompf commented 1 year ago

We are also running into this limit.

We have a string field that generally maxes out at around 24 KB, while the log's total size across all fields is ~34 KB. In our case it seems like this limit is unnecessarily truncating 8 KB from this field, since there is potentially ~30 KB of unused space in the log.

We also do not see the 64 KB limit being an issue in our case.
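The arithmetic behind these figures can be checked with a short sketch. It assumes 1 KB = 1024 bytes, one byte per character (ASCII content), and ignores any serialization overhead; the constant and function names are made up for illustration.

```python
EVENT_SIZE_LIMIT_BYTES = 65360        # per-event limit quoted in this thread
FIELD_LIMIT_CHARS = (1 << 14) - 1     # 16383, per-field limit quoted in this thread


def event_headroom(total_event_bytes: int) -> int:
    """Bytes left under the per-event limit (negative if the event is too big)."""
    return EVENT_SIZE_LIMIT_BYTES - total_event_bytes


def chars_over_field_limit(field_chars: int) -> int:
    """Characters of a field that exceed the per-field limit."""
    return max(0, field_chars - FIELD_LIMIT_CHARS)


print(event_headroom(34 * 1024))          # prints 30544 (about 30 KB unused)
print(chars_over_field_limit(24 * 1024))  # prints 8193 (about 8 KB cut off)
```

Under these assumptions, the untruncated field would still leave the event well below the per-event limit.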

cijothomas commented 8 months ago

The 64 KB overall limit is something that will not be lifted, as it comes from the underlying OS limitation. The 16 KB per-field limit seems artificial (there to protect the overall size from hitting the limit) and could be relaxed. If it is relaxed, we need to closely check the exception stack part, as it could then potentially cause the whole event to be dropped, so that needs to be revisited.

Please continue to upvote and explain your scenario here, so this issue can be prioritized.

geraldvindas commented 4 months ago

Hi team,

We're encountering an issue with a limit in our logging system. Currently, we're trying to log a value of 18KB, but due to a limitation, the value gets truncated to 16KB minus 4 characters. I'm exploring possible workarounds, but it seems like the best solution would be to officially address this in the library.

I'm curious about the effort involved in increasing this limit. From what I've seen in the code, it appears to be a matter of adjusting one constant value:

private const int STRING_SIZE_LIMIT_CHAR_COUNT = (1 << 14) - 1; // 16 * 1024 - 1 = 16383

However, I understand that making this change might have other impacts throughout the system.
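The numbers in this comment line up as follows. This is only a sketch of the arithmetic, not the exporter's actual code: it assumes that an over-long value is cut so that a three-character "..." suffix still fits within the limit, which would explain the observed "16 KB minus 4 characters" of retained content. Python's `len` counts code points rather than UTF-16 code units, so the sketch is exact only for text without surrogate pairs.

```python
STRING_SIZE_LIMIT_CHAR_COUNT = (1 << 14) - 1  # 16 * 1024 - 1 = 16383

# Assumption for illustration: truncated values end with this marker.
TRUNCATION_SUFFIX = "..."


def truncate_field(value: str,
                   limit: int = STRING_SIZE_LIMIT_CHAR_COUNT,
                   suffix: str = TRUNCATION_SUFFIX) -> str:
    """Return value unchanged if it fits, else cut it and append suffix.

    Hypothetical helper; the result never exceeds `limit` characters.
    """
    if len(value) <= limit:
        return value
    return value[: limit - len(suffix)] + suffix


original = "a" * (18 * 1024)           # an 18 KB ASCII value
truncated = truncate_field(original)
print(len(truncated))                   # prints 16383
print(len(truncated) - len(TRUNCATION_SUFFIX))  # prints 16380, i.e. 16 * 1024 - 4
```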

If the fix is straightforward, I'd be happy to volunteer to assist with implementing it.

Looking forward to your thoughts on this.

Thanks,

cijothomas commented 4 months ago

@geraldvindas We are not planning to remove the limit yet, so not ready to accept a PR for that. Still monitoring the actual demand before making the change.

GenevaExporter, by design, will be limited by its underlying platform limitations. You may also want to look at using OTLP Exporters, which do not have these restrictions.