microsoft / cpp_client_telemetry

1DS C++ SDK
Apache License 2.0
87 stars 49 forks source link

oxc0000409 (stack buffer overflow) Microsoft::Applications::Events::time_ticks_t::time_ticks_t #325

Open Fitzpasd opened 4 years ago

Fitzpasd commented 4 years ago

We are hitting this error frequently. Our usage of the sdk hasn't changed since I opened #203, other than to disable the network detector as per our Teams convo.

maxgolov commented 4 years ago

@Fitzpasd - do you have a local repro of this issue? Searching WATSON for hits including time_ticks_t - I only see your process, not any other app or service using 1DS C++ SDK. No hits in Office nor Edge, which generate considerably larger (x100+) traffic volumes.

maxgolov commented 4 years ago

Memory Corruption

Stack seems to be corrupted. The issue that leads to stack corruption may not necessarily be in the SDK. time_ticks_t on stack does not give much clues at this point.

Given you turned off the Network Detector --- ILogManager::DispatchEventBroadcast(evt) , appears on stack, may be called if you specify event type or event property name that does not conform to regexp here: https://github.com/microsoft/cpp_client_telemetry/blob/780205d2ea0298e41e82d54a3d203366f051cdf4/lib/utils/Utils.cpp#L177

The checks are performed here: https://github.com/microsoft/cpp_client_telemetry/blob/1e3c61285d953975c4a64f22fd1d611a1e474dd9/lib/system/EventProperties.cpp#L261

And here: https://github.com/microsoft/cpp_client_telemetry/blob/1e3c61285d953975c4a64f22fd1d611a1e474dd9/lib/system/EventProperties.cpp#L290

This may happen because of memory corruption from elsewhere, i.e. you expect that you pass valid property name or type, but in fact the buffer passed to SDK is already corrupted.

Note the stack trace does not make much sense, so it might be a red herring entirely.

Configuration

Since your code is running in a system service, it's worthwhile checking that you're using UTC transport. I see where you perform MAE::LogManager::Initialize call, but I don't see where you're initializing it in UTC mode. To initialize UTC mode - means using UTC background telemetry service instead of WinInet running in proc, you have to add this BEFORE you initialize:

    auto configuration = LogManager::GetLogConfiguration();
    configuration[CFG_INT_SDK_MODE] = SdkModeTypes::SdkModeTypes_UTCCommonSchema;

Currently running in direct upload mode inside a system service context (is xboxappservices.exe a system service?) -- isn't supported. WinInet.dll (our default http stack) is not supported for use in a system service context. There are known issues with that - see #322. You have to use Win 10 UTC mode for system services. That layer is sampling by default all of your events at 1-2%. But there are some further options to adjust your sampling rates at system-level. Windows Health DL can help with adjusting those sampling rates.