microsoft / cpp_client_telemetry

1DS C++ SDK
Apache License 2.0

OfflineStorage_Room::GetAndReserveRecords crash in Android #1227

Open FreddyJD opened 5 months ago

FreddyJD commented 5 months ago

Describe your environment. Latest 1DS C++ SDK v3.7.62.1

Steps to reproduce. No repro, just a crash report from App Center for our Android app.

What is the expected behavior? N/A

What is the actual behavior? N/A

Additional context. This is the crash reported in App Center:

libmaesdk.so
Microsoft::Applications::Events::OfflineStorage_Room::GetAndReserveRecords(std::__ndk1::function<bool (Microsoft::Applications::Events::StorageRecord&&)> const&, unsigned int, Microsoft::Applications::Events::EventLatency, unsigned int)
libmaesdk.so
Microsoft::Applications::Events::StorageObserver::handleRetrieveEvents(std::__ndk1::shared_ptr<Microsoft::Applications::Events::EventsUploadContext> const&)
libmaesdk.so
Microsoft::Applications::Events::TransmissionPolicyManager::uploadAsync(Microsoft::Applications::Events::EventLatency)
libmaesdk.so
Microsoft::Applications::Events::PlatformAbstraction::WorkerThread::threadFunc(void*)
libmaesdk.so
void* std::__ndk1::__thread_proxy<std::__ndk1::tuple<std::__ndk1::unique_ptr<std::__ndk1::__thread_struct, std::__ndk1::default_delete<std::__ndk1::__thread_struct> >, void (*)(void*), void*> >(void*)

We are not sending events right now, just initializing:

    override fun initialize(context: Context) {
        loadLibraries()
        setupHttpClient(context)
        setupOfflineRoom(context)
        initializeLogger()
    }

    private fun loadLibraries() {
        System.loadLibrary(TelemetryConstants.MAE_SDK_LIB_NAME)
    }

    private fun setupHttpClient(context: Context) {
        HttpClient(context.applicationContext)
    }

    private fun setupOfflineRoom(context: Context) {
        OfflineRoom.connectContext(context.applicationContext)
    }
anod commented 5 months ago

Do you use the main thread to initialize Room?

FreddyJD commented 5 months ago

Hi @anod It seems so, yes. Is this issue causing crashes for the users? It only affects a small portion of our user base, and we've followed the setup in the repository.

After reviewing the documentation, it appears that Room, HTTP, and the library need to be initialized before we can initialize the loggers.
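The ordering described above (library, HTTP client, and Room first, loggers last) can be sketched in isolation. This is a minimal, self-contained sketch and not SDK code: `OrderedInitializer` is a hypothetical wrapper, and the step bodies are empty placeholders standing in for the calls shown earlier in this thread. It only shows one way to run the steps in a fixed order at most once; it makes no claim about the cause of the crash.

```kotlin
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical wrapper, not part of the 1DS SDK: runs the setup steps in a
// fixed order, at most once, even if initialize() is called more than once.
class OrderedInitializer(private val steps: List<Pair<String, () -> Unit>>) {
    private val started = AtomicBoolean(false)
    val completed = mutableListOf<String>()

    // Returns true only for the call that actually ran the steps.
    // A concurrent second caller returns false immediately; it does not
    // wait for the first caller to finish.
    fun initialize(): Boolean {
        if (!started.compareAndSet(false, true)) return false
        for ((name, step) in steps) {
            step()
            completed += name
        }
        return true
    }
}

fun main() {
    val init = OrderedInitializer(
        listOf(
            "loadLibraries" to { /* System.loadLibrary(...) */ },
            "setupHttpClient" to { /* HttpClient(appContext) */ },
            "setupOfflineRoom" to { /* OfflineRoom.connectContext(appContext) */ },
            "initializeLogger" to { /* create log managers and loggers */ },
        )
    )
    init.initialize()
    init.initialize() // second call is a no-op
    println(init.completed)
}
```

Because the guard flips before the steps run, a caller that loses the race should not assume that setup has already finished.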

anod commented 5 months ago

Yes, the initialization looks correct. Looking at the stack, the crash happens while uploading an event, when it tries to load records from the Room DB.

Is it possible to add line numbers and the error message to the stack trace? Do you know if the crash happens during a specific lifecycle stage, such as opening or closing the app?

FreddyJD commented 5 months ago

@anod App Center reports only the names of the functions where the crash occurs (for native crashes, at least). It appears to be random. Also, we are not sending events right now; we are just initializing. So the issue could be occurring after a background notification, while using the app, or after some time has passed (the reported session times are random).

A question to clarify: is this function called only during the creation of the Room database? It seems to be happening in an Observable that the library sets up.

FreddyJD commented 5 months ago

To add more context, we are creating 3 loggers currently.

    private fun getLogConfiguration(apiKey: String, teamName: String): ILogConfiguration {
        val config = LogManager.getLogConfigurationCopy().apply {
            set(LogConfigurationKey.CFG_STR_PRIMARY_TOKEN, apiKey)
            set(LogConfigurationKey.CFG_STR_FACTORY_NAME, teamName)
            set(LogConfigurationKey.CFG_STR_FACTORY_HOST, teamName)
        }
        return config
    }

    private fun initializeLogger() {
        if (loggerAuthApp == null) {
            val logConfigAuthApp = getLogConfiguration(TelemetryClientType.AUTHAPP.apiKey, "authapp")
            loggerAuthApp = LogManagerProvider.createLogManager(logConfigAuthApp)
                .getLogger(TelemetryClientType.AUTHAPP.apiKey, "authapp", "")
        }
        if (loggerPIM == null) {
            val logConfigPIM = getLogConfiguration(TelemetryClientType.PIM.apiKey, "pim")
            loggerPIM = LogManagerProvider.createLogManager(logConfigPIM)
                .getLogger(TelemetryClientType.PIM.apiKey, "pim", "")
        }
        if (loggerDID == null) {
            val logConfigDID = getLogConfiguration(TelemetryClientType.DID.apiKey, "did")
            loggerDID = LogManagerProvider.createLogManager(logConfigDID)
                .getLogger(TelemetryClientType.DID.apiKey, "did", "")
        }
    }
anod commented 4 months ago

The posted trace is trying to uploadAsync something. Looking at the source, the flow is as follows: scheduleUpload >> uploadAsync >> initiateUpload >> retrieveEvents >> handleRetrieveEvents >> GetAndReserveRecords.

Maybe some stats info is being collected. I am not familiar with it, but you can try disabling stats via configuration and see if it helps.

The code crashes when accessing the Room DB via GetAndReserveRecords. I would also suggest verifying that:

We use the following code to create multiple loggers:

            val config = LogManager.logConfigurationFactory().apply {
                set(LogConfigurationKey.CFG_STR_PRIMARY_TOKEN, token)
                set(LogConfigurationKey.CFG_STR_COLLECTOR_URL, collectorUrl)
                set(LogConfigurationKey.CFG_STR_FACTORY_NAME, name + UUID.randomUUID())
            }

            val logManager = LogManagerProvider.createLogManager(config)
            return logManager.getLogger(token, "", "")
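One visible difference from the configuration posted earlier in the thread is that the factory name here gets a random suffix, so each log manager is created under a distinct name. A minimal sketch of just that naming step, assuming nothing about the SDK itself (`uniqueFactoryName` is a hypothetical helper, not a 1DS API):

```kotlin
import java.util.UUID

// Hypothetical helper, not part of the 1DS SDK: builds a factory name that is
// unique per call by appending a random UUID to a readable prefix.
fun uniqueFactoryName(prefix: String): String = prefix + UUID.randomUUID()

fun main() {
    val names = listOf("authapp", "pim", "did").map(::uniqueFactoryName)
    // Each name keeps its prefix, and no two names collide.
    check(names.toSet().size == names.size)
    names.forEach(::println)
}
```

Whether unique factory names have any effect on this crash is not established in the thread; the sketch only isolates what the snippet above does differently.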