census-instrumentation / opencensus-python

A stats collection and distributed tracing framework
Apache License 2.0

Segmentation fault after calling Py_Finalize() with Python versions > 3.6 #1173

Closed gaurang-makwana closed 1 year ago

gaurang-makwana commented 2 years ago

Hi,

I am using Ubuntu 18.04 LTS.

I am embedding Python in a C++ application to upload logs to Azure Application Insights. My code worked well with Python 3.6, but Python 3.6 is no longer supported by the Python core team, so I am trying to move to a higher version. With a higher version, the program hits a segmentation fault when Py_Initialize() and Py_Finalize() are called repeatedly. If Py_Finalize() is called only once, there is no crash, but the logs are not uploaded to the cloud. I want to keep the application running.

Please refer to the code shown below:

Steps to reproduce. Install Python and the App Insight dependencies:
a) sudo apt-get update
b) sudo apt install python3.6 (or use a higher version)
c) python3 -V (to check the python3 version)
d) sudo apt-get install python3-dev
e) sudo apt-get install libpython3.6-dev
f) sudo apt-get install python3-pip
g) sudo apt install rustc
h) sudo -H pip3 install setuptools_rust
i) sudo -H pip3 install opencensus-ext-azure applicationinsights
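After the install steps, a quick check run with the target interpreter can confirm which Python version is in use and whether the two packages are importable. The following is a minimal sketch using only the standard library. The helper name `is_installed` is hypothetical and not part of opencensus.

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if the import system can locate module_name."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # A missing parent package (e.g. "opencensus") raises
        # instead of returning None, so treat that as "not installed".
        return False


print("Python version:", sys.version.split()[0])
for name in ("opencensus.ext.azure", "applicationinsights"):
    print(name, "->", "found" if is_installed(name) else "missing")
```

Running this with the same `python3` that the C++ program links against helps rule out a mismatch between the interpreter that pip installed into and the one being embedded.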

Code Sample:

#include <Python.h>
#include <cstdio>
#include <iostream>
#include <string>

void CallAppInsightUploadFunction();

int main() {
    for (int i = 0; i <= 5; i++)
    {
        Py_Initialize();
        CallAppInsightUploadFunction();
        std::cout << "Loop count: " + std::to_string(i) << std::endl;
        Py_Finalize();
    }

    printf("\nGood Bye...\n");
    return 0;
}

void CallAppInsightUploadFunction() {
    PyRun_SimpleString("import sys");
    PyRun_SimpleString("if not hasattr(sys, 'argv'):  sys.argv  = ['']");
    PyRun_SimpleString("import logging");
    PyRun_SimpleString("from opencensus.ext.azure.log_exporter import AzureLogHandler");
    PyRun_SimpleString("logger = logging.getLogger(__name__)");
    PyRun_SimpleString("logger.addHandler(AzureLogHandler(connection_string='InstrumentationKey=<YOUR-INSTRUMENTATION-KEY>'))");
    PyRun_SimpleString("logger.setLevel(logging.INFO)");
    PyRun_SimpleString("logger.info('Testing AppInsight Uploads...')");
}

What is the expected behavior? Since the code works with Python 3.6, I believe it should work with higher versions as well.

What is the actual behavior? The code above works fine with Python 3.6, but it crashes with higher versions. The crash happens the second time the following line is called: PyRun_SimpleString("from opencensus.ext.azure.log_exporter import AzureLogHandler");

Additional context. Please help if anyone has faced a similar issue.

lzchen commented 1 year ago

"The code crashes when second times the below mentioned line is called(with higher versions): PyRun_SimpleString("from opencensus.ext.azure.log_exporter import AzureLogHandler");"

Are you calling CallAppInsightUploadFunction() twice and that is when it crashes at that line?

gaurang-makwana commented 1 year ago

Yes that's correct.

It crashes in the second iteration of the for loop at this line: PyRun_SimpleString("from opencensus.ext.azure.log_exporter import AzureLogHandler");

lzchen commented 1 year ago

@gaurang-makwana Can you paste the stack trace with the error message?

gaurang-makwana commented 1 year ago

Hi @lzchen,

Please find the stack trace logs as shown below:

~/AppInsightTest2/build/bin$ sudo ./AppInsightTest
Loop count: 0
context.c:55: warning: mpd_setminalloc: ignoring request to set MPD_MINALLOC a second time

Error: signal 11:
./AppInsightTest(_Z7handleri+0x2c)[0x557d37b320]
linux-vdso.so.1(__kernel_rt_sigreturn+0x0)[0x7f869986b0]
/usr/lib/python3.8/lib-dynload/_asyncio.cpython-38-aarch64-linux-gnu.so(PyInit__asyncio+0x398)[0x7f84da72a0]
/usr/lib/aarch64-linux-gnu/libpython3.8.so.1.0(_PyImport_LoadDynamicModuleWithSpec+0x19c)[0x7f8658ece4]
/usr/lib/aarch64-linux-gnu/libpython3.8.so.1.0(+0x1a2260)[0x7f8658f260]
/usr/lib/aarch64-linux-gnu/libpython3.8.so.1.0(+0x254c9c)[0x7f86641c9c]
/usr/lib/aarch64-linux-gnu/libpython3.8.so.1.0(PyVectorcall_Call+0x58)[0x7f866863f8]
/usr/lib/aarch64-linux-gnu/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x75d8)[0x7f86467b70]
/usr/lib/aarch64-linux-gnu/libpython3.8.so.1.0(_PyEval_EvalCodeWithName+0xa64)[0x7f865ac4cc]
/usr/lib/aarch64-linux-gnu/libpython3.8.so.1.0(_PyFunction_Vectorcall+0x80)[0x7f86685b60]
~/AppInsightTest2/build/bin$
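The native backtrace shows the crash inside PyInit__asyncio, but not which Python-level import chain led there. One option, offered here as an assumption rather than something used in this thread, is the standard-library faulthandler module, which dumps the Python traceback when a fatal signal such as SIGSEGV arrives. In the embedded program, statements like these could be run through PyRun_SimpleString right after Py_Initialize(). The log file location is a hypothetical choice.

```python
import faulthandler
import os
import tempfile

# Hypothetical log location; any writable path works.
crash_log_path = os.path.join(tempfile.gettempdir(), "python_crash.log")

# Keep the file object alive: faulthandler writes to its file descriptor
# when a fatal signal (SIGSEGV, SIGFPE, SIGABRT, SIGBUS, SIGILL) is received.
crash_log = open(crash_log_path, "w")
faulthandler.enable(file=crash_log, all_threads=True)

print("faulthandler enabled:", faulthandler.is_enabled())
```

The resulting file would complement the native trace by naming the Python frames that were active at the moment of the crash.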

gaurang-makwana commented 1 year ago

Hi @lzchen,

I found a solution for the issue I was facing.

The issue, as I understand it: I wanted to upload data to the cloud on every iteration, so I was calling Py_Initialize() and Py_FinalizeEx() each time.

But with Python versions higher than 3.6, the memory for the external library was not being freed correctly. That is why, on the second iteration of the for loop, importing the library again caused a segmentation fault.

Workaround that worked for me: rather than calling Py_Initialize() and Py_FinalizeEx() multiple times in a running application, I should call: PyRun_SimpleString("handler.flush()");

It pushes the data to the cloud on every iteration of the for loop. Now I can call Py_Initialize() and Py_FinalizeEx() only once in my code, because the data is pushed to the cloud explicitly.

Note: It takes some time for your data to show up on the portal.
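The flush-per-iteration idea can be tried without an Azure account. The sketch below is not the opencensus implementation: it swaps in the standard-library logging.handlers.MemoryHandler as a buffering handler, plus a hypothetical ListHandler class standing in for the remote endpoint, on the assumption that AzureLogHandler likewise holds records until they are exported. It shows that records sit in the buffer until flush() is called, which is why one explicit flush per loop iteration is enough.

```python
import logging
import logging.handlers


class ListHandler(logging.Handler):
    """Hypothetical stand-in for the remote endpoint: stores messages in a list."""

    def __init__(self):
        super().__init__()
        self.received = []

    def emit(self, record):
        self.received.append(record.getMessage())


sink = ListHandler()

# Buffer up to 100 records. flushLevel is set above CRITICAL so that
# nothing is forwarded automatically; only an explicit flush() delivers.
handler = logging.handlers.MemoryHandler(
    capacity=100, flushLevel=logging.CRITICAL + 1, target=sink
)

logger = logging.getLogger("flush_demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the demo output away from the root logger

delivered = []  # (count before flush, count after flush) per iteration
for i in range(3):
    logger.info("Testing AppInsight Uploads... %d", i)
    before = len(sink.received)  # the new record is still buffered
    handler.flush()              # same call as in the workaround
    delivered.append((before, len(sink.received)))

print(delivered)  # [(0, 1), (1, 2), (2, 3)]
```

Each iteration delivers exactly one new record at the moment flush() runs, with the interpreter initialized only once.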

Output log of the modified code:

~/AppInsightTest/build/bin$ sudo ./AppInsightTest
Loop count: 0
Device is going to sleep for 120 seconds...
Device wake-up from sleep...
Please check the logs on web portal
Loop count: 1
Device is going to sleep for 120 seconds...
Device wake-up from sleep...
Please check the logs on web portal
Loop count: 2
Device is going to sleep for 120 seconds...
Device wake-up from sleep...
Please check the logs on web portal
Loop count: 3
Device is going to sleep for 120 seconds...
Device wake-up from sleep...
Please check the logs on web portal
Loop count: 4
Device is going to sleep for 120 seconds...
Device wake-up from sleep...
Please check the logs on web portal
Loop count: 5
Device is going to sleep for 120 seconds...
Device wake-up from sleep...
Please check the logs on web portal

Good Bye...
~/AppInsightTest/build/bin$

Logs on web portal: [image]

gaurang-makwana commented 1 year ago

My modified code (which works) looks like this:

#include <Python.h>
#include <unistd.h>
#include <cstdio>
#include <iostream>
#include <string>

void CallAppInsightUploadFunction();

int main() {
    // Initialize the interpreter once for the lifetime of the application
    Py_Initialize();

    PyRun_SimpleString("import sys");
    PyRun_SimpleString("if not hasattr(sys, 'argv'):  sys.argv  = ['']");
    PyRun_SimpleString("import logging");
    PyRun_SimpleString("from opencensus.ext.azure.log_exporter import AzureLogHandler");
    PyRun_SimpleString("logger = logging.getLogger(__name__)");
    PyRun_SimpleString("handler = AzureLogHandler(connection_string='InstrumentationKey=<INSTRUMENTATION-KEY>')");
    PyRun_SimpleString("logger.addHandler(handler)");
    PyRun_SimpleString("logger.setLevel(logging.INFO)");

    for (int i = 0; i <= 5; i++)
    {
        CallAppInsightUploadFunction();
        std::cout << "Loop count: " + std::to_string(i) << std::endl;
        // The sleep below leaves time to verify on the portal that logs are uploaded while the application is running
        std::cout << "Device is going to sleep for 120 seconds..." << std::endl;
        sleep(120);
        std::cout << "Device wake-up from sleep..." << std::endl;
        std::cout << "Please check the logs on web portal" << std::endl;
    }
    Py_FinalizeEx();

    printf("\nGood Bye...\n");
    return 0;
}

void CallAppInsightUploadFunction() {
    PyRun_SimpleString("logger.info('Testing AppInsight Uploads...')");
    // Push the buffered records to the cloud without finalizing the interpreter
    PyRun_SimpleString("handler.flush()");
}

lzchen commented 1 year ago

@gaurang-makwana Nice. I'm glad you got it working. I'll be closing this issue.