dynatrace-oss / OneAgent-SDK-Python-AutoInstrumentation

autodynatrace, a Python library that implements automatic instrumentation using the OneAgent SDK for Python

Auto-instrument Thread #56

Closed amarlot closed 2 years ago

amarlot commented 3 years ago

Hello,

We use a lot of threading in our application to call many external partner APIs in parallel.

Autodynatrace detects those calls, but it does not place them inside the main PurePath; they show up here instead: [screenshot]

I would like those calls to appear inside the main PurePath rather than in a separate service, "Requests executed in background thread".

I tried changing my application code the way you did in this blog post ( https://www.dynatrace.com/news/blog/optimizing-python-code-during-development/ ) and following this sample ( https://github.com/Dynatrace/OneAgent-SDK-for-Python/blob/fa4dd209b6a21407abca09a6fb8da1b85755ab0a/samples/basic-sdk-sample/basic_sdk_sample.py ), but it was not successful.

Here is how we create our threads:

threads_list = list()
for endpoint, crawler in crawls.items():
    t = Thread(target=lambda f, arg1: f.update({endpoint: append(crawler, email, mock_header)}), args=(futures, crawler))
    threads_list.append(t)
    t.start()

for t in threads_list:
    t.join()

Thanks. Alexandre

dlopes7 commented 2 years ago

Hello,

You can't easily instrument a threading.Thread directly, because it doesn't "own" the lambda function that you are running there. This is why we only instrument ThreadPoolExecutor: there, the function being run is aware of the ThreadPoolExecutor.

Instead, what you need to do is:

1 - Create the link before you create the threads
2 - Move your lambda function definition to a standalone function (optional)
3 - Pass the link object to the lambda function
4 - Call sdk.trace_in_process_link inside the lambda function to "link" main with this function call

import time
from threading import Thread
import autodynatrace
import oneagent

sdk = oneagent.get_sdk()

def do_task(endpoint, crawler, dynatrace_link):
    # Link the work done in this thread back to the caller that created the link
    with sdk.trace_in_process_link(dynatrace_link):
        # Your lambda code here
        time.sleep(2)
        print(f"Done crawling {endpoint}, {crawler}")

@autodynatrace.trace()
def main():
    crawls = {
        "https://www.google.com": "crawler-01",
        "https://www.dynatrace.com": "crawler-02",
    }
    threads_list = list()

    # Create the link in main(), before any thread is created
    link = sdk.create_in_process_link()
    for endpoint, crawler in crawls.items():
        t = Thread(target=do_task, args=(endpoint, crawler, link))
        threads_list.append(t)
        t.start()

    for t in threads_list:
        t.join()

if __name__ == "__main__":
    main()
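
Since the reply above says that ThreadPoolExecutor is the one threading construct autodynatrace instruments, another option may be to replace the hand-made Thread objects with an executor. The following is only a rough sketch of that idea, not something confirmed in this thread: `crawl` and `run_crawls` are hypothetical names, `crawl` is a stand-in for the `append(crawler, email, mock_header)` call from the question, and `import autodynatrace` is left out so the snippet runs with the standard library alone. Whether the calls then land in the main PurePath would still need to be checked in Dynatrace.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumption: in the real application, `import autodynatrace` comes first so
# that the executor gets instrumented. It is omitted here on purpose.


def crawl(crawler, email, mock_header):
    """Hypothetical stand-in for append(crawler, email, mock_header)."""
    return f"{crawler} crawled for {email}"


def run_crawls(crawls, email, mock_header):
    """Run one crawl per endpoint in a thread pool; return results by endpoint."""
    results = {}
    with ThreadPoolExecutor(max_workers=len(crawls) or 1) as executor:
        # submit() receives the arguments at call time, so every task keeps
        # its own crawler value instead of reading the loop variable later.
        pending = {
            endpoint: executor.submit(crawl, crawler, email, mock_header)
            for endpoint, crawler in crawls.items()
        }
        for endpoint, future in pending.items():
            # result() blocks until that task is done and re-raises its errors
            results[endpoint] = future.result()
    return results


if __name__ == "__main__":
    crawls = {
        "https://www.google.com": "crawler-01",
        "https://www.dynatrace.com": "crawler-02",
    }
    print(run_crawls(crawls, "user@example.com", {}))
```

Collecting the return values through `future.result()` also removes the need for the shared `futures` dict that the threads in the question update themselves.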