open-telemetry / opentelemetry-python

OpenTelemetry Python API and SDK
https://opentelemetry.io
Apache License 2.0
1.8k stars 625 forks source link

`opentelemetry` + `streamlit` = `Overriding of current TracerProvider is not allowed messages` and `Attempting to instrument while already instrumented` #3743

Open gorkaerana opened 8 months ago

gorkaerana commented 8 months ago

Hello,

We have a Streamlit app in production and we are hoping to add observability to it by leveraging opentelemetry.instrumentation.logging.LoggingInstrumentor to integrate the current logging and OpenTelemetry to publish logs to Azure Monitor. I believe Streamlit executing the same piece of code more than once at runtime results in Overriding of current TracerProvider is not allowed messages, which in turn only publishes Attempting to instrument while already instrumented to Azure Monitor. I have tried to circumvent the issue the following few ways:

Please do let me know if this is not the place to ask such questions. I tried asking in the Streamlit forum but I have received no answers so far. Any help would be very much appreciated 😊

Thank you in advance, Gorka.

import logging
import os
from pathlib import Path

import opentelemetry
import streamlit as st
import streamlit_authenticator as stauth
import yaml
from azure.monitor.opentelemetry import configure_azure_monitor
from dotenv import load_dotenv
from opentelemetry import trace
from opentelemetry.instrumentation.logging import LoggingInstrumentor
from streamlit_calendar import calendar
from yaml.loader import SafeLoader

def user_config() -> tuple[str, bool, str, stauth.Authenticate]:
    here = Path(__file__).parent
    users_yaml_path = here / "src" / "frontend" / "users.yaml"

    with open(users_yaml_path) as file:
        config = yaml.load(file, Loader=SafeLoader)

    authenticator = stauth.Authenticate(
        config["credentials"],
        config["cookie"]["name"],
        config["cookie"]["key"],
        config["cookie"]["expiry_days"],
        config["preauthorized"],
    )

    name, authentication_status, username = authenticator.login("Login", "main")

    return name, authentication_status, username, authenticator

def logged_in(authenticator: stauth.Authenticate, username: str) -> None:
    with st.sidebar:
        st.write(f'Welcome *{st.session_state["name"]}*')
        vaults_markdown_list = "\n".join(
            f"- {vault_name}"
            for vault_name in authenticator.credentials["usernames"][username][
                "vaults"
            ].split(";")
        )
        st.write(f"You have access to these vaults:\n{vaults_markdown_list}")
        authenticator.logout("Logout", "main")

def not_logged_in() -> None:
    if not st.session_state["authentication_status"]:
        st.error("Username/password is incorrect")
        st.markdown(
            """
        <style>
            section[data-testid="stSidebar"][aria-expanded="true"]{
                display: none;
            }
        </style>
        """,
            unsafe_allow_html=True,
        )

    elif st.session_state["authentication_status"] is None:
        st.warning("Please enter your username and password")
        st.markdown(
            """
        <style>
            section[data-testid="stSidebar"][aria-expanded="true"]{
                display: none;
            }
        </style>
        """,
            unsafe_allow_html=True,
        )

# @st.cache_resource
def set_logger_and_tracer() -> tuple[logging.Logger, opentelemetry.trace.Tracer]:
    configure_azure_monitor(
        connection_string=os.environ["web_app_insights_connection_string"]
    )
    tracer = trace.get_tracer(__name__)
    LoggingInstrumentor(set_logging_format=True, log_level=logging.INFO).instrument()
    logger = logging.getLogger(__name__)
    return logger, tracer

load_dotenv()
name, authentication_status, username, authenticator = user_config()
if st.session_state["authentication_status"]:
    logger, tracer = set_logger_and_tracer()
    with tracer.start_as_current_span("span_name"):
        logger.info("Some logging")
        logged_in(authenticator, username)
        calendar_options = {
            "headerToolbar": {
                "left": "today prev,next",
                "center": "title",
                "right": "timeGridDay,timeGridWeek",
            },
            "slotMinTime": "00:00:00",
            "slotMaxTime": "24:00:00",
            "initialView": "timeGridWeek",
            "selectable": True,
            "selectMirror": True,
            "editable": True,
        }
        custom_css = """
            .fc-event-past {
                opacity: 0.8;
            }
            .fc-event-time {
                font-style: italic;
            }
            .fc-event-title {
                font-weight: 700;
            }
            .fc-toolbar-title {
                font-size: 2rem;
            }
        """

        new_event = calendar(
            events=[],
            options=calendar_options,
            custom_css=custom_css,
            callbacks=["dateClick", "eventChange", "eventClick", "eventsSet"],
        )
else:
    not_logged_in()
naisanzaa commented 7 months ago

+1

miguelaca commented 7 months ago

+1

pmcollins commented 7 months ago

Hi -- I wasn't familiar with Streamlit but it looks like it can re-run scripts from top to bottom in response to user interactions, reusing a single Python process. So you have to be careful with operations that are designed to happen only once per Python process, such as setting the global tracer provider or setting up an instrumentor.

lzchen commented 7 months ago

@gorkaerana

As @pmcollins has mentioned above, certain components of OpenTelemtry and in addition components of Azure monitor built on top of OpenTelemetry are not meant to be instantiated more than once. Besides ...Provider classes, we have no explicitly safeguards for this. I am also not too familiar with Streamlit but if it's anything similar to databricks notebooks in which multiple runs of the same code are done using the same process and memory is saved between runs, this will not be the intended use case for our SDKs. Even if you do find a way to cache the ...Provider classes, you will experience additional issues such as duplicate telemetry and every growing thread creation.

If you really want to get this scenario working, you must ensure that EACH component that is meant to be a singleton is used be only instantiated once. You have attempted to do this with tracer and logger, which actually can created more than once (usually tracers and loggers represent telemetry captured within a Python file since they are usually namespaced by __name__. The components you actually want to cache are the ...Provider classes from the OpenTelemetry side, as well as the exporter classes within Azure monitor. Now, you are using configure_azure_monitor(), which is an API of the Azure monitor distro, which is intended as a one-stop-shop solution that will instatiate everything for you using one api call. Now, while a simple api, this is probably not the component you want to be using for your use case, as we do not expose the underlying components that are being instantiated under the hood. You would be better off using the azure-monitor-opentelemetry-exporters, which the distro is built on top of. The components you need to cache are AzureMonitor<Trace/Log/Metrics>Exporter. Please keep in mind that this is a uncommon scenario and you might have to do some hacky stuff to get it working, including taking dependencies on beta libraries.

claorden commented 6 months ago

Hi @gorkaerana ,

I have set up the same environment you want by simply following the instructions here: https://learn.microsoft.com/en-us/azure/azure-monitor/app/opentelemetry-enable?tabs=python

Then for Streamlit, use the cache_resource decorator: https://docs.streamlit.io/develop/api-reference/caching-and-state/st.cache_resource.

Example:

import streamlit as st
import logging
from azure.monitor.opentelemetry import configure_azure_monitor

@st.cache_resource
def _configure_logging():
    configure_azure_monitor()

def main() -> None:
    _configure_logging()
    logging.error(f"This is an error.")

Hope this helps.

gorkaerana commented 6 months ago

Hi @claorden,

Your minimal example does indeed work, thank you very much for your help 😃 I'll retrofit it to our code base one of these days and report back with the results.

Best, Gorka.