OpenLineage / OpenLineage

An Open Standard for lineage metadata collection
http://openlineage.io
Apache License 2.0

[FEATURE] - Support custom headers for Python http client #2907

Open kr-igor opened 1 month ago

kr-igor commented 1 month ago

Motivation

OpenLineage supports the Authorization HTTP header, but some destinations may require a custom header. Specifically, Datadog requires DD_API_KEY to be set. It's not currently possible to set custom headers for the Python client.

Description

The proposal is to support an OPENLINEAGE_HTTP_HEADERS: key=value,key1=value1 environment variable to set custom headers (comma-separated pairs, matching the patch and its tests). The patch is below.

Index: client/python/openlineage/client/transport/http.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/client/python/openlineage/client/transport/http.py b/client/python/openlineage/client/transport/http.py
--- a/client/python/openlineage/client/transport/http.py    (revision 65d15773f88151a99858392a17706b1661c76e72)
+++ b/client/python/openlineage/client/transport/http.py    (revision 3f10f0fc874ae6d9bc9c99aa68ec4f6a973d35d2)
@@ -87,6 +87,8 @@
     session: Session | None = attr.ib(default=None)
     # not set by TransportFactory
     adapter: HTTPAdapter | None = attr.ib(default=None)
+    # extra http headers to use
+    headers: dict[str, str] = attr.ib(factory=dict)

     @classmethod
     def from_dict(cls, params: dict[str, Any]) -> HttpConfig:
@@ -150,6 +152,8 @@
             self.session.headers["Content-Type"] = "application/json"
             auth_headers = self._auth_headers(config.auth)
             self.session.headers.update(auth_headers)
+            if len(config.headers) > 0:
+                self.session.headers.update(config.headers)
         self.timeout = config.timeout
         self.verify = config.verify
         self.compression = config.compression
Index: client/python/openlineage/client/client.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/client/python/openlineage/client/client.py b/client/python/openlineage/client/client.py
--- a/client/python/openlineage/client/client.py    (revision 65d15773f88151a99858392a17706b1661c76e72)
+++ b/client/python/openlineage/client/client.py    (revision 3f10f0fc874ae6d9bc9c99aa68ec4f6a973d35d2)
@@ -228,6 +228,13 @@
         if endpoint is not None:
             config.endpoint = endpoint

+        headers_str = os.environ.get("OPENLINEAGE_HTTP_HEADERS")
+        if headers_str is not None:
+            for val in headers_str.split(","):
+                var = val.split("=", 1)
+                if len(var) == 2:
+                    config.headers[var[0]] = var[1]
+
         return HttpTransport(config)

     @staticmethod
Index: client/python/tests/test_client.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/client/python/tests/test_client.py b/client/python/tests/test_client.py
--- a/client/python/tests/test_client.py    (revision 65d15773f88151a99858392a17706b1661c76e72)
+++ b/client/python/tests/test_client.py    (revision 3f10f0fc874ae6d9bc9c99aa68ec4f6a973d35d2)
@@ -368,7 +368,7 @@

 @patch.dict(
     "os.environ",
-    {"OPENLINEAGE_URL": "http://example.com", "OPENLINEAGE_ENDPOINT": "v7", "OPENLINEAGE_API_KEY": "xxx"},
+    {"OPENLINEAGE_URL": "http://example.com", "OPENLINEAGE_ENDPOINT": "v7", "OPENLINEAGE_API_KEY": "xxx", "OPENLINEAGE_HTTP_HEADERS": "a=b,c=d"},
 )
 def test_http_transport_from_env_variables() -> None:
     transport = OpenLineageClient._http_transport_from_env_variables()  # noqa: SLF001
@@ -376,6 +376,7 @@
     assert transport.url == "http://example.com"
     assert transport.endpoint == "v7"
     assert transport.config.auth.api_key == "xxx"
+    assert transport.config.headers == {"a": "b", "c": "d"}

 def test_http_transport_from_url_no_options() -> None:
Index: client/python/tests/test_http.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/client/python/tests/test_http.py b/client/python/tests/test_http.py
--- a/client/python/tests/test_http.py  (revision 65d15773f88151a99858392a17706b1661c76e72)
+++ b/client/python/tests/test_http.py  (revision 3f10f0fc874ae6d9bc9c99aa68ec4f6a973d35d2)
@@ -65,6 +65,7 @@
             "session": session,
         },
     )
+    config.headers["a"] = "b"
     transport = HttpTransport(config)

     client = OpenLineageClient(transport=transport)
@@ -81,7 +82,7 @@
     transport.session.post.assert_called_once_with(
         url="http://backend:5000/api/v1/lineage",
         data=Serde.to_json(event),
-        headers={},
+        headers={"a": "b"},
         timeout=5.0,
         verify=True,
     )
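
For discussion, the parsing step in the patch can also be sketched as a standalone helper. This is only an illustration, not part of the patch: `parse_http_headers` is a hypothetical name, and it assumes the comma-separated `key=value` format used in the tests above. It reads the variable with `os.environ.get` so a missing variable does not raise, and splits each pair on the first `=` only, so values that themselves contain `=` are kept.

```python
from __future__ import annotations

import os


def parse_http_headers(raw: str | None) -> dict[str, str]:
    """Parse 'key=value,key1=value1' into a dict (hypothetical helper)."""
    headers: dict[str, str] = {}
    if not raw:
        # Unset or empty variable: no extra headers.
        return headers
    for pair in raw.split(","):
        # Split on the first "=" only, so values may contain "=" themselves.
        key, sep, value = pair.partition("=")
        key = key.strip()
        if sep and key:
            headers[key] = value.strip()
    return headers


# os.environ.get returns None when the variable is not set,
# which the helper turns into an empty dict.
extra_headers = parse_http_headers(os.environ.get("OPENLINEAGE_HTTP_HEADERS"))
```

Malformed entries (no `=` or an empty key) are skipped silently here; whether to log or reject them instead is a design choice left open.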


Related issues

No response

Do you plan to make this contribution yourself?

boring-cyborg[bot] commented 1 month ago

Thanks for creating your first OpenLineage issue! Your feedback is valuable and improves the project. If you haven't already, please be sure to follow the issue template!

kacpermuda commented 1 month ago

Hey, thanks for the interest in OpenLineage. FYI, you do not need to create an issue if you have a change ready to submit! You can open a pull request immediately instead. Even if you are not sure it's something that will end up being added to the code, it's always easier to discuss over an actual implementation 😄

mobuchowski commented 1 month ago

@kr-igor I've tried to find a solution to the problem more generically in https://github.com/OpenLineage/OpenLineage/issues/2916

mobuchowski commented 1 month ago

@kr-igor Also, thanks a lot for starting the discussion and opening a PR. You just hit a place where we have a larger issue we need to solve :)