googleapis / python-spanner-sqlalchemy

Apache License 2.0
38 stars 28 forks source link

Integration between Spanner SqlAlchemy and OpenTelemetry is generating a lot of warnings #345

Closed gilles-leblanc closed 8 months ago

gilles-leblanc commented 9 months ago

We have an issue where integration between this package and OpenTelemetry is generating a lot of warnings.

In google.cloud.sqlalchemy_spanner sqlalchemy_spanner.py when it executes do_execute it calls trace_call but before doing so it sets the trace_attributes including db.params which are the values of the SQL query.

In trace_call, it checks if HAS_OPENTELEMETRY_INSTALLED and if so it will add these db.params atttributes to the trace and performs the trace.

But, in the OpenTelemetry Python package, it will check all attributes. One of the check that is performed on the attributes is that if an attribute is a Sequence type, all elements of the sequence must be of the same type. If not, it will throw a warning. This checks happens in the Python OpenTelemetry (external to Google) package in file attributes/__init__.py in the function _clean_attribute (see checks and _logger.warning calls).

The issue is that the Google Spanner SQLAlchemy package will store it’s db.params in the various types that they represent (which is good on it's own). It will then send these to the OpenTelemetry package who will check they’re all of the same type and throw a warning if not. So for all SQL queries that contain multiple datatype, OpenTelemetry will generate a warning.

This results in a huge number of these warnings in our GCP Log Explorer. This could be fixed in a number of ways including sending these values as strings when sending them to OpenTelemetry.

I've made some tests and if I change this package sqlalchemy_spanner/_opentelemetry_tracing.py file to use the following code:

if extra_attributes:
        if isinstance(extra_attributes, dict):
            for k, v in extra_attributes.items():
                if isinstance(v, collections.abc.Sequence):
                    extra_attributes[k] = [str(e) for e in v]

        attributes.update(extra_attributes)

I'm no longer receiving any warnings as all sequences values are converted to string before tracing.

gilles-leblanc commented 9 months ago

If my proposed solution is deemed acceptable, I could submit a pull request if you require.

olavloite commented 8 months ago

@gilles-leblanc Thanks for reporting this, and thank you for the detailed explanation of the issue. That made it possible for a non-Python expert like me to understand what is going on 👍

I think that your suggestion makes sense. However; I think that if we're going to make a change to this, that I would prefer that we also make the logging of the actual query parameters itself optional. Query parameters could contain sensitive data, and I can imagine users wanting to trace queries, but not the query parameters of those queries to prevent sensitive data leaking into logs that can be viewed by different people.

Would you be up for creating a pull request that does both? So your suggestion + making the inclusion of query parameters in the tracing optional. Suggestion for requirements:

  1. Logging of query parameters is enabled by default (I would have preferred disabled by default, but as it is already there, we need it enabled by default to avoid making a breaking change)
  2. Logging of query parameters in tracing can be switched on/off with an environment variable: SQLALCHEMY_SPANNER_TRACE_HIDE_QUERY_PARAMETERS (set to true to disable query parameter logging)
  3. The option can be checked in the trace_call function. Remove the db.params attributes before tracing if the above option has been set.
  4. I would also prefer that the stringification of the other trace attributes is done only for db.params and not for all extra_attributes that happen to be an instance of Sequence. That prevents any surprises if someone later adds other trace attributes that are also a Sequence, and where all the elements of that Sequence happen to be of the same (non-string) type.
gilles-leblanc commented 8 months ago

Yes I'll do it like you suggested. I need to get approval from my employer to do so. I've sent in the request.

gillesleblanc commented 8 months ago

Submitted a PR under my work account which is under the CLA.