Closed gilles-leblanc closed 1 year ago
If my proposed solution is deemed acceptable, I could submit a pull request if you require.
@gilles-leblanc Thanks for reporting this, and thank you for the detailed explanation of the issue. That made it possible for a non-Python expert like me to understand what is going on 👍
I think that your suggestion makes sense. However; I think that if we're going to make a change to this, that I would prefer that we also make the logging of the actual query parameters itself optional. Query parameters could contain sensitive data, and I can imagine users wanting to trace queries, but not the query parameters of those queries to prevent sensitive data leaking into logs that can be viewed by different people.
Would you be up for creating a pull request that does both? So your suggestion + making the inclusion of query parameters in the tracing optional. Suggestion for requirements:
SQLALCHEMY_SPANNER_TRACE_HIDE_QUERY_PARAMETERS
(set to true
to disable query parameter logging)trace_call
function. Remove the db.params
attributes before tracing if the above option has been set.db.params
and not for all extra_attributes
that happen to be an instance of Sequence
. That prevents any surprises if someone later adds other trace attributes that are also a Sequence
, and where all the elements of that Sequence
happen to be of the same (non-string) type.Yes I'll do it like you suggested. I need to get approval from my employer to do so. I've sent in the request.
Submitted a PR under my work account which is under the CLA.
We have an issue where integration between this package and OpenTelemetry is generating a lot of warnings.
In google.cloud.sqlalchemy_spanner
sqlalchemy_spanner.py
when it executesdo_execute
it callstrace_call
but before doing so it sets thetrace_attributes
includingdb.params
which are the values of the SQL query.In
trace_call
, it checks ifHAS_OPENTELEMETRY_INSTALLED
and if so it will add thesedb.params
atttributes to the trace and performs the trace.But, in the OpenTelemetry Python package, it will check all attributes. One of the check that is performed on the attributes is that if an attribute is a Sequence type, all elements of the sequence must be of the same type. If not, it will throw a warning. This checks happens in the Python OpenTelemetry (external to Google) package in file
attributes/__init__.py
in the function_clean_attribute
(see checks and _logger.warning calls).The issue is that the Google Spanner SQLAlchemy package will store it’s
db.params
in the various types that they represent (which is good on it's own). It will then send these to the OpenTelemetry package who will check they’re all of the same type and throw a warning if not. So for all SQL queries that contain multiple datatype, OpenTelemetry will generate a warning.This results in a huge number of these warnings in our GCP Log Explorer. This could be fixed in a number of ways including sending these values as strings when sending them to OpenTelemetry.
I've made some tests and if I change this package
sqlalchemy_spanner/_opentelemetry_tracing.py
file to use the following code:I'm no longer receiving any warnings as all sequences values are converted to string before tracing.