open-telemetry / opentelemetry-sqlcommenter

SQLCommenter components for various languages
Apache License 2.0
27 stars 12 forks source link

Plan for ci/infra/release #5

Open srikanthccv opened 3 years ago

srikanthccv commented 3 years ago

I am starting this discussion to consolidate our thoughts/ideas and prepare a plan ahead. I have a couple of questions in my mind that I think should be brought up. Do we eventually migrate all components from this polyglot repo to their respective SIGs main/contrib repos or does this remain as a stand alone project? (I am inclined towards for moving pieces slowly to main/contrib repos). Or is it a mix of both where certain parts remain in this and everything else will be integrated with instrumentations? If we have all or some parts of Sqlcommenter that lives in this repo we will have to take care of bunch of things including CI, Clear contribution guidelines, and release management. Alternatively we can make use already available infra from language SIGs. These are some things on top of my mind. Please share their thoughts and suggestions.

sjs994 commented 2 years ago

Thanks @lonewolf3739 for starting the discussion thread. All these questions are quite important and needs some good discussion.

This is the initial discussion that happened regarding bringing in SQLCommenter into OpenTelemetry: https://github.com/open-telemetry/community/issues/783.

I am new to OpenTelemetry. So, maybe will be missing few complexities the OpenTelemetry project might have. Will try to summarise my thought process.

I am more inclined towards having certain parts integrated into OTel and certain parts be in an independent repository.

  1. The instrumentation part seems to be better if integrated into respective SIGs. That would help in seamless integration without adding extra code to initialise SQLCommenter. There can be configurations to control whether the Database queries will be instrumented or not.
  2. Now, regarding the exporter part, maybe we should separate it.
    • The reason being there might be multiple (in-process/out-of-process) plugins/extensions which can be created to export metrics/tracing information from each of the DB engines. Adding them into OpenTelemetry would really bloat it up.
    • Also from my limited experience database code have their own set of complexities and may need a different set of testing mechanism, failure handling, handling performance based issues, etc.
srikanthccv commented 2 years ago

I am more inclined towards having certain parts integrated into OTel and certain parts be in an independent repository.

What parts do you think should be reside in this independent repository?

I will take python as an example to share my thoughts and believe same applies to other languages. As you rightly pointed out, instrumentation functionality is better integrated directly with instrumentations provided by opentelemetry. This leaves us with only few parts such as get_flask_info, get_opentelemetry_values, and generate_sql_comment. IMO it is an overkill to maintain them separately. In contrib repos we have separate package where we maintain cross instrumentation helpers, which I think is perfect place to move these definitions.

Now, regarding the exporter part, maybe we should separate it.

I am not quite sure I understand this part. Can you elaborate where do you think exporters come into picture?

sjs994 commented 2 years ago

At present we only have the instrumentation library only. I am in support to move those parts into opentelemetry instrumentations. Now, there were few concerns regarding bloating up of opentelemetry specs, which we need to resolve first.

I am not quite sure I understand this part. Can you elaborate where do you think exporters come into picture?

With the current state of the library, tracing and framework information is added as hints/comments into SQL queries. These needs to be exported out to be used by other tools to process and get useful data from this.

For example: in mysql a tool can be created to process slowquery.log and aggregate the slow queries, pin pointing which APIs are affecting the DB performance. These tools although are not there at present, it seems logical for them to be added later to get better observability.

srikanthccv commented 2 years ago

Probably I am missing something, How does this bloat up the spec? I can understand it will be a maintenance burden to official SIGs but that can be managed by adding the sqlcommenter-approvers/maintainers to CODEOWNERS (how they are being handled now).

in mysql a tool can be created to process slowquery.log and aggregate the slow queries, pin pointing which APIs are affecting the DB performance

I believe this is out of scope for this project and limited to collecting and transporting the telemetry. I think there already exists receivers in collector which help achieve this.

sjs994 commented 2 years ago

Probably I am missing something, How does this bloat up the spec? I can understand it will be a maintenance burden to official SIGs but that can be managed by adding the sqlcommenter-approvers/maintainers to CODEOWNERS (how they are being handled now).

This was one of the concern raised in the initial discussion: https://github.com/open-telemetry/community/issues/783#issuecomment-889327708 I am on the same page as you for moving these into individual repository. But i am a little new to open telemetry and at present I am going through the specs and code and trying to take down notes. Maybe we can create an initial draft of moving sqlcommenter into open-telemetry and start the discussion with the SIGs.

I believe this is out of scope for this project and limited to collecting and transporting the telemetry. I think there already exists receivers in collector which help achieve this.

There are tools that analyzes the logs from database, but i am not sure which tools take care of api level information and tracing information from the query/query plans and processes on it. But yeah you are correct maybe, those should be different project altogether.

srikanthccv commented 2 years ago

Thanks for linking Yuri's comment, Now I understand what&where you meant by bloat. @aabmass what are your thoughts on this?

aabmass commented 2 years ago

@sjs994 raises a good point about SQLCommenter having a larger scope outside of OTel. Take for example the SQLAlchemy instrumentation. It doesn't even need OpenTelemetry present, it can propagate flask info (route) alone which is useful on the database side without a full tracing setup. It also works with OpenCensus span context. It would be a little confusing for a user to install OTel contrib code just to get Flask route sent to their database. @sjs994 do we want to keep these capabilities or limit SQLCommenter to only work with OpenTelemetry present?

OTOH, I would like to see the existing OTel database instrumentations add an option to propagate context with SQLCommenter format. This presents a challenge if we don't absorb the code in this repo into those instrumentations because the user would have double instrument their database engine to get the additional info SQLCommenter provides (e.g. flask route) and OTel span context.

srikanthccv commented 2 years ago

We could enrich it more info using utilities from contrib repos (ex: asgi/wsgi helpers to add route details for every framework instead of just flask). If the both sqlalchemy commenter lib and python sqlalchemy instrumentation try to patch the same def before_cursor_execute it will create issues. I think we should have instrumentation in just one place. We can have non-instrumentation parts in this repo and any additional functionality with larger scope that is instrumentation agnostic will remain here.

sjs994 commented 2 years ago

@aabmass I agree that sqlcommenter don't need opentelemetry, but as @lonewolf3739 made a valid point that its living inside opentelemetry org and logically opentelemetry should be a first party support. I am inclined towards this point.

So, as a start maybe we should aim to make it compliant with opentelemetry only. That will give more traction to the feature/database instrumentation and provide better observability to users. @aabmass what are your thoughts on this ?