open-telemetry / opentelemetry-python-contrib

OpenTelemetry instrumentation for Python modules
https://opentelemetry.io
Apache License 2.0
735 stars 607 forks source link

Vote on which instrumentations to implement the migration plan! #2351

Open lzchen opened 8 months ago

lzchen commented 8 months ago

Based off of the recent discussion on alleviating some of the implementation burden of implementing the migration plan for instrumentations moving to new semantic conventions, we are now allowed to choose a subset of instrumentations to implement this for. Please comment here on the preferences for "important instrumentations" that we should include as part of the migration plan. Keep in mind non-http instrumentations are included because this affects their eventual migration. A couple of things to also consider:

  1. We can iterate on this list, where we can INCLUDE instrumentations in this list as we see fit but we most likely do not want to REMOVE ones that we already decide to be important.
  2. Once we decide on a preliminary set of instrumentations, we can begin implementing migration plans for them INDIVIDUALLY.
  3. For instrumentations NOT included in the set, we can begin to discuss whether we want to allow for contributions that update PARTIALLY to new semantic conventions.

For starters, I vote:

opentelemetry-instrumentation-aiohttp-client opentelemetry-instrumentation-asgi opentelemetry-instrumentation-django opentelemetry-instrumentation-fastapi opentelemetry-instrumentation-httplib opentelemetry-instrumentation-httpx opentelemetry-instrumentation-flask opentelemetry-instrumentation-psycopg2 opentelemetry-instrumentation-requests opentelemetry-instrumentation-sqlalchemy opentelemetry-instrumentation-requests opentelemetry-instrumentation-urllib opentelemetry-instrumentation-urllib3 opentelemetry-instrumentation-wsgi

xrmx commented 8 months ago

Maybe opentelemetry-instrumentation-dbapi?

aabmass commented 8 months ago

opentelemetry-instrumentation-sqlalchemy Maybe opentelemetry-instrumentation-dbapi?

Wouldn't this only affect HTTP libraries right now?

samuelcolvin commented 8 months ago

In case it helps, here are the download counts for most of the libraries for February.

I would suggest these download counts should be used as the primary indicator of which projects are most popular (obviously that won't work for instrumentation of internal standard lib packages like asyncio or sqlite3).

project downloads
urllib3 471792853
requests 392426582
jinja2 151665370
sqlalchemy 104640315
aiohttp 102008929
flask 97099707
httpx 46978770
fastapi 45061219
redis 39668426
starlette 29139254
pymongo 24734312
psycopg2 15699913
django 14132946
celery 9053682
psycopg 5138183
asyncpg 3879504
pymemcache 1314158
falcon 1103115
BigQuery SQL ```sql SELECT file.project, COUNT(*) FROM `bigquery-public-data.pypi.file_downloads` WHERE DATE_TRUNC(DATE(timestamp), MONTH) = '2024-02-01' AND file.project IN ('asyncpg', 'aiohttp', 'django', 'fastapi', 'starlette', 'httpx', 'flask', 'falcon', 'psycopg', 'psycopg2', 'psycopg3', 'requests', 'sqlalchemy', 'requests', 'urllib3', 'jinja2', 'pymongo', 'redis', 'pymemcache', 'celery') group by file.project ```
samuelcolvin commented 8 months ago

In particular, Jinja2 has a lot of downloads and is missing from your list above @lzchen.

lzchen commented 8 months ago

@aabmass

opentelemetry-instrumentation-sqlalchemy Maybe opentelemetry-instrumentation-dbapi?

Wouldn't this only affect HTTP libraries right now?

Keep in mind non-http instrumentations are included because this affects their eventual migration.

jeremydvoss commented 7 months ago

In case it helps, here are the download counts for most of the libraries for February.

Just a note, I think this is total downloads, the picture appears quite different when you look at, say, weekly stats

samuelcolvin commented 7 months ago

As per my message, and the SQL, this is downloads for February this year. Weekly won't look very different.

ocelotl commented 7 months ago

Jinja2 seems important based on the download numbers, I say we add it too.

lzchen commented 7 months ago

Alright, as discussed in the 04/11/2024 Python SIG, the original list that is finalized is:

opentelemetry-instrumentation-aiohttp-client opentelemetry-instrumentation-asgi opentelemetry-instrumentation-django opentelemetry-instrumentation-fastapi opentelemetry-instrumentation-httplib opentelemetry-instrumentation-httpx opentelemetry-instrumentation-flask opentelemetry-instrumentation-jinja2 opentelemetry-instrumentation-psycopg2 opentelemetry-instrumentation-requests opentelemetry-instrumentation-sqlalchemy opentelemetry-instrumentation-requests opentelemetry-instrumentation-urllib opentelemetry-instrumentation-urllib3 opentelemetry-instrumentation-wsgi

I will be formalizing this in our README. We will add to this list accordingly if there is additional interest in the future for other instrumentations.

lzchen commented 3 weeks ago

Reopening this to more accurately address popular db instrumentations now that db sem convs are nearing stability. Our current votes are for psycopg2 and sqlalchemy. Seeing as dbapi is quite prominent in most instrumentations, we should probably include that as well. Going to have this forum open for a week or so if anyone has any other opinions.

tammy-baylis-swi commented 3 weeks ago

Based on how they're implemented, I think a migration of dbapi would automatically migrate the below because of how they each call dbapi's wrap_connect and/or instrument_connection for span name/attribute setting. I vote for dbapi and therefore these for covering Postgresql, Mysql, and sqlite3:

opentelemetry-instrumentation-mysql opentelemetry-instrumentation-mysqlclient opentelemetry-instrumentation-psycopg opentelemetry-instrumentation-psycopg2 opentelemetry-instrumentation-pymysql opentelemetry-instrumentation-sqlite3

I also vote for sqlalchemy. It would give ORM coverage while the above are for db drivers.

There are other instrumentors for async db client components(Tortoise ORM, the async/aio drivers), NoSQL (pymongo, cassandra). I think some are popular. If someone has BigQuery access like previous comment it would be great to see some numbers.

lzchen commented 3 weeks ago

Related https://github.com/open-telemetry/opentelemetry-python-contrib/issues/2885

Since redis is not in this list, we can simply accept new contributions for new sem conv.

emdneto commented 2 weeks ago
Project Total
sqlalchemy 321693518
redis 113037620
pymysql 95782647
pymongo 70771804
elasticsearch 70070260
mysql-connector-python 60233705
psycopg2 31599930
mysqlclient 22651979
psycopg 20459318
asyncpg 20423688
cassandra-driver 6745357
pymemcache 4036171
aiopg 1058049
tortoise-orm 547206
scylla-driver 421332
SELECT
  file.project,
  COUNT(*) as total
FROM `bigquery-public-data.pypi.file_downloads`
WHERE file.project IN ('mysql-connector-python', 'mysqlclient', 'psycopg', 'psycopg2', 'pymysql', 'aiopg', 'asyncpg','sqlalchemy','pymongo', 'redis', 'pymemcache', 'cassandra-driver', 'scylla-driver', 'elasticsearch', 'tortoise-orm') AND DATE(timestamp)
    BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 60 DAY)
    AND CURRENT_DATE()
group by file.project
order by total DESC
emdneto commented 2 weeks ago

Discussed in SIG 10/24 about including redis as well, and based on the downloads counts we agreed to include it to the list:

opentelemetry-instrumentation-dbapi opentelemetry-instrumentation-sqlalchemy opentelemetry-instrumentation-redis opentelemetry-instrumentation-mysql opentelemetry-instrumentation-mysqlclient opentelemetry-instrumentation-psycopg opentelemetry-instrumentation-psycopg2 opentelemetry-instrumentation-pymysql opentelemetry-instrumentation-sqlite3

tammy-baylis-swi commented 2 weeks ago

Thanks for query and discussion result in my absence! I've created new issues and linked them in OP of https://github.com/open-telemetry/opentelemetry-python-contrib/issues/2453