apache / skywalking

APM, Application Performance Monitoring System
https://skywalking.apache.org/
Apache License 2.0
23.64k stars 6.49k forks source link

[Bug] skywalking-python not work with uwsgi + flask in master workers mode and threads mode #8566

Closed arcosx closed 2 years ago

arcosx commented 2 years ago

Search before asking

Apache SkyWalking Component

Python Agent (apache/skywalking-python)

What happened

I can't use skywalking under uwsgi and flask in master workers mode and threads mode. I have try solution in https://github.com/apache/skywalking/issues/5263 and https://github.com/apache/skywalking/issues/6324 , but not work for me.

When I use uwsgi in master workers mode and threads mode, the skywalking python sdk does not report the metrics while the program is running, it reports metrics only after I close uwsgi...

image

I think this problem is caused by mixing gevent and uwsgi together.

What you expected to happen

sdk should be able to report metrics in real time...

How to reproduce

Ways

clear Docker env

docker run -t -d python:3.7.12-buster

pip install flask
pip install uwsgi
pip install gevent
pip install "apache-skywalking"
pip install "apache-skywalking[http]"

root@142fcc3e5aeb:~/flask-traceing# pip list
Package            Version
------------------ ---------
apache-skywalking  0.7.0
certifi            2021.10.8
charset-normalizer 2.0.12
click              8.0.4
Flask              2.0.3
gevent             21.12.0
greenlet           1.1.2
grpcio             1.44.0
grpcio-tools       1.44.0
idna               3.3
importlib-metadata 4.11.1
itsdangerous       2.1.0
Jinja2             3.0.3
MarkupSafe         2.1.0
packaging          21.3
pip                21.2.4
protobuf           3.19.4
pyparsing          3.0.7
requests           2.27.1
setuptools         57.5.0
six                1.16.0
typing_extensions  4.1.1
urllib3            1.26.8
uWSGI              2.0.20
Werkzeug           2.0.3
wheel              0.37.0
wrapt              1.13.3
zipp               3.7.0
zope.event         4.5.0
zope.interface     5.4.0

code 1

from gevent import monkey
monkey.patch_all()

import grpc.experimental.gevent as grpc_gevent
grpc_gevent.init_gevent()

from skywalking import agent, config

# grpc version
config.protocol = 'grpc'
config.init(collector_address='xxxxx', service_name='test-1')
config.flask_collect_http_params = True
config.logging_level = 'DEBUG'
agent.start()

from flask import Flask
app = Flask(__name__)

@app.route('/')
def hello_world():
    return 'Hello World!'

if __name__ == '__main__':
    app.run()

code 2

from skywalking import agent, config

# grpc version
config.protocol = 'grpc'
config.init(collector_address='xxx', service_name='test-1')
config.flask_collect_http_params = True
config.logging_level = 'DEBUG'
agent.start()

from flask import Flask
app = Flask(__name__)

@app.route('/')
def hello_world():
    return 'Hello World!'

if __name__ == '__main__':
    app.run()

uwsgi run mode 1

uwsgi --die-on-term \
    --http 0.0.0.0:5000 \
    --http-manage-expect \
    --master --workers 3 \
    --enable-threads \
    --threads 3 \
    --manage-script-name \
    --mount /=main:app

uwsgi run mode 2

uwsgi --die-on-term \
    --http 0.0.0.0:5000 \
    --http-manage-expect \
    --manage-script-name \
    --mount /=main:app

Result

Not use uwsgi Work OK

python main.py
....
skywalking [HeartbeatThread] [DEBUG] service heart beats, [test-1], [b31a30ec931111ecb19c0242ac110004]
 * Serving Flask app 'main' (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
127.0.0.1 - - [21/Feb/2022 12:28:02] "GET / HTTP/1.1" 200 -
skywalking [Thread-11] [DEBUG] reporting segment Segment@140325469202192[segment_id=b628b966931111ecbaf00242ac110004, spans=[<skywalking.trace.span.EntrySpan object at 0x7fa011bea3d0>], timestamp=1645446482149, related_traces=[<skywalking.trace.segment._NewID object at 0x7fa011bea850>]]
127.0.0.1 - - [21/Feb/2022 12:28:03] "GET / HTTP/1.1" 200 -
skywalking [Thread-14] [DEBUG] reporting segment Segment@140325443643856[segment_id=b73bda68931111ecaf140242ac110004, spans=[<skywalking.trace.span.EntrySpan object at 0x7fa01038a710>], timestamp=1645446483952, related_traces=[<skywalking.trace.segment._NewID object at 0x7fa01038a8d0>]]

code 1 + uwsgi run mode 1 Not OK

*** uWSGI is running in multiple interpreter mode ***
spawned uWSGI master process (pid: 7200)
E0221 12:19:43.643491357    7200 fork_posix.cc:70]           Fork support is only compatible with the epoll1 and poll polling strategies
spawned uWSGI worker 1 (pid: 7211, cores: 3)
E0221 12:19:43.646077330    7200 fork_posix.cc:70]           Fork support is only compatible with the epoll1 and poll polling strategies
spawned uWSGI worker 2 (pid: 7214, cores: 3)
E0221 12:19:43.648015448    7200 fork_posix.cc:70]           Fork support is only compatible with the epoll1 and poll polling strategies
spawned uWSGI worker 3 (pid: 7217, cores: 3)
E0221 12:19:43.649466988    7200 fork_posix.cc:70]           Fork support is only compatible with the epoll1 and poll polling strategies
spawned uWSGI http 1 (pid: 7220)
[pid: 7214|app: 0|req: 1/1] 127.0.0.1 () {28 vars in 293 bytes} [Mon Feb 21 12:19:49 2022] GET / => generated 12 bytes in 7 msecs (HTTP/1.1 200) 2 headers in 79 bytes (1 switches on core 0)
[pid: 7217|app: 0|req: 1/2] 127.0.0.1 () {28 vars in 294 bytes} [Mon Feb 21 12:19:50 2022] GET / => generated 12 bytes in 7 msecs (HTTP/1.1 200) 2 headers in 79 bytes (1 switches on core 0)
^CSIGINT/SIGTERM received...killing workers...
gateway "uWSGI http 1" has been buried (pid: 7220)
skywalking [Thread-3] [DEBUG] reporting segment Segment@140155123891344[segment_id=90aab898931011ec89130242ac110004, spans=[<skywalking.trace.span.EntrySpan object at 0x7f78685f7d90>], timestamp=1645445989751, related_traces=[<skywalking.trace.segment._NewID object at 0x7f78685f7bd0>]]
skywalking [Thread-3] [DEBUG] reporting segment Segment@140155123891344[segment_id=91560ef0931011ec89130242ac110004, spans=[<skywalking.trace.span.EntrySpan object at 0x7f78685f7d90>], timestamp=1645445990874, related_traces=[<skywalking.trace.segment._NewID object at 0x7f78685f7bd0>]]
skywalking [Thread-8] [DEBUG] grpc channel connectivity changed, [ChannelConnectivity.CONNECTING -> ChannelConnectivity.READY]
skywalking [Thread-8] [DEBUG] grpc channel connectivity changed, [ChannelConnectivity.CONNECTING -> ChannelConnectivity.READY]
skywalking [HeartbeatThread] [DEBUG] service heart beats, [test-1], [8ce135b6931011ec89130242ac110004]
skywalking [Thread-8] [DEBUG] grpc channel connectivity changed, [ChannelConnectivity.CONNECTING -> ChannelConnectivity.READY]
skywalking [HeartbeatThread] [DEBUG] service heart beats, [test-1], [8ce135b6931011ec89130242ac110004]
skywalking [HeartbeatThread] [DEBUG] service heart beats, [test-1], [8ce135b6931011ec89130242ac110004]
worker 1 buried after 1 seconds
worker 2 buried after 1 seconds
worker 3 buried after 1 seconds
goodbye to uWSGI.

code 1+ uwsgi run mode 2 NOT OK

WSGI app 0 (mountpoint='/') ready in 0 seconds on interpreter 0x5594e7fc3ce0 pid: 8531 (default app)
uWSGI running as root, you can use --uid/--gid/--chroot options
*** WARNING: you are running uWSGI as root !!! (use the --uid flag) *** 
*** uWSGI is running in multiple interpreter mode ***
spawned uWSGI worker 1 (and the only) (pid: 8531, cores: 1)
[pid: 8531|app: 0|req: 1/1] 127.0.0.1 () {28 vars in 293 bytes} [Mon Feb 21 12:22:07 2022] GET / => generated 12 bytes in 2 msecs (HTTP/1.1 200) 2 headers in 79 bytes (1 switches on core 0)
[pid: 8531|app: 0|req: 2/2] 127.0.0.1 () {28 vars in 293 bytes} [Mon Feb 21 12:22:08 2022] GET / => generated 12 bytes in 1 msecs (HTTP/1.1 200) 2 headers in 79 bytes (1 switches on core 0)
^Cskywalking [Thread-3] [DEBUG] reporting segment Segment@139745731000592[segment_id=e2ff80ec931011ecb2ce0242ac110004, spans=[<skywalking.trace.span.EntrySpan object at 0x7f1916a7bb10>], timestamp=1645446127879, related_traces=[<skywalking.trace.segment._NewID object at 0x7f1916a7bad0>]]
skywalking [Thread-5] [DEBUG] reporting segment Segment@139745731000848[segment_id=e3990ca8931011ecb2ce0242ac110004, spans=[<skywalking.trace.span.EntrySpan object at 0x7f1916a7bf10>], timestamp=1645446128886, related_traces=[<skywalking.trace.segment._NewID object at 0x7f1916a7bed0>]]
skywalking [Thread-6] [DEBUG] grpc channel connectivity changed, [ChannelConnectivity.CONNECTING -> ChannelConnectivity.READY]
skywalking [HeartbeatThread] [DEBUG] service heart beats, [test-1], [e122606e931011ecb2ce0242ac110004]

code 2+ uwsgi run mode 1 NOT OK

hang when request ...

root@sssss:~/flask-traceing# curl -v 127.0.0.1:5000
* Expire in 0 ms for 6 (transfer 0x55f6bd4f1fb0)
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Expire in 200 ms for 4 (transfer 0x55f6bd4f1fb0)
* Connected to 127.0.0.1 (127.0.0.1) port 5000 (#0)
> GET / HTTP/1.1
> Host: 127.0.0.1:5000
> User-Agent: curl/7.64.0
> Accept: */*
> 
*** uWSGI is running in multiple interpreter mode ***
spawned uWSGI master process (pid: 9801)
E0221 12:23:54.949025574    9801 fork_posix.cc:70]           Fork support is only compatible with the epoll1 and poll polling strategies
spawned uWSGI worker 1 (pid: 9817, cores: 3)
E0221 12:23:54.951940621    9801 fork_posix.cc:70]           Fork support is only compatible with the epoll1 and poll polling strategies
spawned uWSGI worker 2 (pid: 9819, cores: 3)
E0221 12:23:54.954821063    9801 fork_posix.cc:70]           Fork support is only compatible with the epoll1 and poll polling strategies
spawned uWSGI worker 3 (pid: 9820, cores: 3)
E0221 12:23:54.957141004    9801 fork_posix.cc:70]           Fork support is only compatible with the epoll1 and poll polling strategies
spawned uWSGI http 1 (pid: 9821)
skywalking [HeartbeatThread] [DEBUG] service heart beats, [test-1], [2299ccf8931111ecb7390242ac110004]
skywalking [QueryProfileCommandThread] [DEBUG] query profile commands
skywalking [HeartbeatThread] [DEBUG] service heart beats, [test-1], [2299ccf8931111ecb7390242ac110004]
skywalking [QueryProfileCommandThread] [DEBUG] query profile commands
skywalking [QueryProfileCommandThread] [DEBUG] query profile commands
skywalking [HeartbeatThread] [DEBUG] service heart beats, [test-1], [2299ccf8931111ecb7390242ac110004]
^CSIGINT/SIGTERM received...killing workers...
gateway "uWSGI http 1" has been buried (pid: 9821)
skywalking [QueryProfileCommandThread] [DEBUG] query profile commands
skywalking [HeartbeatThread] [DEBUG] service heart beats, [test-1], [2299ccf8931111ecb7390242ac110004]
skywalking [QueryProfileCommandThread] [DEBUG] query profile commands
^C

code 2+ uwsgi run mode 2 NOT OK

WSGI app 0 (mountpoint='/') ready in 1 seconds on interpreter 0x55d66f587ce0 pid: 9274 (default app)
uWSGI running as root, you can use --uid/--gid/--chroot options
*** WARNING: you are running uWSGI as root !!! (use the --uid flag) *** 
*** uWSGI is running in multiple interpreter mode ***
spawned uWSGI worker 1 (and the only) (pid: 9274, cores: 1)
[pid: 9274|app: 0|req: 1/1] 127.0.0.1 () {28 vars in 294 bytes} [Mon Feb 21 12:23:12 2022] GET / => generated 12 bytes in 2 msecs (HTTP/1.1 200) 2 headers in 79 bytes (1 switches on core 0)
skywalking [HeartbeatThread] [DEBUG] service heart beats, [test-1], [0761d84a931111eca1990242ac110004]
[pid: 9274|app: 0|req: 2/2] 127.0.0.1 () {28 vars in 294 bytes} [Mon Feb 21 12:23:12 2022] GET / => generated 12 bytes in 2 msecs (HTTP/1.1 200) 2 headers in 79 bytes (1 switches on core 0)
^Cskywalking [Thread-7] [DEBUG] reporting segment Segment@139889892138320[segment_id=093abc54931111eca1990242ac110004, spans=[<skywalking.trace.span.EntrySpan object at 0x7f3aa7541c50>], timestamp=1645446192021, related_traces=[<skywalking.trace.segment._NewID object at 0x7f3aa7541cd0>]]
skywalking [Thread-8] [DEBUG] reporting segment Segment@139889841328976[segment_id=09cb0a02931111eca1990242ac110004, spans=[<skywalking.trace.span.EntrySpan object at 0x7f3aa44cd550>], timestamp=1645446192967, related_traces=[<skywalking.trace.segment._NewID object at 0x7f3aa44cd510>]]

Anything else

No response

Are you willing to submit PR?

Code of Conduct

Superskyyy commented 2 years ago

Hello, this issue seems still deeply related to the two previous issues that you linked, I appreciate it if you could allow some time for further investigation. And thanks for the detailed case trial. @arcosx

arcosx commented 2 years ago

Hello, this issue seems still deeply related to the two previous issues that you linked, I appreciate it if you could allow some time for further investigation. And thanks for the detailed case trial. @arcosx

OK, waiting for good news😁. If there is something I need to cooperate with you can reply.

arcosx commented 2 years ago

Hi @kezhenxu94 @Superskyyy , after some research, I don't think skywalking-python has any problem, it's uWSGI that needs special treatment, this solution below can solve my problem, including issues https://github.com/apache/skywalking/issues/6324.

from uwsgidecorators import postfork
from skywalking import agent, config

@postfork
def init_tracing():
    config.protocol = 'grpc'
    config.init(collector_address='xxxxx', service_name='test-1')
    config.flask_collect_http_params = True
    config.logging_level = 'DEBUG'
    agent.start()

from flask import Flask
app = Flask(__name__)

@app.route('/')
def hello_world():
    return 'Hello World!'

if __name__ == '__main__':
    app.run()

From uWSGI doc we can know

uWSGI is a preforking (or “fork-abusing”) server, so you might need to execute a fixup task after each fork(). The postfork decorator is just the ticket. You can declare multiple postfork tasks. Each decorated function will be executed in sequence after each fork().

Do you have a better solution? Maybe we can note this in skywalking's documentation, after all uWsgi is a popular library.

Superskyyy commented 2 years ago

@arcosx Hello, great to see you found a solution, would you mind opening a PR to document this? I believe it is valuable since both packages are widely used, and from our observation, the issue occurs rather frequently. Probably this should go to the FAQ folder, with a doc name of your choice. https://github.com/apache/skywalking-python/tree/master/docs/en/setup/faq

kezhenxu94 commented 2 years ago

@arcosx thanks for the sharing, I'll modify the label to documentation if you'd like to open a PR to add that into our doc.

arcosx commented 2 years ago

@arcosx Hello, great to see you found a solution, would you mind opening a PR to document this? I believe it is valuable since both packages are widely used, and from our observation, the issue occurs rather frequently. Probably this should go to the FAQ folder, with a doc name of your choice. https://github.com/apache/skywalking-python/tree/master/docs/en/setup/faq

I think adding to the FAQ is a good way 😃 and I will submit a PR.

wu-sheng commented 2 years ago

Notice, if FAQ menu doesn't exist, you need to update menu yaml file to make this available on the website.

wu-sheng commented 2 years ago

To be specific, this file https://github.com/apache/skywalking-python/blob/master/docs/menu.yml#L43.

wu-sheng commented 2 years ago

@arcosx Any update?

arcosx commented 2 years ago

@arcosx Any update?

Sorry for the late submit, the PR here https://github.com/apache/skywalking-python/pull/188

pacific28 commented 2 years ago

Hi @Superskyyy I'm trying to run a Django application with gunicorn. It is unable to start the agent. I followed this thread and https://opentelemetry-python.readthedocs.io/en/latest/examples/fork-process-model/README.html

Here is my wsgi.py

import os

from django.core.wsgi import get_wsgi_application

from uwsgidecorators import postfork
from skywalking import agent, config

def post_fork(server, worker):
    print("abc")
    config.init(collector_address='xxxxx', service_name='test-1')
    agent.start()

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "myproject.settings")

application = get_wsgi_application()

this is the command to start the application gunicorn myproject.wsgi --workers 2

Any suggestions.

Superskyyy commented 2 years ago

@pacific28 I think your issue originates from that you haven't registered the post_fork hook in the gunicorn startup command.

try something like this gunicorn -c path_to_your_hook_config myproject.wsgi --workers 2 Take a look here for an complete example https://github.com/benoitc/gunicorn/blob/master/examples/example_config.py

Actually we have our own docs for using with post-fork servers now, feel free to enhance it as it didn't explicitly mention gunicorn.

https://skywalking.apache.org/docs/skywalking-python/latest/en/setup/faq/how-to-use-with-uwsgi/

pacific28 commented 2 years ago

@Superskyyy Thanks for your response. Tried with config file and it worked.

JvvYou commented 1 year ago

I have a similar problem. wsgi.py: `import os

from django.core.wsgi import get_wsgi_application from skywalking import agent, config

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'self-healing.settings_prod')

config.init(collector_address='xxxx:12800', service_name='self-healing', protocol='http') agent.start()

`

start command:

gunicorn self-healing.wsgi:application -b0.0.0.0:8000 -w1 --access-logfile=/tmp/access.log

skywalking web: `Traceback: Traceback (most recent call last): File "/root/miniconda3/envs/self-healing/lib/python3.7/site-packages/django/contrib/sessions/backends/base.py", line 233, in _get_session return self._session_cache AttributeError: 'SessionStore' object has no attribute '_session_cache'

During handling of the above exception, another exception occurred:

Traceback (most recent call last): File "/root/miniconda3/envs/self-healing/lib/python3.7/site-packages/skywalking/plugins/sw_mysqlclient.py", line 70, in execute return self._self_cur.execute(query, args) File "/root/miniconda3/envs/self-healing/lib/python3.7/site-packages/skywalking/plugins/sw_pymysql.py", line 49, in _sw_execute span.tag(TagDbInstance((this.connection.db or b'').decode('utf-8'))) AttributeError: 'str' object has no attribute 'decode'`

Thanks for your help

Superskyyy commented 1 year ago

Hello this is not the same problem related to forking. It seems like its not a problem with skywalking agent but with Django?

JvvYou commented 1 year ago

Hello this is not the same problem related to forking. It seems like its not a problem with skywalking agent but with Django?

I can access it normally through python manage.py run without reporting an error, but there will be a problem through gunicorn. Finally, I refer to https://skywalking.apache.org/docs/skywalking-python/next/en/setup/faq/how-to-use-with-uwsgi/, and the same problem occurs through uwsgi.

Is there any way to find out what the problem is?

Superskyyy commented 1 year ago

Hello this is not the same problem related to forking. It seems like its not a problem with skywalking agent but with Django?

I can access it normally through python manage.py run without reporting an error, but there will be a problem through gunicorn. Finally, I refer to skywalking.apache.org/docs/skywalking-python/next/en/setup/faq/how-to-use-with-uwsgi, and the same problem occurs through uwsgi.

Is there any way to find out what the problem is?

Is everything fine when you use uwsgi/gunicorn Django without enabling skywalking agent? Thanks.

There could be bug in the plugins as they cannot be comprehensively tested. In the case of

span.tag(TagDbInstance((this.connection.db or b'').decode('utf-8')))
AttributeError: 'str' object has no attribute 'decode'`

Probably the plugin was outdated and it should be without the .decode('utf-8') part, can you remove it and see if everything works>? @weiwei-9527

JvvYou commented 1 year ago

@Superskyyy It is normal without enabling the agent. When I start with gunicorn, the debugging code and execution results are as follows:

        span.tag(TagDbType('mysql'))
        print(this.connection.db)
        print(type(this.connection.db))
        #span.tag(TagDbInstance((this.connection.db or b'').decode('utf-8')))
        span.tag(TagDbStatement(query))

result: b'self-healing' <class 'bytes'> self-healing <class 'str'> self-healing <class 'str'> self-healing <class 'str'> self-healing <class 'str'> self-healing <class 'str'> self-healing <class 'str'>

JvvYou commented 1 year ago

I finally referred to the solution of # 7457, specified post_fork in gunicorn, and encountered a similar # 7457 problem.

Superskyyy commented 1 year ago

@Beancurd-vv @pacific28 @WinterWaterWarm FYI 1.0.0 is voting soon, prefork server support is added and can be automatically injected without postfork hooks.