GoogleCloudPlatform / cloud-profiler-python

Stackdriver Profiler Python agent is a tool that continuously gathers CPU usage information from Python applications
Apache License 2.0
27 stars 23 forks source link

Wall profile collection error due to line number being None on Python 3.10 #124

Open jason-nance opened 1 year ago

jason-nance commented 1 year ago

Hi, we recently upgraded to Python 3.10 (from Python 3.9) in our service using cloud-profiler-python and are seeing the following error repeatedly in our logs:

INFO - warning: Failed to collect and upload profile whose profile type is WALL: Traceback (most recent call last):
INFO - warning:   File "/home/appuser/.local/lib/python3.10/site-packages/googlecloudprofiler/client.py", line 293, in _collect_and_upload_profile
INFO - warning:     profile_bytes = self._profilers[profile_type].profile(duration_ns)
INFO - warning:   File "/home/appuser/.local/lib/python3.10/site-packages/googlecloudprofiler/pythonprofiler.py", line 148, in profile
INFO - warning:     return self._serialize_and_clear_traces(duration_ns)
INFO - warning:   File "/home/appuser/.local/lib/python3.10/site-packages/googlecloudprofiler/pythonprofiler.py", line 243, in _serialize_and_clear_traces
INFO - warning:     profile_builder.populate_profile(self._traces, self._profile_type,
INFO - warning:   File "/home/appuser/.local/lib/python3.10/site-packages/googlecloudprofiler/builder.py", line 72, in populate_profile
INFO - warning:     location_id = self._location_id(func_id, frame[2])
INFO - warning:   File "/home/appuser/.local/lib/python3.10/site-packages/googlecloudprofiler/builder.py", line 131, in _location_id
INFO - warning:     line.line = line_number

After digging into it, the problem appears to be that the line number in the frame is None, which is described in this issue. The fix appears to have been backported to Python 3.11 but not 3.10, based on the issue comments.

I'm guessing the trace just needs to be discarded somehow if there's no line number but am not sure what the best way to handle this is.

Fortunately, it appears we are still collecting CPU and WALL profiles, but the repeated error messages are annoying.

davidfant commented 12 months ago

I'm experiencing the same issue using Python 3.11

DEFAULT 2023-08-30T10:10:52.860626Z WARNING 2023-08-30 10:10:52,860 Failed to collect and upload profile whose profile type is WALL: Traceback (most recent call last):
DEFAULT 2023-08-30T10:10:52.860646Z File "/usr/local/lib/python3.11/site-packages/googlecloudprofiler/client.py", line 293, in _collect_and_upload_profile
DEFAULT 2023-08-30T10:10:52.860651Z profile_bytes = self._profilers[profile_type].profile(duration_ns)
DEFAULT 2023-08-30T10:10:52.860656Z ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
DEFAULT 2023-08-30T10:10:52.860661Z File "/usr/local/lib/python3.11/site-packages/googlecloudprofiler/pythonprofiler.py", line 148, in profile
DEFAULT 2023-08-30T10:10:52.860666Z return self._serialize_and_clear_traces(duration_ns)
DEFAULT 2023-08-30T10:10:52.860670Z ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
DEFAULT 2023-08-30T10:10:52.860674Z File "/usr/local/lib/python3.11/site-packages/googlecloudprofiler/pythonprofiler.py", line 243, in _serialize_and_clear_traces
DEFAULT 2023-08-30T10:10:52.860679Z profile_builder.populate_profile(self._traces, self._profile_type,
DEFAULT 2023-08-30T10:10:52.860684Z File "/usr/local/lib/python3.11/site-packages/googlecloudprofiler/builder.py", line 72, in populate_profile
DEFAULT 2023-08-30T10:10:52.860688Z location_id = self._location_id(func_id, frame[2])
DEFAULT 2023-08-30T10:10:52.860692Z ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
DEFAULT 2023-08-30T10:10:52.860696Z File "/usr/local/lib/python3.11/site-packages/googlecloudprofiler/builder.py", line 131, in _location_id
DEFAULT 2023-08-30T10:10:52.860700Z line.line = line_number
DEFAULT 2023-08-30T10:10:52.860705Z ^^^^^^^^^
DEFAULT 2023-08-30T10:10:52.860714Z TypeError: 'NoneType' object cannot be interpreted as an integer
aabmass commented 6 months ago

@davidfant thanks for the report. I'm not sure why we're seeing the issue on 3.11 given it should be fixed upstream. I can certainly update the code to throw away these frames but it might be worth reporting it upstream if it's still happening.

Are you able to reproduce the issue consistently or provide a reproducer?