Open JonoYang opened 2 days ago
The purldb webserver experienced an error that killed the gunicorn worker handling this request:
web-1 | [2024-07-02 21:31:42 +0000] [9] [CRITICAL] WORKER TIMEOUT (pid:10)
web-1 | [2024-07-02 21:31:42 +0000] [10] [ERROR] Error handling request /api/collect/index_packages/
web-1 | Traceback (most recent call last):
web-1 |   File "/usr/local/lib/python3.11/site-packages/gunicorn/workers/sync.py", line 135, in handle
web-1 |     self.handle_request(listener, req, client, addr)
web-1 |   File "/usr/local/lib/python3.11/site-packages/gunicorn/workers/sync.py", line 178, in handle_request
web-1 |     respiter = self.wsgi(environ, resp.start_response)
web-1 |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web-1 |   File "/usr/local/lib/python3.11/site-packages/django/core/handlers/wsgi.py", line 124, in __call__
web-1 |     response = self.get_response(request)
web-1 |                ^^^^^^^^^^^^^^^^^^^^^^^^^^
web-1 |   File "/usr/local/lib/python3.11/site-packages/django/core/handlers/base.py", line 140, in get_response
web-1 |     response = self._middleware_chain(request)
web-1 |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web-1 |   File "/usr/local/lib/python3.11/site-packages/django/core/handlers/exception.py", line 55, in inner
web-1 |     response = get_response(request)
web-1 |                ^^^^^^^^^^^^^^^^^^^^^
web-1 |   File "/usr/local/lib/python3.11/site-packages/django/utils/deprecation.py", line 134, in __call__
web-1 |     response = response or self.get_response(request)
web-1 |                            ^^^^^^^^^^^^^^^^^^^^^^^^^^
web-1 |   File "/usr/local/lib/python3.11/site-packages/django/core/handlers/exception.py", line 55, in inner
web-1 |     response = get_response(request)
web-1 |                ^^^^^^^^^^^^^^^^^^^^^
web-1 |   File "/usr/local/lib/python3.11/site-packages/django/utils/deprecation.py", line 134, in __call__
web-1 |     response = response or self.get_response(request)
web-1 |                            ^^^^^^^^^^^^^^^^^^^^^^^^^^
web-1 |   File "/usr/local/lib/python3.11/site-packages/django/core/handlers/exception.py", line 55, in inner
web-1 |     response = get_response(request)
web-1 |                ^^^^^^^^^^^^^^^^^^^^^
web-1 |   File "/usr/local/lib/python3.11/site-packages/django/utils/deprecation.py", line 134, in __call__
web-1 |     response = response or self.get_response(request)
web-1 |                            ^^^^^^^^^^^^^^^^^^^^^^^^^^
web-1 |   File "/usr/local/lib/python3.11/site-packages/django/core/handlers/exception.py", line 55, in inner
web-1 |     response = get_response(request)
web-1 |                ^^^^^^^^^^^^^^^^^^^^^
web-1 |   File "/usr/local/lib/python3.11/site-packages/django/utils/deprecation.py", line 134, in __call__
web-1 |     response = response or self.get_response(request)
web-1 |                            ^^^^^^^^^^^^^^^^^^^^^^^^^^
web-1 |   File "/usr/local/lib/python3.11/site-packages/django/core/handlers/exception.py", line 55, in inner
web-1 |     response = get_response(request)
web-1 |                ^^^^^^^^^^^^^^^^^^^^^
web-1 |   File "/usr/local/lib/python3.11/site-packages/django/utils/deprecation.py", line 134, in __call__
web-1 |     response = response or self.get_response(request)
web-1 |                            ^^^^^^^^^^^^^^^^^^^^^^^^^^
web-1 |   File "/usr/local/lib/python3.11/site-packages/django/core/handlers/exception.py", line 55, in inner
web-1 |     response = get_response(request)
web-1 |                ^^^^^^^^^^^^^^^^^^^^^
web-1 |   File "/usr/local/lib/python3.11/site-packages/django/utils/deprecation.py", line 134, in __call__
web-1 |     response = response or self.get_response(request)
web-1 |                            ^^^^^^^^^^^^^^^^^^^^^^^^^^
web-1 |   File "/usr/local/lib/python3.11/site-packages/django/core/handlers/exception.py", line 55, in inner
web-1 |     response = get_response(request)
web-1 |                ^^^^^^^^^^^^^^^^^^^^^
web-1 |   File "/usr/local/lib/python3.11/site-packages/django/core/handlers/base.py", line 197, in _get_response
web-1 |     response = wrapped_callback(request, *callback_args, **callback_kwargs)
web-1 |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web-1 |   File "/usr/local/lib/python3.11/contextlib.py", line 81, in inner
web-1 |     return func(*args, **kwds)
web-1 |            ^^^^^^^^^^^^^^^^^^^
web-1 |   File "/usr/local/lib/python3.11/site-packages/django/views/decorators/csrf.py", line 65, in _view_wrapper
web-1 |     return view_func(request, *args, **kwargs)
web-1 |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web-1 |   File "/usr/local/lib/python3.11/site-packages/rest_framework/viewsets.py", line 124, in view
web-1 |     return self.dispatch(request, *args, **kwargs)
web-1 |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web-1 |   File "/usr/local/lib/python3.11/site-packages/rest_framework/views.py", line 506, in dispatch
web-1 |     response = handler(request, *args, **kwargs)
web-1 |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web-1 |   File "/app/packagedb/api.py", line 973, in index_packages
web-1 |     get_source_package_and_add_to_package_set(package)
web-1 |   File "/usr/local/lib/python3.11/site-packages/purl2vcs/find_source_repo.py", line 141, in get_source_package_and_add_to_package_set
web-1 |     source_purl = get_source_repo(package=package)
web-1 |                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web-1 |   File "/usr/local/lib/python3.11/site-packages/purl2vcs/find_source_repo.py", line 198, in get_source_repo
web-1 |     repo_urls = list(get_repo_urls(package))
web-1 |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web-1 |   File "/usr/local/lib/python3.11/site-packages/purl2vcs/find_source_repo.py", line 225, in get_repo_urls
web-1 |     source_urls = get_source_urls_from_package_data_and_resources(
web-1 |                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web-1 |   File "/usr/local/lib/python3.11/site-packages/purl2vcs/find_source_repo.py", line 244, in get_source_urls_from_package_data_and_resources
web-1 |     metadata_urls = list(get_urls_from_package_data(package))
web-1 |                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web-1 |   File "/usr/local/lib/python3.11/site-packages/purl2vcs/find_source_repo.py", line 345, in get_urls_from_package_data
web-1 |     found_urls.extend(get_urls_from_text(text=homepage_text))
web-1 |   File "/usr/local/lib/python3.11/site-packages/purl2vcs/find_source_repo.py", line 36, in get_urls_from_text
web-1 |     for url in get_urls_from_location(location=lines)["urls"]:
web-1 |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web-1 |   File "/usr/local/lib/python3.11/site-packages/scancode/api.py", line 134, in get_urls
web-1 |     for urls, line_num in found_urls:
web-1 |   File "/usr/local/lib/python3.11/site-packages/scancode/api.py", line 130, in <genexpr>
web-1 |     found_urls = ((u, ln) for (u, ln) in find_urls(location) if u)
web-1 |                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
web-1 |   File "/usr/local/lib/python3.11/site-packages/cluecode/finder.py", line 257, in find_urls
web-1 |     for _key, url, _line, line_number in matches:
web-1 |   File "/usr/local/lib/python3.11/site-packages/cluecode/finder.py", line 78, in unique_filter
web-1 |     for key, match, line, line_number in matches:
web-1 |   File "/usr/local/lib/python3.11/site-packages/cluecode/finder.py", line 576, in junk_urls_filter
web-1 |     for key, match, line, line_number in matches:
web-1 |   File "/usr/local/lib/python3.11/site-packages/cluecode/finder.py", line 553, in junk_url_hosts_filter
web-1 |     for key, match, line, line_number in matches:
web-1 |   File "/usr/local/lib/python3.11/site-packages/cluecode/finder.py", line 425, in canonical_url_cleaner
web-1 |     for key, match, line, line_number in matches:
web-1 |   File "/usr/local/lib/python3.11/site-packages/cluecode/finder.py", line 108, in re_filt
web-1 |     for key, match, line, line_number in matches:
web-1 |   File "/usr/local/lib/python3.11/site-packages/cluecode/finder.py", line 360, in user_pass_cleaning_filter
web-1 |     for key, match, line, line_number in matches:
web-1 |   File "/usr/local/lib/python3.11/site-packages/cluecode/finder.py", line 336, in scheme_adder
web-1 |     yield key, match, line, line_number
web-1 |   File "/usr/local/lib/python3.11/site-packages/gunicorn/workers/base.py", line 203, in handle_abort
web-1 |     sys.exit(1)
web-1 | SystemExit: 1
web-1 | [2024-07-02 21:31:42 +0000] [10] [INFO] Worker exiting (pid: 10)
web-1 | [2024-07-02 21:31:43 +0000] [9] [ERROR] Worker (pid:10) was sent SIGKILL! Perhaps out of memory?
My initial guess is that a regex explosion (catastrophic backtracking) may be happening when parsing URLs.
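To illustrate what a regex explosion looks like in general, here is a minimal sketch using a textbook pathological pattern. The pattern `(a+)+$` and the helper `time_failed_match` are assumptions made up for this illustration; they are not taken from `cluecode/finder.py`, and this does not confirm that the URL regexes there behave this way.

```python
import re
import time

# Textbook pathological pattern (hypothetical, NOT the cluecode URL regex):
# the nested quantifiers let the backtracking engine split a run of "a"s in
# exponentially many ways before it can conclude that the match fails.
PATHOLOGICAL = re.compile(r"(a+)+$")


def time_failed_match(n):
    """Return (match, seconds) for a string of n 'a's followed by one 'b'."""
    text = "a" * n + "b"
    start = time.perf_counter()
    match = PATHOLOGICAL.match(text)  # fails: "$" cannot match before "b"
    elapsed = time.perf_counter() - start
    return match, elapsed


if __name__ == "__main__":
    # Each two extra characters should roughly quadruple the time taken.
    for n in (14, 16, 18, 20):
        match, elapsed = time_failed_match(n)
        print(f"n={n:2d} matched={match is not None} seconds={elapsed:.4f}")
```

If the URL patterns have a similar shape, a single long or unusual homepage text could keep a worker busy past the gunicorn timeout. Timing `get_urls_from_text` on the homepage text of the package that triggered this request would be one way to check.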