jayvdb / pypidb

PyPI client side database with SCM/VCS URLs
Apache License 2.0

wiki.ros.org redirects to status.ros.org #115

Closed jayvdb closed 4 years ago

jayvdb commented 4 years ago

http://wiki.ros.org/catkin_pkg loads normally in Firefox, and the SCM is found on the page.

However, in pypidb the same URL is redirected to https://status.ros.org/

This breaks catkin_pkg, rospkg, rosinstall (via the fedora dataset), rosinstall-generator, wstool, and likely other ROS packages.

INFO     pypidb._pypi:_pypi.py:450 r http://wiki.ros.org/catkin_pkg
DEBUG    pypidb._adapters:_adapters.py:84 cdn block of http://wiki.ros.org/catkin_pkg skipped
DEBUG    pypidb._adapters:_adapters.py:108 domain block of http://wiki.ros.org/catkin_pkg skipped
DEBUG    pypidb._adapters:_adapters.py:32 is_num = False; is_IP = False: http://wiki.ros.org/catkin_pkg
DEBUG    pypidb._adapters:_adapters.py:42 is_IP = False: http://wiki.ros.org/catkin_pkg
DEBUG    pypidb._adapters:_adapters.py:46 IPblock of http://wiki.ros.org/catkin_pkg skipped
DEBUG    urllib3.connectionpool:connectionpool.py:226 Starting new HTTP connection (1): wiki.ros.org:80
DEBUG    urllib3.connectionpool:connectionpool.py:433 http://wiki.ros.org:80 "HEAD /catkin_pkg HTTP/1.1" 302 0
DEBUG    pypidb._adapters:_adapters.py:271 head http://wiki.ros.org/catkin_pkg http://wiki.ros.org/catkin_pkg <Response [302]> {'Date': 'Mon, 20 Apr 2020 19:58:13 GMT', 'Server': 'Apache', 'Location': 'https://status.ros.org/', 'Content-Type': 'text/html; charset=iso-8859-1', 'Connection': 'Keep-Alive', 'Content-Length': '0'} b''
INFO     https_everywhere.adapter:adapter.py:94 adapter responding to http://wiki.ros.org/catkin_pkg with http://wiki.ros.org/catkin_pkg: {'Date': 'Mon, 20 Apr 2020 19:58:13 GMT', 'Server': 'Apache', 'Location': 'https://status.ros.org/', 'Content-Type': 'text/html; charset=iso-8859-1', 'Connection': 'Keep-Alive', 'Content-Length': '0'}
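
The log above shows the HEAD request returning a 302 whose Location points at a different host. A minimal sketch of how such an off-site redirect could be flagged from the status code and headers alone is given below; the helper name `offsite_redirect` is hypothetical and is not part of pypidb or https-everywhere-py.

```python
from urllib.parse import urljoin, urlsplit

# Status codes that carry a Location header pointing at a new URL.
REDIRECT_CODES = {301, 302, 303, 307, 308}


def offsite_redirect(url, status, headers):
    """Return the redirect target if it is on another host, else None.

    Hypothetical helper: `headers` is a plain dict keyed exactly as in
    the logged response (e.g. "Location").
    """
    if status not in REDIRECT_CODES:
        return None
    location = headers.get("Location")
    if not location:
        return None
    # Location may be relative, so resolve it against the request URL.
    target = urljoin(url, location)
    if urlsplit(target).hostname != urlsplit(url).hostname:
        return target
    return None


# Values copied from the HEAD response in the log above.
logged_headers = {
    "Server": "Apache",
    "Location": "https://status.ros.org/",
    "Content-Length": "0",
}
print(offsite_redirect("http://wiki.ros.org/catkin_pkg", 302, logged_headers))
# prints: https://status.ros.org/
```

A redirect that only changes the scheme on the same host returns None here, so a plain http to https upgrade would not be flagged.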

jayvdb commented 4 years ago

Even with the following patch, the same redirect occurs:

diff --git a/pypidb/_pypi.py b/pypidb/_pypi.py
index c8de722..77142b6 100644
--- a/pypidb/_pypi.py
+++ b/pypidb/_pypi.py
@@ -57,12 +57,13 @@ social_definitions.update(
 )

 _DEFAULT_HEADERS = {
-    "User-Agent": UserAgent().google,
-    "Accept": "text/html,text/plain,application/*;q=0.8,text/*;q=0.5",
+    "User-Agent": UserAgent().firefox,
+    "Accept": "text/html,text/plain,application/xhtml+xml;q=0.9,application/xml;q=0.9,application/*;q=0.8,text/*;q=0.5,*/*;q=0.4",
     "Accept-Encoding": "br,gzip;q=0.9,deflate;q=0.8"
     if brotli
     else "gzip,deflate;q=0.9",
     "Accept-Language": "en-US,en;q=0.9",
+    "Upgrade-Insecure-Requests": "1",
 }

The problem may be in https-everywhere-py, which may be dropping those headers in its internal logic, which issues a HEAD before the GET.
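
One way to check a hypothesis like this is to point a client at a local server that records the headers it receives for each method, then compare HEAD against GET. The sketch below uses only the standard library and `urllib` as a stand-in client; it does not exercise pypidb or https-everywhere-py, and the names `RecordingHandler` and `headers_seen_per_method` are hypothetical. To test the real code path, the session under test would have to replace the `urllib` calls.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, Request, build_opener

received = {}  # HTTP method -> headers the local server saw (lower-cased names)


class RecordingHandler(BaseHTTPRequestHandler):
    """Answer HEAD and GET with an empty 200 and remember the request headers."""

    def _record(self):
        received[self.command] = {k.lower(): v for k, v in self.headers.items()}
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    do_HEAD = _record
    do_GET = _record

    def log_message(self, *args):
        pass  # keep the output quiet


def headers_seen_per_method(headers):
    """Send HEAD then GET with `headers`; return what the server received."""
    received.clear()
    server = HTTPServer(("127.0.0.1", 0), RecordingHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    opener = build_opener(ProxyHandler({}))  # bypass any configured proxy
    try:
        url = "http://127.0.0.1:%d/catkin_pkg" % server.server_address[1]
        for method in ("HEAD", "GET"):
            request = Request(url, headers=headers, method=method)
            with opener.open(request, timeout=5) as response:
                response.read()
    finally:
        server.shutdown()
        server.server_close()
    return dict(received)


# Illustrative values only; the User-Agent string is made up for this test.
sent = {
    "User-Agent": "Mozilla/5.0 (hypothetical test agent)",
    "Accept-Language": "en-US,en;q=0.9",
    "Upgrade-Insecure-Requests": "1",
}
seen = headers_seen_per_method(sent)
for name, value in sent.items():
    key = name.lower()
    print(name, "HEAD ok:", seen["HEAD"].get(key) == value,
          "GET ok:", seen["GET"].get(key) == value)
```

If a header shows up for GET but not for HEAD when the real session is used, that would support the suspicion that it is lost before the HEAD is sent.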

jayvdb commented 4 years ago

This also causes another failure: "bloom": "https://github.com/ros/catkin"

jayvdb commented 4 years ago

Oddly, in Chrome and Firefox, http://wiki.ros.org/catkin_pkg now redirects to https://status.ros.org/, but https://wiki.ros.org/catkin_pkg does not. This seems to be a bug upstream.
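
Since only the http:// form was affected, a client-side workaround would be to request the https:// form of the URL directly. The sketch below is a hypothetical helper, not what pypidb or https-everywhere-py does, and it assumes the host serves the same content over HTTPS.

```python
from urllib.parse import urlsplit, urlunsplit


def to_https(url):
    """Return `url` with an http scheme replaced by https; leave others alone."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        return url
    netloc = parts.netloc
    # An explicit default HTTP port would be wrong for HTTPS, so drop it.
    if netloc.endswith(":80"):
        netloc = netloc[: -len(":80")]
    return urlunsplit(("https", netloc, parts.path, parts.query, parts.fragment))


print(to_https("http://wiki.ros.org/catkin_pkg"))
# prints: https://wiki.ros.org/catkin_pkg
```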

jayvdb commented 4 years ago

Created a topic at https://discourse.ros.org/, but it has gone into moderation.

jayvdb commented 4 years ago

It appears to be fixed, without any notice on https://status.ros.org/ to indicate there was a failure.

jayvdb commented 4 years ago

Topic https://discourse.ros.org/t/wiki-ros-org-redirects-to-status-ros-org/13758 provides an explanation for the temporary failures.