richardg867 / WaybackProxy

HTTP proxy for tunneling requests through the Internet Archive Wayback Machine
GNU General Public License v3.0

Redirect from http protocol to https protocol #38

Closed Monoloshka closed 2 weeks ago

Monoloshka commented 1 month ago

Visiting http://whitehotellangson.vn redirects to https://whitehotellangson.vn, and as a result the site does not open even though it is in the Internet Archive.

Monoloshka commented 1 month ago

The fix also handles redirects such as http://viuly.live, sending the client to http://www.viuly.live instead of https://www.viuly.live. Solution to the problem: I replaced this code:

    # Check if the archived URL is different.
    if archived_dest != archived_url:
        # Add destination to availability cache and redirect the client.
        _print('[r]', archived_dest)
        new_url = '/'.join(split)
        self.shared_state.availability_cache[archived_dest] = 'http://web.archive.org/web/' + match.group(1) + archived_dest
        return self.send_redirect_page(http_version, archived_dest, conn.status)

with:

    if archived_dest != archived_url:
        # Add destination to availability cache and redirect the client.
        _print('[r]', archived_dest)
        new_url = '/'.join(split)
        print(f'f1:{destination}')
        http_new_url = new_url.replace("https://", "http://")
        if http_new_url == availability_url:
            conn = self.shared_state.http.urlopen('GET', destination, redirect=False, retries=retry, preload_content=False)
        else:
            if "http://" in destination:
                self.shared_state.availability_cache[archived_dest] = 'http://web.archive.org/web/' + match.group(1) + archived_dest
                return self.send_redirect_page(http_version, archived_dest, conn.status)
            else:
                archived_dest = archived_dest.replace("https://", "http://")
                self.shared_state.availability_cache[archived_dest] = 'http://web.archive.org/web/' + match.group(1) + archived_dest
                return self.send_redirect_page(http_version, archived_dest, conn.status)
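
The scheme handling in the fix above could be isolated into small helpers. The following is only a sketch: `downgrade_to_http` and `same_ignoring_scheme` are hypothetical names, not part of WaybackProxy, and it assumes the intent is to rewrite only the leading scheme of a URL, whereas `str.replace("https://", "http://")` also rewrites any `https://` embedded later in the URL.

```python
from urllib.parse import urlsplit, urlunsplit


def downgrade_to_http(url: str) -> str:
    """Return url with an https scheme rewritten to http.

    Hypothetical helper for illustration; URLs with any other
    scheme are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.scheme.lower() != "https":
        return url
    # Replace only the scheme component; netloc, path, query and
    # fragment are passed through untouched.
    return urlunsplit(("http",) + tuple(parts[1:]))


def same_ignoring_scheme(a: str, b: str) -> bool:
    """True if two URLs differ at most by http vs. https."""
    return downgrade_to_http(a) == downgrade_to_http(b)


if __name__ == "__main__":
    print(downgrade_to_http("https://whitehotellangson.vn"))
    print(same_ignoring_scheme("http://whitehotellangson.vn",
                               "https://whitehotellangson.vn"))
```

A check like `same_ignoring_scheme` could stand in for the `http_new_url == availability_url` comparison, under the assumption that both values are plain URLs that differ only in scheme.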