internetarchive / liveweb

Liveweb proxy of the Wayback Machine project
https://web.archive.org/

canonicalize dns lookups #24

Closed rajbot closed 12 years ago

rajbot commented 12 years ago

Because the DNS resolver has a search domain configured, we need to handle URLs like http://www/foo, where the bare hostname "www" would otherwise be expanded using that search domain.

Sam says:

When we do a DNS lookup, we should look up "www." (with a trailing dot), so that the search domain does not make it resolve to www.archive.org.

But when we send the Host header, it should not include the added dot.

Also, don't add a dot if there is already one there.
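
A minimal sketch of the rules Sam describes, written for Python 3 with only the standard library. The function names `dns_lookup_name` and `host_header_value` are hypothetical and are not taken from the liveweb code base. The IP-literal check is an added assumption, included because a trailing dot is not valid on an IP address.

```python
import ipaddress


def is_ip_literal(host):
    """Return True if host parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def dns_lookup_name(host):
    """Name to pass to the resolver (hypothetical helper).

    Appends a trailing dot so the resolver treats the name as fully
    qualified and skips the search domain. Names that already end in
    a dot, empty names, and IP literals are returned unchanged.
    """
    if not host or host.endswith(".") or is_ip_literal(host):
        return host
    return host + "."


def host_header_value(host):
    """Name to send in the Host header (hypothetical helper).

    Strips a single trailing dot, if present.
    """
    return host[:-1] if host.endswith(".") else host
```

With these helpers, a caller would resolve `dns_lookup_name(host)` (for example via `socket.gethostbyname`) and send `host_header_value(host)` in the request.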

anandology commented 12 years ago

I'm not sure what you mean by "there is a search domain in the dns resolver".

Can you please give an example?

anandology commented 12 years ago

Adding a dot at the end doesn't work with IP addresses.

>>> import socket
>>> socket.gethostbyname("127.0.0.1.")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
socket.gaierror: [Errno -2] Name or service not known

anandology commented 12 years ago

The right fix is to set the environment variable LOCALDOMAIN to disable search domains:

LOCALDOMAIN="-"

anandology commented 12 years ago

$ LOCALDOMAIN=- curl http://www/
curl: (6) Couldn't resolve host 'www'

$ LOCALDOMAIN=google.com curl http://www/
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
<A HREF="http://www.google.com/">here</A>.
</BODY></HTML>
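
The same LOCALDOMAIN setting can be applied from Python when launching a child process. This is a sketch only: it takes the thread's claim that `LOCALDOMAIN=-` disables search domains at face value, and it assumes a resolver that honours LOCALDOMAIN. The helper names `env_without_search_domains` and `run_without_search_domains` are hypothetical.

```python
import os
import subprocess


def env_without_search_domains(base_env=None):
    """Return a copy of the environment with LOCALDOMAIN set to "-".

    The input mapping is not modified. If base_env is None, the
    current process environment is copied.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["LOCALDOMAIN"] = "-"
    return env


def run_without_search_domains(cmd):
    """Run cmd in a child process whose environment has LOCALDOMAIN="-"."""
    return subprocess.run(cmd, env=env_without_search_domains())


# Example call, mirroring the first curl command above (not executed here):
# run_without_search_domains(["curl", "http://www/"])
```

Setting the variable on a child's environment, rather than on the running process, avoids depending on when the resolver library reads it.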