I prefer to load the PSL (Public Suffix List) into a CDB data source, so I use the Python snippet below to generate a CDB input file. Hopefully this is faster than looping over searches in a large Lua hash table :)
The 'psl' file (from http://publicsuffixlist.org) and psl2cdb.py are distributed through ecconfigd and kept under svn; each MTA node then runs make every 5 min.
psl2cdb.py:
--- Python snippet ---
import sys
from encodings.idna import ToASCII

# read lines from stdin
for line in sys.stdin:
    # skip '//' comments and empty lines
    if line[0:2] != "//" and line[:-1] != "":
        if line[0:1] == "!":
            domain = line[1:-1].decode('string_escape')
            val = "!"  # indicate negation
        elif line[0:2] == "*.":
            domain = line[2:-1].decode('string_escape')
            val = "*"  # indicate wildcard
        else:
            domain = line[0:-1].decode('string_escape')
            val = "t"  # indicate TLD :)
        domain = domain.decode('UTF-8')
        compd = domain.split(".")
        newline = ""
        # convert each label to its ASCII (punycode) form
        for atom in compd:
            newline = newline + ToASCII(atom) + "."
        kl = len(newline[:-1])
        dl = len(val)
        # using python 2.4 output formatting
        print '+%d,%d:%s->%s' % (kl, dl, newline[:-1], val)
# at final EOF \n
print ''
--- end of snippet
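The snippet above targets Python 2 (print statement, str.decode). For anyone on Python 3, here is a rough sketch of the same conversion, assuming the built-in "idna" codec is good enough for the labels in the list. The function name psl_to_cdb_records is made up for illustration and is not part of the original script.

```python
def psl_to_cdb_records(lines):
    """Yield cdbmake input records ('+klen,dlen:key->data') for PSL rules.

    Hypothetical helper name; mirrors the Python 2 snippet above.
    """
    for raw in lines:
        rule = raw.strip()
        # Skip blank lines and '//' comments.
        if not rule or rule.startswith("//"):
            continue
        if rule.startswith("!"):
            domain, val = rule[1:], "!"   # negation (exception) rule
        elif rule.startswith("*."):
            domain, val = rule[2:], "*"   # wildcard rule
        else:
            domain, val = rule, "t"       # plain public suffix
        # Convert each label to its ASCII (punycode) form.
        key = ".".join(
            label.encode("idna").decode("ascii") for label in domain.split(".")
        )
        # Key and value are pure ASCII here, so len() equals the byte count.
        yield "+%d,%d:%s->%s" % (len(key), len(val), key, val)


# Small demo on a few sample rules.
for record in psl_to_cdb_records(["// sample", "co.uk", "*.ck", "!www.ck"]):
    print(record)
print()  # cdbmake input ends with an empty line
```

To use it as a filter in the makefile pipeline, pass sys.stdin instead of the sample list.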
The CDB data source is created by running these cmds (from a makefile):
${path_cdbtables}/psl.cdb: psl
	@cat psl | ${python} psl2cdb.py | ${plocal}/cdbmake ${path_cdbtables}/psl.cdb ${path_cdbtables}/tmp.$$$$
	@ls -lrt ${path_cdbtables}/
Remember to add the CDB source to your ecelerity.conf with the wanted cache size, TTL and path:
Datasource "publicsuffixlist" {
  cache_size = "8192"
  cache_life = "1800"
  uri = ( "cdb://psl.cdb" )
}
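To show how the "t", "*" and "!" values could be used at lookup time, here is a minimal sketch in Python. A plain dict stands in for the CDB data source, and it assumes one value per key. The names SAMPLE_TABLE, public_suffix and registrable_domain are hypothetical; the real lookup would go through the data source from your policy code.

```python
# Stand-in for the CDB contents: key -> "t" (suffix), "*" (wildcard), "!" (negation).
SAMPLE_TABLE = {
    "com": "t",
    "uk": "t",
    "co.uk": "t",
    "ck": "*",       # from the rule "*.ck"
    "www.ck": "!",   # from the rule "!www.ck"
}


def public_suffix(hostname, table):
    """Return the public suffix of hostname according to table."""
    labels = hostname.lower().rstrip(".").split(".")
    # Try candidates from longest to shortest so the most specific rule wins.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        val = table.get(candidate)
        if val == "!":
            # Negation: the suffix is the candidate minus its leftmost label.
            return ".".join(labels[i + 1:])
        if val == "*" and i > 0:
            # Wildcard: one more label to the left belongs to the suffix.
            return ".".join(labels[i - 1:])
        if val in ("t", "*"):
            return candidate
    # No rule matched: fall back to the last label (the PSL default rule "*").
    return labels[-1]


def registrable_domain(hostname, table):
    """Return the public suffix plus one label, or None if there is none."""
    labels = hostname.lower().rstrip(".").split(".")
    suffix_len = len(public_suffix(hostname, table).split("."))
    if len(labels) <= suffix_len:
        return None  # the hostname is itself a public suffix
    return ".".join(labels[-(suffix_len + 1):])


print(registrable_domain("www.example.co.uk", SAMPLE_TABLE))
```

Each hostname costs at most one table lookup per label, which is the point of keying the CDB on the suffix itself.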