I am new to this kind of forum. I ran into a problem using br.links() and I think I have found a solution.
Where can I publish and discuss it? The problem could be fixed directly in _sgmllib_copy.py, but a workaround in _html.py is also possible.
The problem is:
In some HTML pages, _sgmllib_copy.py assumes that certain character references (e.g. ') are hexadecimal because they end with a letter in A-F, but they are not, because there is no 'x' at the beginning (e.g. Gustaf'Aldo).
Solution:
To avoid changing _sgmllib_copy.py, it is possible to change _html.py at line 315 from:
    if name.startswith("x"):
        name, base= name[1:], 16
to
    if name.startswith("x"):
        name, base= name[1:], 16
    else:
        name = filter(lambda x: x.isdigit(), name)
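For anyone who wants to try the idea outside mechanize, here is a minimal standalone sketch of the same workaround. The function names are hypothetical and are not part of mechanize, and I am assuming the reference name arrives as a plain string such as "39", "x27" or "39A". It uses "".join(...) instead of filter(...) so that it also runs on Python 3, where filter() no longer returns a string.

```python
def parse_charref_name(name):
    """Return the code point for a character reference name.

    Hypothetical helper, not mechanize API. `name` is the text after
    '&#', e.g. '39', 'x27', or the mis-parsed '39A'.
    """
    base = 10
    if name.startswith("x"):
        # Hexadecimal reference: drop the leading 'x'.
        name, base = name[1:], 16
    else:
        # Decimal reference: keep only the digits, so a stray
        # trailing letter (the 'A' in '39A') is discarded.
        name = "".join(ch for ch in name if ch.isdigit())
    return int(name, base)


def unescape_charref_sketch(name):
    """Return the character for a reference name (Python 3 `chr`)."""
    return chr(parse_charref_name(name))
```

Without the else branch, int("39A", 10) raises ValueError. Note that the digit filter removes every non-digit character, not only trailing ones, and a name with no digits at all would still raise ValueError.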