rock321987 closed this issue 6 years ago
@rock321987 I've tried to replicate the error using the Docker ubuntu:18.04 image:
root@a3506a595f72:~# uname -ar
Linux a3506a595f72 4.14.33+ #1 SMP Sat Aug 11 08:05:16 PDT 2018 x86_64 x86_64 x86_64 GNU/Linux
One difference is that the import entry point is pattern, not pattern3:
root@a3506a595f72:~# pip3 install pattern
Requirement already satisfied: pattern in /usr/local/lib/python3.6/dist-packages
Requirement already satisfied: backports.csv in /usr/local/lib/python3.6/dist-packages (from pattern)
Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (from pattern)
Requirement already satisfied: feedparser in /usr/local/lib/python3.6/dist-packages (from pattern)
Requirement already satisfied: lxml in /usr/local/lib/python3.6/dist-packages (from pattern)
Requirement already satisfied: beautifulsoup4 in /usr/local/lib/python3.6/dist-packages (from pattern)
Requirement already satisfied: requests in /usr/local/lib/python3.6/dist-packages (from pattern)
Requirement already satisfied: future in /usr/local/lib/python3.6/dist-packages (from pattern)
Requirement already satisfied: scipy in /usr/local/lib/python3.6/dist-packages (from pattern)
Requirement already satisfied: pdfminer.six in /usr/local/lib/python3.6/dist-packages (from pattern)
Requirement already satisfied: python-docx in /usr/local/lib/python3.6/dist-packages (from pattern)
Requirement already satisfied: mysqlclient in /usr/local/lib/python3.6/dist-packages (from pattern)
Requirement already satisfied: nltk in /usr/local/lib/python3.6/dist-packages (from pattern)
Requirement already satisfied: cherrypy in /usr/local/lib/python3.6/dist-packages (from pattern)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /usr/local/lib/python3.6/dist-packages (from requests->pattern)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.6/dist-packages (from requests->pattern)
Requirement already satisfied: urllib3<1.25,>=1.21.1 in /usr/local/lib/python3.6/dist-packages (from requests->pattern)
Requirement already satisfied: idna<2.8,>=2.5 in /usr/lib/python3/dist-packages (from requests->pattern)
Requirement already satisfied: six in /usr/lib/python3/dist-packages (from pdfminer.six->pattern)
Requirement already satisfied: pycryptodome in /usr/local/lib/python3.6/dist-packages (from pdfminer.six->pattern)
Requirement already satisfied: sortedcontainers in /usr/local/lib/python3.6/dist-packages (from pdfminer.six->pattern)
Requirement already satisfied: singledispatch in /usr/local/lib/python3.6/dist-packages (from nltk->pattern)
Requirement already satisfied: cheroot>=6.2.4 in /usr/local/lib/python3.6/dist-packages (from cherrypy->pattern)
Requirement already satisfied: zc.lockfile in /usr/local/lib/python3.6/dist-packages (from cherrypy->pattern)
Requirement already satisfied: more-itertools in /usr/local/lib/python3.6/dist-packages (from cherrypy->pattern)
Requirement already satisfied: portend>=2.1.1 in /usr/local/lib/python3.6/dist-packages (from cherrypy->pattern)
Requirement already satisfied: backports.functools-lru-cache in /usr/local/lib/python3.6/dist-packages (from cheroot>=6.2.4->cherrypy->pattern)
Requirement already satisfied: setuptools in /usr/lib/python3/dist-packages (from zc.lockfile->cherrypy->pattern)
Requirement already satisfied: tempora>=1.8 in /usr/local/lib/python3.6/dist-packages (from portend>=2.1.1->cherrypy->pattern)
Requirement already satisfied: jaraco.functools>=1.20 in /usr/local/lib/python3.6/dist-packages (from tempora>=1.8->portend>=2.1.1->cherrypy->pattern)
Requirement already satisfied: pytz in /usr/local/lib/python3.6/dist-packages (from tempora>=1.8->portend>=2.1.1->cherrypy->pattern)
root@a3506a595f72:~# python3
Python 3.6.7 (default, Oct 22 2018, 11:32:17)
[GCC 8.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from pattern.web import Document
>>> ss='''<!DOCTYPE html><a></a>'''
>>> ss
'<!DOCTYPE html><a></a>'
>>> aaz=Document(ss)
>>> aaz.children
[Text('html'), Element(tag='html')]
>>>
Otherwise the parse completed without error. How did you build or install the pattern module locally?
Yeah, you are right. The pattern3 library I used was different. I used the dev branch of the pattern library and it worked for me. At the time I was using it, it wasn't available on pip.
This problem can be reproduced as
gives an error
Updating the string to
ss='''<a></a>'''
does not give an error.
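For anyone comparing installs, here is a minimal sketch of the check discussed in this thread: it tries both import names (pattern and pattern3) and parses both strings, reporting whether each one raises. The helper names (load_document_class, try_parse, run_check) are made up for illustration and are not part of pattern. The only assumption about the library is what the session above shows, namely that the web module exposes Document and that a parsed document has a children attribute.

```python
import importlib

# The two strings discussed in this thread.
SAMPLES = ["<!DOCTYPE html><a></a>", "<a></a>"]


def load_document_class(candidates=("pattern.web", "pattern3.web")):
    """Return (module_name, Document) for the first importable candidate.

    Returns (None, None) if no candidate can be imported or none of them
    exposes a Document attribute.
    """
    for name in candidates:
        try:
            module = importlib.import_module(name)
        except Exception:
            # ImportError if the package is missing; other errors are possible
            # if an incompatible build is installed.
            continue
        document_cls = getattr(module, "Document", None)
        if document_cls is not None:
            return name, document_cls
    return None, None


def try_parse(document_cls, source):
    """Parse source and return ("ok", children) or ("error", repr(exception))."""
    try:
        return "ok", document_cls(source).children
    except Exception as exc:
        return "error", repr(exc)


def run_check():
    """Parse every sample with whichever package is installed."""
    name, document_cls = load_document_class()
    if document_cls is None:
        return None, []
    return name, [(s, *try_parse(document_cls, s)) for s in SAMPLES]


if __name__ == "__main__":
    module_name, results = run_check()
    if module_name is None:
        print("Neither pattern.web nor pattern3.web could be imported.")
    else:
        print("Using", module_name)
        for source, status, detail in results:
            print(repr(source), status, detail)
```

Running this in each environment shows which package actually gets imported and whether the DOCTYPE string is the one that fails, which should make it easier to tell an install difference from a parser bug.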