mcs07 / ChemDataExtractor

Automatically extract chemical information from scientific documents
http://chemdataextractor.org
MIT License

type object 'HTMLAwareEntitySubstitution' has no attribute 'preserve_whitespace_tags' #17

Open OlgaGKononova opened 7 years ago

OlgaGKononova commented 7 years ago

Hello. I have installed ChemDataExtractor with pip. I am now trying to use it in a Jupyter notebook with Python 3.5 by calling import chemdataextractor, and I get the following error:

AttributeError                            Traceback (most recent call last)
in ()
----> 1 import chemdataextractor

/usr/local/lib/python3.5/dist-packages/chemdataextractor/__init__.py in ()
     24
     25
---> 26 from .doc.document import Document

/usr/local/lib/python3.5/dist-packages/chemdataextractor/doc/__init__.py in ()
     13 from __future__ import unicode_literals
     14
---> 15 from .document import Document
     16 from .text import Text, Title, Heading, Paragraph, Footnote, Citation, Caption, Sentence, Span, Token
     17 from .figure import Figure

/usr/local/lib/python3.5/dist-packages/chemdataextractor/doc/document.py in ()
     22
     23 from ..utils import python_2_unicode_compatible
---> 24 from .text import Paragraph, Citation, Footnote, Heading, Title
     25 from .table import Table
     26 from .figure import Figure

/usr/local/lib/python3.5/dist-packages/chemdataextractor/doc/text.py in ()
     20
     21 from ..model import ModelList
---> 22 from ..parse.context import ContextParser
     23 from ..parse.cem import ChemicalLabelParser, CompoundHeadingParser, CompoundParser, chemical_name
     24 from ..parse.table import CaptionContextParser

/usr/local/lib/python3.5/dist-packages/chemdataextractor/parse/__init__.py in ()
     13 from __future__ import unicode_literals
     14
---> 15 from .actions import join, merge, strip_stop, fix_whitespace
     16 from .elements import W, I, R, T, H
     17 from .elements import Any, Word, Tag, IWord, Regex, Start, End, Hide, Not

/usr/local/lib/python3.5/dist-packages/chemdataextractor/parse/actions.py in ()
     18 from lxml.etree import strip_tags
     19
---> 20 from ..text import HYPHENS
     21
     22

/usr/local/lib/python3.5/dist-packages/chemdataextractor/text/__init__.py in ()
     15 import unicodedata
     16
---> 17 from bs4 import UnicodeDammit
     18
     19

/usr/local/lib/python3.5/dist-packages/bs4/__init__.py in ()
     33 import warnings
     34
---> 35 from .builder import builder_registry, ParserRejectedMarkup
     36 from .dammit import UnicodeDammit
     37 from .element import (

/usr/local/lib/python3.5/dist-packages/bs4/builder/__init__.py in ()
    226
    227
--> 228 class HTMLTreeBuilder(TreeBuilder):
    229     """This TreeBuilder knows facts about HTML.
    230

/usr/local/lib/python3.5/dist-packages/bs4/builder/__init__.py in HTMLTreeBuilder()
    232     """
    233
--> 234     preserve_whitespace_tags = HTMLAwareEntitySubstitution.preserve_whitespace_tags
    235     empty_element_tags = set([
    236         # These are from HTML5.

AttributeError: type object 'HTMLAwareEntitySubstitution' has no attribute 'preserve_whitespace_tags'

Is there a way to fix it? I tried the BeautifulSoup4 installation fixes suggested elsewhere, but nothing works. Thank you!
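The traceback shows the failure happening inside bs4 itself, before any ChemDataExtractor code runs. One possible cause (an assumption, not something confirmed in this report) is that files from two different BeautifulSoup4 versions ended up in the same dist-packages directory. The sketch below is a rough diagnostic under that assumption: it locates the bs4 package without importing it, lists the beautifulsoup4 metadata folders sitting next to it, and shows which source files mention the missing attribute. The helper names (locate_package, metadata_dirs, files_mentioning) are made up for illustration and are not part of bs4 or ChemDataExtractor.

```python
import importlib.util
import pathlib

# Attribute name taken from the error message above.
ATTRIBUTE = "preserve_whitespace_tags"


def locate_package(name="bs4"):
    """Find a top-level package directory without running its __init__.py."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    return pathlib.Path(spec.origin).parent


def metadata_dirs(site_dir, prefix="beautifulsoup4-"):
    """List distribution metadata folders in site_dir (hypothetical helper).

    More than one entry hints that several versions were installed on top
    of each other.
    """
    return sorted(
        p.name
        for p in pathlib.Path(site_dir).iterdir()
        if p.name.lower().startswith(prefix)
        and p.suffix in (".dist-info", ".egg-info")
    )


def files_mentioning(package_dir, text=ATTRIBUTE):
    """Return the .py files under package_dir whose source contains text."""
    package_dir = pathlib.Path(package_dir)
    return sorted(
        p.relative_to(package_dir).as_posix()
        for p in package_dir.rglob("*.py")
        if text in p.read_text(encoding="utf-8", errors="replace")
    )


if __name__ == "__main__":
    pkg = locate_package()
    if pkg is None:
        print("bs4 is not importable from this interpreter")
    else:
        print("bs4 location:", pkg)
        print("metadata folders:", metadata_dirs(pkg.parent))
        print("files mentioning the attribute:", files_mentioning(pkg))
```

If the attribute is referenced in builder/__init__.py but not defined in any other file, or if more than one metadata folder is listed, the installation is probably inconsistent. In that case, uninstalling beautifulsoup4 until pip reports nothing left and then installing it once may be worth trying.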