mcs07 / ChemDataExtractor

Automatically extract chemical information from scientific documents
http://chemdataextractor.org
MIT License
287 stars, 112 forks

Fix issues with reference link extraction using HTML/XML readers #10

Closed: mcs07 closed this 7 years ago

mcs07 commented 7 years ago

Reference links whose content starts with a child element (e.g. sup) return None for their text property, so the subsequent strip() call raises an exception. Instead, use itertext() to collect all nested text and join it.
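
A minimal sketch of the failure and the fix described above, not the actual patch. It assumes ElementTree-style element semantics and uses the standard library's xml.etree.ElementTree so that it runs on its own; the sample markup and the link_text helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical reference link whose only content is a nested <sup> element.
link = ET.fromstring('<a href="#ref1"><sup>1</sup></a>')

# No text precedes the first child, so .text is None and
# link.text.strip() would raise AttributeError.
assert link.text is None


def link_text(el):
    """Join the text of the element and all its descendants, then strip."""
    return ''.join(el.itertext()).strip()


print(link_text(link))
```

Joining itertext() also handles links with plain text content, so the same helper can be applied to every reference link without checking for children first.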