INFO: Loading scispacy
0%| | 0/17142 [00:00<?, ?it/s]/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/scispacy/abbreviation.py:216: UserWarning: [W036] The component 'matcher' does not have any patterns defined.
global_matches = self.global_matcher(doc)
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 17142/17142 [06:22<00:00, 44.80it/s]
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.8/bin/docanalysis", line 8, in <module>
    sys.exit(main())
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/docanalysis/docanalysis.py", line 196, in main
    calldocanalysis.handlecli()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/docanalysis/docanalysis.py", line 188, in handlecli
    self.entity_extraction.extract_entities_from_papers(args.project_name, args.dictionary, search_sections=args.search_section, entities=args.entities, query=args.query, hits=args.hits,
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/docanalysis/entity_extraction.py", line 170, in extract_entities_from_papers
    compiled_terms = self.get_terms_from_ami_xml(terms_xml_path[i])
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/docanalysis/entity_extraction.py", line 563, in get_terms_from_ami_xml
    tree = ET.parse(xml_path)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/xml/etree/ElementTree.py", line 1202, in parse
    tree.parse(source, parser)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/xml/etree/ElementTree.py", line 595, in parse
    self._root = parser._parse_whole(source)
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 701, column 249
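
The ParseError above reports only a line and column (701, 249) in the dictionary XML that ET.parse was given. One way to see what sits at that position is to catch the exception and print the surrounding text. The following is a minimal sketch, not part of docanalysis: the helper name describe_parse_error is hypothetical, and the log does not say what the invalid token actually is. An unescaped & or <, or a control character, are common causes, but that is an assumption here.

```python
import xml.etree.ElementTree as ET


def describe_parse_error(xml_text):
    """Return None if xml_text parses, else (line, column, snippet).

    Hypothetical helper for illustration; not a docanalysis function.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # ParseError.position is a (line, column) tuple; line is 1-based.
        line_no, col = err.position
        lines = xml_text.split("\n")
        bad_line = lines[line_no - 1] if 0 < line_no <= len(lines) else ""
        # Show some context around the reported column. The column may be
        # counted in bytes, so treat it as approximate for non-ASCII text.
        snippet = bad_line[max(0, col - 20):col + 20]
        return line_no, col, snippet
    return None


if __name__ == "__main__":
    # A bare ampersand in an attribute is one typical "invalid token".
    sample = '<dictionary>\n<entry term="salt & pepper"/>\n</dictionary>'
    print(describe_parse_error(sample))
```

To apply this to the failing dictionary, read that file's contents into a string (for example with open(path, encoding="utf-8").read()) and pass it to the helper; the returned snippet should show the characters around line 701, column 249.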