Closed: csgrant00 closed this issue 5 months ago
This is an ADSIngestParser issue. Text is extracted from the tagged XML with the .get_text() function, which returns only the text contained within the tags and discards the tags themselves, so any inline markup is lost. The elsevier parser needs to use something similar to _detag, where we can select which tags are allowed.
Example affecting title, keywords and abstract:
python run.py -p "/proj/ads/abstracts/data/ELS/CONSYN.AST/ELS.080723/0016-7037/S0016703723X00155/S0016703723003332/S0016703723003332.xml" -t elsevier -f elsevier.test
Example affecting fractions:
python run.py -p "/proj/ads/abstracts/data/ELS/CONSYN.AST/ELS.080723/0012-821X/S0012821X23X00181/S0012821X23003242/S0012821X23003242.xml" -t elsevier -f elsevier.test
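
To illustrate the behaviour being asked for, here is a minimal, dependency-free sketch of an allowlist-based detagger. It is not the existing _detag implementation in ADSIngestParser, and it uses the standard library's html.parser rather than whatever parser backs .get_text() in the elsevier parser. The names AllowlistDetagger, detag_keep and tags_keep, the `<para>` wrapper, and the sample fragment are all hypothetical and chosen only for illustration.

```python
from html import escape
from html.parser import HTMLParser


class AllowlistDetagger(HTMLParser):
    """Drop every tag except those named in tags_keep; always keep the text."""

    def __init__(self, tags_keep):
        super().__init__(convert_charrefs=True)
        self.tags_keep = set(tags_keep)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Attributes are dropped in this sketch; only the bare tag is kept.
        if tag in self.tags_keep:
            self.parts.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in self.tags_keep:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Re-escape text so the output stays valid markup.
        self.parts.append(escape(data, quote=False))


def detag_keep(markup, tags_keep):
    """Return markup with all tags removed except those in tags_keep."""
    parser = AllowlistDetagger(tags_keep)
    parser.feed(markup)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.parts)


# Hypothetical fragment; real Elsevier XML uses its own tag names.
fragment = "<para>Uptake of <sup>18</sup>O in CO<sub>2</sub></para>"

# Empty allowlist: every tag is stripped, so superscripts and subscripts vanish.
print(detag_keep(fragment, []))
# Allowlist of inline tags: the wrapper is removed but sup/sub survive.
print(detag_keep(fragment, ["sup", "sub"]))
```

With an empty allowlist the result is plain text, which is the kind of loss described in this issue; passing the inline tags to keep preserves them while still removing the wrapper element.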