Closed krishmatta closed 3 years ago
Hi @krishxmatta, thanks so much for your issue. Yes, this is definitely possible. After you parse the PubMed OA dataset, you can group the resulting list of paragraphs by section.
I'd try something like the following:
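from itertools import groupby
from operator import itemgetter
# groupby only merges consecutive items with the same key, and each group is a
# lazy iterator, so turn every group into a list before moving on to the next one
grouped_paragraphs = [(section, list(group)) for section, group in groupby(paragraphs, key=itemgetter('section'))]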
But you may need to check whether the value under the 'section' key
is consistent in the parser's output.
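If you want each section as a single block of text, a minimal sketch along these lines might work. It assumes every parsed paragraph is a dict with a 'section' key and a 'text' key; the 'text' key, the merge_sections helper, and the sample data are assumptions for illustration, not something confirmed by the parser. Unlike groupby, it also collects paragraphs of the same section that are not adjacent in the list.

```python
from collections import defaultdict


def merge_sections(paragraphs):
    """Join the text of all paragraphs that share a 'section' value.

    Sections are returned in the order they first appear.
    """
    merged = defaultdict(list)
    for paragraph in paragraphs:
        # 'text' is an assumed key name for the paragraph body
        merged[paragraph["section"]].append(paragraph["text"])
    return {section: "\n\n".join(texts) for section, texts in merged.items()}


# Hypothetical parser output, used only to illustrate the shape of the data
paragraphs = [
    {"section": "Introduction", "text": "First intro paragraph."},
    {"section": "Methods", "text": "Methods paragraph."},
    {"section": "Introduction", "text": "Second intro paragraph."},
]

sections = merge_sections(paragraphs)
print(sections["Introduction"])
```

If subsections come back under their own titles rather than the parent section's title, they would land under separate keys here, so you might need an extra step that maps subsection titles to their parent section.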
Hey! Thank you for the project. Is there any way I can extract an entire section (e.g. the entire Introduction section) rather than paragraph by paragraph (which may include "subsections")?