gnott closed this issue 3 years ago
@Melissa37 I'm tempted to do this too because it is closely related to the Editorial note label we're adding in #73. Are you ok with it? It should have no impact on eLife; I just need to split abstracts on <sec> tags if present, otherwise on <p> tags.
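A rough sketch of that splitting rule, in case it helps the discussion. It uses only the standard library, and `split_abstract` is a hypothetical name for illustration, not an existing elife-tools function. It assumes the abstract is passed in as a standalone `<abstract>` XML string without namespaces.

```python
import xml.etree.ElementTree as ET


def split_abstract(abstract_xml):
    """Split an <abstract> into (title, text) parts.

    Hypothetical helper: split on direct <sec> children if any are
    present, otherwise on direct <p> children.
    """
    abstract = ET.fromstring(abstract_xml)
    sections = abstract.findall("sec")
    if sections:
        parts = []
        for sec in sections:
            # findtext returns None when the section has no <title>
            title = sec.findtext("title")
            # itertext flattens inline markup such as <italic>
            text = " ".join(
                "".join(p.itertext()).strip() for p in sec.findall("p")
            )
            parts.append((title, text))
        return parts
    # No <sec> tags: each top-level paragraph becomes an untitled part
    return [(None, "".join(p.itertext()).strip()) for p in abstract.findall("p")]
```

With this shape, an eLife-style abstract (plain paragraphs) and a structured abstract both come back as a list of parts, and only the structured one carries titles.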
Good idea! Go for it.
Thanks! M
It would require revisiting the elife-tools parser in order to include the <sec> tags in the BMJ Open sample. Right now, the parser only returns the paragraphs of an abstract. For now, I'm going to send this issue back to the Upcoming column in our project tracker, until it is more of a priority.
Cool. This experiment is going live on 25th June, so we should be prepared to go live in production by August.
@gnott this should go back on the table now and we should add it to a sprint?
@FAtherden-eLife FYI
The issue https://github.com/elifesciences/issues/issues/5742 is in the current sprint. It is a new issue so that it can appear on the appropriate project boards. I'll be looking at it today or Monday!
Please see an example PubMed page for one of the article XML files, which could potentially be used in non-eLife test cases, at https://www.ncbi.nlm.nih.gov/pubmed/?term=10.1136%2Fbmjopen-2013-003269
I believe the bold headings are the result of a deposit with a structured abstract: https://www.ncbi.nlm.nih.gov/books/NBK3828/#publisherhelp.How_should_structured_abst
It may be possible to support the parsing and generation of these structured abstracts.
For the time being, I may have a small fix that will extract the abstract paragraph content even if it is nested inside <sec> tags inside the abstract, as is the case for this BMJ Open sample. The fix is also compatible with the eLife abstract format. Any thoughts, @Melissa37, on supporting structured abstracts?
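To illustrate the idea of that small fix (not the actual elife-tools change), here is a minimal standard-library sketch. `abstract_paragraphs` is a hypothetical name, and the input is assumed to be a standalone `<abstract>` XML string without namespaces.

```python
import xml.etree.ElementTree as ET


def abstract_paragraphs(abstract_xml):
    """Return the text of every <p> in an abstract, at any nesting depth.

    Hypothetical helper: iter("p") walks all descendants in document
    order, so paragraphs inside <sec> tags and top-level paragraphs
    are both collected.
    """
    abstract = ET.fromstring(abstract_xml)
    return ["".join(p.itertext()).strip() for p in abstract.iter("p")]
```

Because it searches descendants rather than direct children, the same call would return the paragraphs for both the nested BMJ Open layout and a flat eLife-style abstract. Section titles are dropped in this sketch.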