elifesciences / elife-pubmed-feed

code to support uploading feeds to pubmed for POA articles and VOR articles

Structured abstracts #67

Closed by gnott 3 years ago

gnott commented 6 years ago

Please see an example PubMed page for one of the article XML files that we could potentially use in non-eLife test cases: https://www.ncbi.nlm.nih.gov/pubmed/?term=10.1136%2Fbmjopen-2013-003269

I believe the bold headings are the result of a deposit with a structured abstract: https://www.ncbi.nlm.nih.gov/books/NBK3828/#publisherhelp.How_should_structured_abst

It may be possible to support the parsing and generation of these structured abstracts.

For the time being, I may have a small fix that will extract the abstract paragraph content even if the paragraphs are nested inside <sec> tags inside the abstract, as is the case for this BMJ Open sample. The fix is also compatible with the eLife abstract format.
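A minimal sketch of what such a fix could look like, assuming Python's standard `xml.etree.ElementTree` rather than the actual elife-tools parser. The function name `abstract_paragraphs` and both sample abstracts are hypothetical and only illustrate the idea of collecting `<p>` content at any depth:

```python
import xml.etree.ElementTree as ET


def abstract_paragraphs(abstract_xml):
    """Return the text of every <p> inside an <abstract>, at any nesting depth.

    Hypothetical helper, not the elife-tools implementation.
    """
    root = ET.fromstring(abstract_xml)
    # iter("p") walks the whole tree in document order, so paragraphs
    # are found whether or not they are wrapped in <sec> tags.
    return ["".join(p.itertext()).strip() for p in root.iter("p")]


# Flat abstract: paragraphs directly under <abstract> (made-up sample)
flat = "<abstract><p>One paragraph.</p></abstract>"

# Structured abstract: paragraphs nested inside <sec> tags (made-up sample)
nested = (
    "<abstract>"
    "<sec><title>Objectives</title><p>To test.</p></sec>"
    "<sec><title>Results</title><p>It <italic>worked</italic>.</p></sec>"
    "</abstract>"
)

flat_paragraphs = abstract_paragraphs(flat)
nested_paragraphs = abstract_paragraphs(nested)
```

Because the lookup is by tag name at any depth, the same call handles both layouts, which is why it should not change the result for abstracts that have no `<sec>` tags.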

Any thoughts @Melissa37 on supporting structured abstracts?

gnott commented 6 years ago

@Melissa37 I'm tempted to do this too because it is closely related to the Editorial note label we're adding in #73. Are you ok with it? It should have no impact on eLife; I just need to split abstracts on <sec> tags if present, and otherwise on <p> tags.
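The splitting rule could be sketched roughly as follows, again assuming plain `xml.etree.ElementTree` and not the real elife-tools or elife-pubmed-feed code. The name `split_abstract`, the `(label, text)` return shape, and the sample XML are all hypothetical:

```python
import xml.etree.ElementTree as ET


def element_text(element):
    """Flatten an element and its inline children into one stripped string."""
    return "".join(element.itertext()).strip()


def split_abstract(abstract_xml):
    """Split an abstract into (label, text) parts.

    If direct-child <sec> tags are present, return one part per <sec>,
    labelled with its <title> (or None if it has no title). Otherwise
    return one unlabelled part per direct-child <p>.
    Hypothetical helper, not the project's actual implementation.
    """
    root = ET.fromstring(abstract_xml)
    sections = root.findall("sec")
    if sections:
        parts = []
        for sec in sections:
            title = sec.find("title")
            label = element_text(title) if title is not None else None
            # Join multiple paragraphs of one section with a space.
            text = " ".join(element_text(p) for p in sec.iter("p"))
            parts.append((label, text))
        return parts
    return [(None, element_text(p)) for p in root.findall("p")]


structured = (
    "<abstract>"
    "<sec><title>Objectives</title><p>To test.</p></sec>"
    "<sec><title>Results</title><p>It <italic>worked</italic>.</p></sec>"
    "</abstract>"
)
unstructured = "<abstract><p>First.</p><p>Second.</p></abstract>"

structured_parts = split_abstract(structured)
unstructured_parts = split_abstract(unstructured)
```

An abstract without `<sec>` tags falls through to the paragraph branch and yields only unlabelled parts, so the existing output for that format would be unaffected under these assumptions.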

Melissa37 commented 6 years ago

Good idea! Go for it.

Thanks! M

gnott commented 6 years ago

It would require revisiting the elife-tools parser in order to include the <sec> tags in the BMJ Open sample. Right now, the parser only returns the paragraphs of an abstract. For now, I'm going to send this issue back to the Upcoming column in our project tracker, until it is more of a priority.

Melissa37 commented 6 years ago

Cool. This experiment is going live on 25th June, so we should be prepared to go live in production by August.

Melissa37 commented 3 years ago

@gnott should this go back on the table now, and should we add it to a sprint?

@FAtherden-eLife FYI

gnott commented 3 years ago

The issue https://github.com/elifesciences/issues/issues/5742 is in the current sprint. It is the new one, opened so that it can appear on the appropriate project boards. I'll be looking at it today or Monday!