CBIIT / nci-ctd2-dashboard-tmpl-helper

NCI CTD^2 Dashboard Template Helper

Support autoretrieval of formatted reference from PMID. #46

Open kcs3 opened 4 years ago

kcs3 commented 4 years ago

This request was originally entered in the Dashboard GitHub repository as "Support autoretrieval of formatted reference from PMID" (issue https://github.com/CBIIT/nci-ctd2-dashboard/issues/415).

There is an API to retrieve a formatted journal reference from PubMed given the PubMed ID. This is a request to implement this feature in the submission builder.

The formatted journal reference only appears in the Description text field of the publication_reference column, so we will need to design how this is going to be implemented.

vdancik commented 4 years ago

Python script to format reference display text:

import sys
from urllib.request import urlopen

def main():
    if len(sys.argv) < 2:
        print('ERROR: PMID not specified', file=sys.stderr)
        sys.exit(2)
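    nAuthors = 0
    # Initialize up front so a record that lacks a field cannot raise NameError.
    title = reference = firstAuthor = secondAuthor = lastPrefix = ''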
    pmid = sys.argv[1]
    print('pmid = '+pmid)
    url = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&retmode=text&rettype=medline&id='+pmid
    print('url = '+url+'\n')
    response = urlopen(url)
    for raw in response:
        line = raw.decode('utf-8')
        print(line.strip('\n'))
        if line.startswith('    '):
            # Continuation line: append it to the field that started it and leave
            # lastPrefix unchanged, so fields spanning three or more lines stay intact.
            if lastPrefix == 'TI  ':
                title = title + ' ' + line.strip()
            if lastPrefix == 'SO  ':
                reference = reference + ' ' + line.strip()
            continue
        lastPrefix = line[0:4]
        if line.startswith('TI  - '):
            #print('TITLE = '+line)
            title = line[6:].strip()
        if line.startswith('SO  - '):
            #print('REF = '+line)
            reference = line[6:].strip()
        if line.startswith('AU  - '):
            #print('AUTHOR = '+line)
            nAuthors = nAuthors + 1
            if nAuthors == 1:
                firstAuthor = line[6:].strip()
            if nAuthors == 2:
                secondAuthor = line[6:].strip()

    author = firstAuthor
    if nAuthors > 2:
        author = author + ' et al.'
    if nAuthors == 2:
        author = author + ' and ' + secondAuthor
    print('\nAUTHOR = '+author)
    print('TITLE = '+title)
    print('REF = '+reference)
    print(author+': '+title+' '+reference)

if __name__ == '__main__':
    main()
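
For use inside the submission builder, it may help to separate the parsing from the network call so the formatting can be tested offline. The following is only a sketch under that assumption: `format_reference` and `SAMPLE` are hypothetical names, the sample record is made up for illustration, and the output follows the same "Author: Title Source" layout as the script above.

```python
def format_reference(medline_text):
    """Build 'Author: Title Source' from one MEDLINE-format record (hypothetical helper)."""
    fields = {'TI': '', 'SO': ''}   # title and source (journal citation)
    authors = []
    last_tag = ''
    for line in medline_text.splitlines():
        if line.startswith('      '):
            # Continuation line: extend the field that started it.
            if last_tag in fields:
                fields[last_tag] += ' ' + line.strip()
            continue
        if len(line) < 6 or line[4:6] != '- ':
            continue  # blank or unrecognized line
        last_tag = line[:4].strip()
        value = line[6:].strip()
        if last_tag in fields:
            fields[last_tag] = value
        elif last_tag == 'AU':
            authors.append(value)

    # Same author rule as the script above: one name, "A and B", or "A et al."
    if not authors:
        author = ''
    elif len(authors) == 1:
        author = authors[0]
    elif len(authors) == 2:
        author = authors[0] + ' and ' + authors[1]
    else:
        author = authors[0] + ' et al.'

    citation = (fields['TI'] + ' ' + fields['SO']).strip()
    return author + ': ' + citation if author else citation


# Made-up record for illustration only; not a real PubMed entry.
SAMPLE = '\n'.join([
    'PMID- 0000000',
    'TI  - A hypothetical title that wraps',
    '      onto a second line',
    '      and a third.',
    'AU  - Doe J',
    'AU  - Roe R',
    'AU  - Poe E',
    'SO  - J Example. 2020 Jan;1(1):1-10.',
])

if __name__ == '__main__':
    print(format_reference(SAMPLE))
```

The text passed in would be the decoded body of the efetch response used in the script above; keeping the fetch outside the function means the builder can decide separately how to handle network errors or an unknown PMID.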