DOI-USGS / fort-pymdwizard

The MetadataWizard is a useful tool designed to facilitate FGDC metadata creation for spatial and non-spatial data sets. It is a cross-platform desktop application built using an open-source Python architecture.
https://usgs.github.io/fort-pymdwizard/

Inject info from IPDS into metadata record #84

Closed: afreeman-usgs closed this issue 5 years ago

afreeman-usgs commented 6 years ago

It would be handy to extract information from IPDS (author list, title, abstract, etc.) and inject it into our metadata records.
Before the SharePoint upgrade to IPDS, we were able to get an entry's author list with:

from bs4 import BeautifulSoup
from requests_ntlm import HttpNtlmAuth
import getpass
import requests

username = ""  # login to fill in; only the part before '@' is used
password = getpass.getpass()

# NTLM-authenticated session; the account name is prefixed with the GS domain
ipds_session = requests.Session()
ipds_session.auth = HttpNtlmAuth('GS\\{}'.format(username.split('@')[0]), password, ipds_session)

ipds_number = ''  # IP number of the IPDS entry to look up
authors_url = "https://ipds.usgs.gov/_vti_bin/listdata.svc/IPDSAuthors()?$filter=IPNumber%20eq%20%27{}%27".format(ipds_number)

content = ipds_session.get(authors_url)
content.raise_for_status()  # stop here on an authentication or HTTP error
soup = BeautifulSoup(content.text, "lxml-xml")
author_list = []
for entry in soup.find_all('entry'):
    record = {}
    record['author_name'] = entry.find('AuthorNameText').string
    # .string is None for an empty element; fall back to '' instead of the text 'None'
    record['ORCID'] = (entry.find('ORCID').string or '').strip()
    author_list.append(record)
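
The parsing and injection steps could also be tried without network access or IPDS credentials. Below is a minimal sketch using only the standard library. Everything in it is an assumption made for illustration: the sample feed is hypothetical (shaped like an OData Atom response, not real IPDS output), the helper names `parse_authors` and `inject_origins` are made up, and the target is assumed to be an FGDC CSDGM record with authors stored as `origin` elements under `idinfo/citation/citeinfo`.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample shaped like an OData Atom feed; not real IPDS output.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
      xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">
  <entry><content type="application/xml"><m:properties>
    <d:AuthorNameText>Doe, Jane</d:AuthorNameText>
    <d:ORCID> 0000-0000-0000-0001 </d:ORCID>
  </m:properties></content></entry>
  <entry><content type="application/xml"><m:properties>
    <d:AuthorNameText>Roe, Richard</d:AuthorNameText>
    <d:ORCID m:null="true" />
  </m:properties></content></entry>
</feed>"""

# Hypothetical, heavily trimmed FGDC record.
SAMPLE_RECORD = (
    "<metadata><idinfo><citation><citeinfo>"
    "<origin>Placeholder</origin><pubdate>2019</pubdate>"
    "<title>Example title</title>"
    "</citeinfo></citation></idinfo></metadata>"
)


def local_name(tag):
    """Drop the '{namespace}' prefix ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


def child_text(entry, name):
    """Stripped text of the first descendant called `name`, or ''."""
    for element in entry.iter():
        if local_name(element.tag) == name:
            return (element.text or "").strip()
    return ""


def parse_authors(feed_xml):
    """Collect one {'author_name', 'ORCID'} dict per <entry> in the feed."""
    root = ET.fromstring(feed_xml)
    return [
        {
            "author_name": child_text(entry, "AuthorNameText"),
            "ORCID": child_text(entry, "ORCID"),
        }
        for entry in root.iter()
        if local_name(entry.tag) == "entry"
    ]


def inject_origins(fgdc_xml, authors):
    """Replace the record's <origin> elements with one per named author."""
    root = ET.fromstring(fgdc_xml)
    citeinfo = root.find("idinfo/citation/citeinfo")
    if citeinfo is None:
        raise ValueError("record has no idinfo/citation/citeinfo element")
    for old in citeinfo.findall("origin"):
        citeinfo.remove(old)
    named = (a for a in authors if a["author_name"])
    for position, author in enumerate(named):
        origin = ET.Element("origin")
        origin.text = author["author_name"]
        # origin comes first in citeinfo, so insert at the front in order
        citeinfo.insert(position, origin)
    return ET.tostring(root, encoding="unicode")


authors = parse_authors(SAMPLE_FEED)
updated_record = inject_origins(SAMPLE_RECORD, authors)
```

Matching on local tag names keeps the sketch independent of the exact namespace URIs the service returns, at the cost of being less strict than a namespace-aware lookup.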

ColinTalbert commented 5 years ago

Not implementing: this is too USGS-specific and brittle to maintain, and IPDS API access is evolving.