Using this data, there are a number of things that can be done:
Create new datasets from the supplementary materials files
Create Dataset -> Publication -> Grant linkages for the existing datasets in the NDE. If a dataset has a citation listed, augment the existing metadata by attaching the funding information that PMC provides for that publication.
Parse data availability statements to determine where the data are deposited, and link the publication to existing datasets.
Use regex parsing to mine the text of the document for similar linkages between publications and a small subset of repositories with consistent identifiers. Ideally, this would distinguish between primary citations (data generation) and secondary citations (publications which reuse the data).
Add additional Dataset -> Publication -> Grant linkages via PMC "related information" structured metadata
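The regex-based mining described in the list above could look roughly like the following minimal sketch. The repository names and accession patterns (GEO series as `GSE` plus digits, BioProject as `PRJNA`/`PRJEB`/`PRJDB` plus digits) are illustrative assumptions, not a list taken from this plan, and `find_accessions` is a hypothetical helper name. The sketch makes no attempt to tell primary citations from secondary ones.

```python
import re

# Assumed, illustrative accession formats; the real set of repositories
# and patterns would need to be agreed on and validated.
ACCESSION_PATTERNS = {
    "GEO": re.compile(r"\bGSE\d+\b"),
    "BioProject": re.compile(r"\bPRJ(?:NA|EB|DB)\d+\b"),
}


def find_accessions(text):
    """Return sorted, de-duplicated (repository, accession) pairs found in text."""
    found = set()
    for repo, pattern in ACCESSION_PATTERNS.items():
        for match in pattern.finditer(text):
            found.add((repo, match.group(0)))
    return sorted(found)


# Made-up data availability statement, used only to exercise the function.
statement = (
    "The sequencing data are available in GEO under accession GSE123456 "
    "and in BioProject PRJNA654321. GSE123456 was reanalysed here."
)
print(find_accessions(statement))
```

Each pair returned this way is only a candidate Dataset -> Publication link; it would still need to be matched against the dataset identifiers already held in the NDE.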
Pull all publication metadata from PMC, either via OAI-PMH or from the bulk open access data or APIs. For example:
https://www.ncbi.nlm.nih.gov/pmc/oai/oai.cgi?verb=GetRecord&identifier=oai:pubmedcentral.nih.gov:8313480&metadataPrefix=pmc
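A request like the one above can be built and parsed with the standard library alone. This is a minimal sketch, assuming the returned record carries JATS-style `award-group` elements with `funding-source` and `award-id` children; not every record will. The function names are hypothetical, and namespaces are ignored by comparing local tag names only.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

OAI_BASE = "https://www.ncbi.nlm.nih.gov/pmc/oai/oai.cgi"


def build_get_record_url(pmcid):
    """Build the OAI-PMH GetRecord URL for a numeric PMC ID."""
    params = {
        "verb": "GetRecord",
        "identifier": f"oai:pubmedcentral.nih.gov:{pmcid}",
        "metadataPrefix": "pmc",
    }
    # Keep ':' unescaped so the URL matches the form shown above.
    return f"{OAI_BASE}?{urlencode(params, safe=':')}"


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def extract_awards(xml_data):
    """Collect funder / award ID pairs from award-group elements (assumed JATS layout)."""
    root = ET.fromstring(xml_data)
    awards = []
    for elem in root.iter():
        if local_name(elem.tag) != "award-group":
            continue
        award = {"funder": None, "award_id": None}
        for child in elem.iter():
            name = local_name(child.tag)
            text = "".join(child.itertext()).strip()
            if name == "funding-source" and award["funder"] is None:
                award["funder"] = text
            elif name == "award-id" and award["award_id"] is None:
                award["award_id"] = text
        awards.append(award)
    return awards


def fetch_awards(pmcid):
    """Fetch one record over the network and return its awards."""
    with urlopen(build_get_record_url(pmcid)) as response:
        return extract_awards(response.read())
```

Calling `fetch_awards(8313480)` would issue the request shown above. The resulting funder and award pairs are what would be attached to a dataset whose citation points at that publication.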