Closed Twade968 closed 1 year ago
refs #409
<?xml version="1.0"?>
<resource xsi:schemaLocation='http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4/metadata.xsd' xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xmlns='http://datacite.org/schema/kernel-4'>
<identifier identifierType='DOI'>10.34770/gg40-tc15</identifier>
<creators>
<creator>
<creatorName>Leach, Robert</creatorName>
<givenName>Robert</givenName>
<familyName>Leach</familyName>
</creator>
<creator>
<creatorName>Hecht, Michael</creatorName>
<givenName>Michael</givenName>
<familyName>Hecht</familyName>
</creator>
<creator>
<creatorName>Karas, Christina</creatorName>
<givenName>Christina</givenName>
<familyName>Karas</familyName>
</creator>
</creators>
<titles>
<title>CKavity Library: Next-Generation Sequencing</title>
<title titleType='AlternativeTitle'>
A library of novel genes with combinatorially diverse cavities, built on a
stably folded structural template
</title>
</titles>
<publisher>Princeton University</publisher>
<resourceType resourceTypeGeneral='Dataset'/>
<publicationYear>2019</publicationYear>
<relatedIdentifiers>
<relatedIdentifier relationType='IsIdenticalTo' relatedIdentifierType='ARK'>ark:/88435/dsp015999n626m</relatedIdentifier>
</relatedIdentifiers>
<version>1</version>
<rightsList>
<rights rightsURI='https://creativecommons.org/licenses/by/4.0/' rightsIdentifier='CC BY'>Creative Commons Attribution 4.0 International</rights>
</rightsList>
<descriptions>
<description descriptionType='Other'>
Protein sequence space is vast; nature uses only an infinitesimal fraction
of possible sequences to sustain life. Are there solutions to biological
problems other than those provided by nature? Can we create artificial
proteins that sustain life? To investigate this question, the Hecht lab
has created combinatorial collections, or libraries, of novel sequences
with no homology to those found in living organisms. These libraries were
subjected to screens and selections, leading to the identification of
sequences with roles in catalysis, modulating gene regulation, and metal
homeostasis. However, the resulting functional proteins formed dynamic
rather than well-ordered structures. This impeded structural
characterization and made it difficult to ascertain a mechanism of action.
To address this, Christina Karas's thesis work focuses on developing
a new model of libraries based on the de novo protein S-824, a four-helix
bundle with a very stable three-dimensional structure. The first part of
this research focused on mutagenesis of S-824 and characterization of the
resulting proteins, revealing that this scaffold tolerates amino acid
substitutions, including buried polar residues and the removal of
hydrophobic side chains to create a putative cavity. Distinct from
previous libraries, Karas targeted variability to a specific region of the
protein, seeking to create a cavity and potential active site. The second
part of this work details the design and creation of a library encoding
1.7 x 10^6 unique proteins, assembled from degenerate oligonucleotides.
The third and fourth parts of this work cover the screening effort for a
range of activities, both in vitro and in vivo. I found that this
collection binds heme readily, leading to abundant peroxidase activity.
Hits for lipase and phosphatase activity were also detected. This work
details the development of a new strategy for creating de novo sequences
geared toward function rather than structure.
</description>
</descriptions>
</resource>
@matthewjchandler Would you please review the attached DataCite record and let me know whether it looks OK and whether I have addressed all of your concerns? I see that there is some weirdness around how apostrophes are being recorded. I already have a ticket to address that here: https://app.zenhub.com/workspaces/rdss-workcycles-61a4f1a12a399b001730f65a/issues/pulibrary/pdc_describe/601
This record does not have a DOI (that PRDS has minted), and the DOI given in the XML above is for a different record (http://arks.princeton.edu/ark:/88435/dsp01rj4307478). Everything else looks as I would expect. Thanks @Twade968!
We need a design decision about how we will handle migrating works that do not have a DOI.
This is the original record that we are migrating: https://dataspace.princeton.edu/handle/88435/dsp015999n626m?mode=full
Acceptance Criteria
Notes from Matt
There are several issues here. Let me preface everything by saying that we (PRDS) did not curate this one, and I'm not sure which submission form was used for it.
I'm looking at the full item record to understand it better: https://dataspace.princeton.edu/handle/88435/dsp015999n626m?mode=full
The dc.contributor.author field was not used. Instead, there are three dc.creator fields in the original record that should be translated to the DataCite Creator field. One of the names listed as a creator was repeated as a contributor, and we do not want to replicate that error. In addition, we typically do not input funding agencies as contributors, so the Creator field you have for the National Science Foundation should be omitted.
The original record has a title as well as an alternative title, and the title field already includes a subtitle, so I do not think that the two title fields should be merged in DataCite. Instead, the alternative title from the original record can be translated as a second title with the type "AlternativeTitle" (see https://support.datacite.org/docs/datacite-metadata-schema-v44-mandatory-properties#3-title).
For some reason, this record does not have an issue date (which is generally required). However, I can see that the dc.date.accessioned is in 2019, so the DataCite record should have the Publication Year field set to 2019 instead of 2020.
While the original record does have the publisher set to "Princeton University Lewis-Sigler Institute", our current practice is to correct such entries to "Princeton University".
The original record has two description fields filled: abstract and table of contents. My understanding is that for the purposes of migration, we are copying over abstracts as Description Type "Other" and omitting other description fields for now.
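For reference, the notes above could be sketched as a small mapping function. This is only an illustration, not the pdc_describe migration code: the function name `dspace_to_datacite`, the `FUNDERS` set, and the dict-of-lists input shape are hypothetical, and the field names other than `dc.creator` and `dc.date.accessioned` are assumed to follow the usual DSpace qualified Dublin Core names. The sample accession timestamp is a placeholder, and the sketch leaves out namespaces, the identifier, and the rights block.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical list of funder names to drop from creators (assumption).
FUNDERS = {"National Science Foundation"}


def dspace_to_datacite(dc):
    """Map a dict of Dublin Core field lists to a bare DataCite <resource>."""
    resource = ET.Element("resource")

    # Creators come from dc.creator only; skip repeats and funding agencies.
    creators = ET.SubElement(resource, "creators")
    seen = set()
    for name in dc.get("dc.creator", []):
        if name in seen or name in FUNDERS:
            continue
        seen.add(name)
        creator = ET.SubElement(creators, "creator")
        ET.SubElement(creator, "creatorName").text = name

    # Keep the title and the alternative title as separate <title> elements.
    titles = ET.SubElement(resource, "titles")
    for value in dc.get("dc.title", []):
        ET.SubElement(titles, "title").text = value
    for value in dc.get("dc.title.alternative", []):
        ET.SubElement(titles, "title", titleType="AlternativeTitle").text = value

    # Correct department-level publishers to plain "Princeton University".
    publishers = dc.get("dc.publisher", [])
    if publishers:
        name = publishers[0]
        if name.startswith("Princeton University"):
            name = "Princeton University"
        ET.SubElement(resource, "publisher").text = name

    # Use the issue date if present, otherwise fall back to the accession date.
    dates = dc.get("dc.date.issued") or dc.get("dc.date.accessioned") or []
    match = re.match(r"\d{4}", dates[0]) if dates else None
    if match:
        ET.SubElement(resource, "publicationYear").text = match.group(0)

    # Only the abstract is migrated, as descriptionType "Other".
    descriptions = ET.SubElement(resource, "descriptions")
    for value in dc.get("dc.description.abstract", []):
        ET.SubElement(
            descriptions, "description", descriptionType="Other"
        ).text = value

    return resource


# Illustrative input only; the timestamp and abstract text are placeholders.
example = {
    "dc.creator": ["Leach, Robert", "Hecht, Michael", "Karas, Christina",
                   "National Science Foundation"],
    "dc.title": ["CKavity Library: Next-Generation Sequencing"],
    "dc.title.alternative": ["A library of novel genes (shortened here)"],
    "dc.publisher": ["Princeton University Lewis-Sigler Institute"],
    "dc.date.accessioned": ["2019-01-01T00:00:00Z"],
    "dc.description.abstract": ["Abstract text goes here."],
    "dc.description.tableofcontents": ["Not migrated for now."],
}
print(ET.tostring(dspace_to_datacite(example), encoding="unicode"))
```

Keeping each rule in its own short block should make it easy to change one of them (for example, the publisher correction) if our practice changes.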