Closed posixeleni closed 10 years ago
Phil will need to update schema.xml once I have committed my changes to git.
Yes. Please branch from the tip of master before you commit. I'll add a commit on top of that branch with a new schema.xml. Then I'll merge it into master so people will get both commits at once on master.
@posixeleni I don't see anything about "In the not-yet-released brand new Geospatial metadata block I have planned to allow for multiple entries for Geographic Coverage to prevent issues like you brought up" so this is just a reminder about it. It's for Data Deposit API backwards compatibility. Here's what I had written originally:
geographicCoverage doesn't allowmultiples
Eleni, in DVN 3.6 we treated this as a multi-valued field:
<dcterms:coverage>United States</dcterms:coverage>
<dcterms:coverage>Canada</dcterms:coverage>
Should we change geographicCoverage to allowmultiples? Otherwise, I get this error: multiple values encountered for non multiValued field geographicCoverage: [United States, Canada]
My temporary workaround is to comment out the line for Canada in the Atom entry XML: https://github.com/IQSS/dataverse/blob/0b65611b1ed3b5143e3524b74a824acd01d21ff1/scripts/api/data-deposit/data/atom-entry-study.xml#L32
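To illustrate why a single-valued field can't hold this, here is a minimal Python sketch that pulls every dcterms:coverage value out of an Atom entry. The entry below is a made-up, trimmed-down stand-in (not the real atom-entry-study.xml), and coverage_values is a hypothetical helper, not part of Dataverse.

```python
import xml.etree.ElementTree as ET

# Namespace prefix used in the lookup below.
NS = {"dcterms": "http://purl.org/dc/terms/"}

# Made-up Atom entry for illustration only; the real sample file has more fields.
ATOM_ENTRY = """
<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:dcterms="http://purl.org/dc/terms/">
  <dcterms:title>Example study</dcterms:title>
  <dcterms:coverage>United States</dcterms:coverage>
  <dcterms:coverage>Canada</dcterms:coverage>
</entry>
""".strip()


def coverage_values(atom_xml):
    """Return the text of every dcterms:coverage child of the entry, in order."""
    root = ET.fromstring(atom_xml)
    return [el.text.strip() for el in root.findall("dcterms:coverage", NS) if el.text]


values = coverage_values(ATOM_ENTRY)
print(values)
print("needs a multi-valued field:", len(values) > 1)
```

Two values come back for one element name, so whatever field they land in has to accept more than one entry.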
@pdurbin thanks for reminding me to include this for QA to test. I have in fact set it to "allow for multiple" as TRUE.
@posixeleni great. Yes, I see it reflected above now. Thanks.
Note to self to think about removing distributorContact from schema.xml per https://github.com/IQSS/dataverse/issues/759#issuecomment-49785898
@posixeleni I just merged some bug fixes from master into https://github.com/IQSS/dataverse/tree/754-metadata so if you pull the latest from that branch, you should be able to run vagrant up and poke around. Once you're happy, please give this ticket back to me and I'll merge the branch into master and send out a note that a new Solr schema.xml is required. @scolapasta in that note I'm also going to recommend a database drop.
There's another change required that affects SWORD but I may need help from @landreev to figure it out. I thought making this change to scripts/database/reference_data.sql would be enough, but I still can't populate the dcterms:coverage field:
-INSERT INTO foreignmetadatafieldmapping (id, foreignfieldxpath, metadatablockname, datasetfieldname, isattribute, parentfieldmapping_id, foreignmetadataformatmapping_id) VALUES (12, ':coverage', 'socialscience', 'geographicCoverage', FALSE, NULL, 1 );
+INSERT INTO foreignmetadatafieldmapping (id, foreignfieldxpath, metadatablockname, datasetfieldname, isattribute, parentfieldmapping_id, foreignmetadataformatmapping_id) VALUES (12, ':coverage', 'geospatial', 'geographicCoverage', FALSE, NULL, 1 );
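For anyone following along, here is a rough Python sketch of what that row expresses: a foreign XPath pointing at a metadata block and dataset field. It uses an in-memory SQLite table as a stand-in for the real database, the column types are guesses, and lookup_target is a hypothetical helper. It only shows what the row says, not whether the import code can actually populate the field.

```python
import sqlite3

# In-memory stand-in; column names come from the INSERT above, types are assumed.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE foreignmetadatafieldmapping (
        id INTEGER PRIMARY KEY,
        foreignfieldxpath TEXT,
        metadatablockname TEXT,
        datasetfieldname TEXT,
        isattribute INTEGER,
        parentfieldmapping_id INTEGER,
        foreignmetadataformatmapping_id INTEGER
    )
    """
)

# The "+" side of the diff: :coverage now points at the geospatial block.
conn.execute(
    "INSERT INTO foreignmetadatafieldmapping VALUES (?, ?, ?, ?, ?, ?, ?)",
    (12, ":coverage", "geospatial", "geographicCoverage", False, None, 1),
)


def lookup_target(connection, xpath):
    """Return (metadatablockname, datasetfieldname) for an XPath, or None."""
    return connection.execute(
        "SELECT metadatablockname, datasetfieldname "
        "FROM foreignmetadatafieldmapping WHERE foreignfieldxpath = ?",
        (xpath,),
    ).fetchone()


print(lookup_target(conn, ":coverage"))
```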
@landreev to make this more concrete, from the GUI, I can create both of the fields below ("authorName" from author and "country" from "geographicCoverage") but by using the importXML method I can only create the former. "country" has a typeClass of controlledVocabulary so maybe that's why? I did notice this comment that says "A controlled vocabulary entry... not supported yet; though I expect the commented-out code below to work" at https://github.com/IQSS/dataverse/blob/2ef3f55ecc9efe5a046118a8ea1d1405f8e1aa17/src/main/java/edu/harvard/iq/dataverse/metadataimport/ForeignMetadataImportServiceBean.java#L229 . This might be a red herring though because I can't seem to get that block to execute. It may be something else.
authorName from author
{
  "value" : [
    {
      "authorName" : {
        "value" : "Peets, John",
        "typeClass" : "primitive",
        "typeName" : "authorName",
        "multiple" : false
      }
    },
    {
      "authorName" : {
        "value" : "Stumptown, Jane",
        "typeClass" : "primitive",
        "typeName" : "authorName",
        "multiple" : false
      }
    }
  ],
  "typeClass" : "compound",
  "typeName" : "author",
  "multiple" : true
}
country from geographicCoverage
{
  "value" : [
    {
      "country" : {
        "value" : "United States",
        "typeClass" : "controlledVocabulary",
        "typeName" : "country",
        "multiple" : false
      }
    },
    {
      "country" : {
        "value" : "Canada",
        "typeClass" : "controlledVocabulary",
        "typeName" : "country",
        "multiple" : false
      }
    }
  ],
  "typeClass" : "compound",
  "typeName" : "geographicCoverage",
  "multiple" : true
}
Ah ha! If I use this SQL instead (putting the values in "otherGeographicCoverage")...
-INSERT INTO foreignmetadatafieldmapping (id, foreignfieldxpath, metadatablockname, datasetfieldname, isattribute, parentfieldmapping_id, foreignmetadataformatmapping_id) VALUES (12, ':coverage', 'socialscience', 'geographicCoverage', FALSE, NULL, 1 );
+INSERT INTO foreignmetadatafieldmapping (id, foreignfieldxpath, metadatablockname, datasetfieldname, isattribute, parentfieldmapping_id, foreignmetadataformatmapping_id) VALUES (12, ':coverage', 'geospatial', 'otherGeographicCoverage', FALSE, NULL, 1 );
... the importXML method is able to save the field:
{
  "value" : [
    {
      "otherGeographicCoverage" : {
        "value" : "United States",
        "typeClass" : "primitive",
        "typeName" : "otherGeographicCoverage",
        "multiple" : false
      }
    },
    {
      "otherGeographicCoverage" : {
        "value" : "Canada",
        "typeClass" : "primitive",
        "typeName" : "otherGeographicCoverage",
        "multiple" : false
      }
    }
  ],
  "typeClass" : "compound",
  "typeName" : "geographicCoverage",
  "multiple" : true
}
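As a sanity check on the shape of that output, here is a small Python sketch that builds the same compound structure from a plain list of place names. build_geographic_coverage is a hypothetical helper written for this comment; it mirrors the JSON above and is not how importXML does it internally.

```python
import json


def build_geographic_coverage(places):
    """Wrap each free-text place in a primitive otherGeographicCoverage child."""
    return {
        "value": [
            {
                "otherGeographicCoverage": {
                    "value": place,
                    "typeClass": "primitive",
                    "typeName": "otherGeographicCoverage",
                    "multiple": False,
                }
            }
            for place in places
        ],
        "typeClass": "compound",
        "typeName": "geographicCoverage",
        "multiple": True,
    }


field = build_geographic_coverage(["United States", "Canada"])
print(json.dumps(field, indent=2))
```

Each dcterms:coverage value becomes its own child entry, and the parent geographicCoverage field is the one marked as allowing multiples.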
Here's how it looks in the GUI:
@posixeleni is this what you had in mind for dcterms:coverage? Put it under "Other"? Because how can we know if they mean country, state, city, etc.? I'm guessing dcterms:coverage (like the rest of dcterms) can mean various things.
@pdurbin Yes! We are using otherGeographicCoverage to be the geographic catch-all element for dcterms:coverage since it can mean various things (continent, nation, state, city, province, region, canton, etc....).
cc/ @landreev
Yes! We are using otherGeographicCoverage to be the geographic catch-all element for dcterms:coverage
Perfect. Thanks, @posixeleni! I just committed the fix to this branch.
Oh and http://www.diffkit.org is the tool I mentioned that might help people like me and @sekmiller reason about what changed in the TSV files (such as in 8c6df96). "DiffKit is an application, and a framework, for comparing two tables of data, field-by-field." I learned about it from @leeper at https://twitter.com/thosjleeper/status/445582381488287744
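In the same spirit, here is a rough Python sketch of a field-by-field comparison of two TSV tables using only the standard library. This is not DiffKit and doesn't use its API; the column names and rows are made up for illustration, and read_tsv and diff_tables are hypothetical helpers.

```python
import csv
import io


def read_tsv(text, key):
    """Parse TSV text into a dict of rows keyed by the given column."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row[key]: row for row in reader}


def diff_tables(old_text, new_text, key):
    """Return (kind, row_key, column, old_value, new_value) tuples for each difference."""
    old, new = read_tsv(old_text, key), read_tsv(new_text, key)
    changes = []
    for k in sorted(old.keys() | new.keys()):
        if k not in new:
            changes.append(("removed", k, None, None, None))
        elif k not in old:
            changes.append(("added", k, None, None, None))
        else:
            for col in old[k]:
                if old[k][col] != new[k].get(col):
                    changes.append(("changed", k, col, old[k][col], new[k].get(col)))
    return changes


# Made-up sample rows; real metadata block TSV files have more columns.
OLD = "name\tallowmultiples\tmetadatablock\ngeographicCoverage\tFALSE\tsocialscience\n"
NEW = "name\tallowmultiples\tmetadatablock\ngeographicCoverage\tTRUE\tgeospatial\n"

for change in diff_tables(OLD, NEW, "name"):
    print(change)
```

Keying rows by a name column means reordered rows don't show up as changes; only added, removed, or edited fields do.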
Now let's get you running with Vagrant so you can poke around in the live app.
Assigned to @scolapasta to help me review next steps to test and implement for next Beta release.
cc/ @pdurbin
@posixeleni I just spotted a typo: https://github.com/IQSS/dataverse/commit/8c6df969eb3e365e454966e543c9cfd74123835e#commitcomment-7269300
I ran git cherry-pick on the TSV commits from @posixeleni and made a single commit comprising all the fixes in this issue.
I moved this ticket to QA. Here's what I plan to email around:
Due to metadata changes made in https://github.com/IQSS/dataverse/issues/754 everyone running Dataverse 4.0 code must drop their database, update to the latest Solr schema.xml, and clear their Solr index.
After pulling the latest code, update your Solr schema.xml like this:
Clear out your Solr index like this:
Drop your database and get set up again per the dev guide:
Tested prior to beta4 release, opened separate tickets for issues found.
Closing ticket
Making several changes to metadata blocks based on user feedback. Phil will need to update schema.xml once I have committed my changes to git.
Geospatial metadata block (new metadata block; updated datasetfields.sh for this!) https://github.com/IQSS/dataverse/issues/482
Updates to: Citation block
Changes to Social Science block
Minor changes to Astrophysics block