Closed eaquigley closed 10 years ago
Original Redmine Comment Author Name: Philip Durbin (@pdurbin) Original Date: 2014-05-19T20:51:06Z
Assigning to Leonid based on discussion with him and Gustavo.
I'm actively testing the importXML method with SWORD. Please look for "572" (this ticket number) in this Google Doc for related issues: https://docs.google.com/document/d/11DpdKyp1tagmaJAAzRqQBEZEZ69WOOOYsoqhz8UsfNM/edit?usp=sharing
@landreev as I just mentioned, I was surprised to discover that I can put "foo" in the metadatablockname column for a field and importXML still just works fine.
I discovered this because @posixeleni moved kindOfData from one block to another in 596d82c8934846807968fb7f5a24c91edcc7ec7e in #754 and I assumed I'd need to update the INSERT INTO foreignmetadatafieldmapping line in the SQL reference data script. It turns out I didn't need to... but now I'm wondering... if that metadatablockname column isn't even being used, should we simply remove it? Otherwise, that data is going to get stale.
Still need to look into this... I thought it made sense to look up the fields by both the name of the field and the block name... which apparently my code isn't doing at this point. But rather than removing the metadatablockname column, I would really rather fix it so that it does look up on both. I mean, it seems wrong to assume that field names are unique across metadatablocks.
I mean, it seems wrong to assume that field names are unique across metadatablocks.
Maybe. For Advanced Search to work, I assume field names are unique. @posixeleni changed type to astroType for me. If this assumption is problematic let's hash it out sooner rather than later. Personally, I'd like to see the uniqueness of field names get enforced while the tsv files are being loaded.
That said, your proposed fix to the importXML method seems fine. Belt and suspenders I guess... the lookup on both. I just don't want a column that isn't used at all.
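For illustration only, here is a minimal sketch of what enforcing field-name uniqueness at tsv load time could look like. It is not the actual loader: the class name, the method name, and the {blockName, fieldName} row shape are all assumptions made for the example, and it assumes the rows have already been parsed out of the tsv files.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the real code base.
public class FieldNameUniquenessCheck {

    /**
     * Indexes field names by the block that declares them.
     * Each row is assumed to be a two-element array: {blockName, fieldName}.
     * Throws as soon as a field name is seen a second time, so a duplicate
     * is caught while loading rather than later at search or import time.
     */
    public static Map<String, String> indexFieldNames(List<String[]> rows) {
        Map<String, String> blockByField = new HashMap<>();
        for (String[] row : rows) {
            String block = row[0];
            String field = row[1];
            // putIfAbsent returns the earlier block if the name was already taken.
            String previous = blockByField.putIfAbsent(field, block);
            if (previous != null) {
                throw new IllegalStateException(
                        "Field name '" + field + "' is declared in both '"
                        + previous + "' and '" + block + "'");
            }
        }
        return blockByField;
    }
}
```

With a check like this, a generic name such as type appearing in two blocks would fail the load, which is the situation the rename to astroType avoids.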
OK, so we are talking about the metadatablockname column. No, it is not being used. And yes, it is redundant. Because yes, metadata field names are guaranteed to be unique across all metadata blocks. I thought initially we were talking about the formatname column, in the metadata mapping itself; where uniqueness is not guaranteed. Where my comment from DatasetFieldService applies:
/*
 * Similar method for looking up foreign metadata field mappings, for metadata
 * imports. For these the uniqueness of names isn't guaranteed (i.e., there
 * can be a field "author" in many different formats that we want to support),
 * so these have to be looked up by both the field name and the name of the
 * foreign format.
 */
public ForeignMetadataFieldMapping findFieldMapping(String formatName, String pathName) { ... }
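To make the two-part lookup concrete, here is a small in-memory sketch. It is a stand-in for illustration, not the real findFieldMapping implementation (whose body is elided above): the class ForeignMappingIndex, its methods, and the sample format and field names are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical in-memory stand-in for the mapping table.
public class ForeignMappingIndex {

    // formatName -> (pathName -> dataset field name)
    private final Map<String, Map<String, String>> byFormat = new HashMap<>();

    /** Registers a mapping for one path within one foreign format. */
    public void add(String formatName, String pathName, String datasetFieldName) {
        byFormat.computeIfAbsent(formatName, k -> new HashMap<>())
                .put(pathName, datasetFieldName);
    }

    /**
     * Looks a mapping up by both the format name and the path name, since
     * the same path name may exist in several foreign formats.
     */
    public Optional<String> find(String formatName, String pathName) {
        Map<String, String> paths = byFormat.get(formatName);
        if (paths == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(paths.get(pathName));
    }
}
```

The nested map keeps the format name as part of the key, so two formats can each define the same path name without colliding.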
OK, I'm going ahead and removing the column, as redundant. Do note that we are junking this implementation of foreign metadata import. But there is a chance that the class and the table ForeignMetadataFieldMapping that I created could still be used there; likely with lots of additions/modifications. So that's the only reason I'm still willing to spend any time maintaining it.
I'm putting this ticket into QA. The only QA applicable would be to try some SWORD ingest test that was working before; and check if it's still working. No changes in functionality/logic were actually made. I only dropped one column from one db table that was not being used. (a db update isn't necessary either - if there's a column in a table that's no longer being used, ejb is ok with that)
a db update isn't necessary either
@landreev yes, but we need to update the reference data script. I'll steal this ticket from QA.
we need to update the reference data script
Fixed in 7e9c3d1. Passing to QA.
Pulled the latest and dropped the db; the reference data script fix worked.
Creating a dataset and adding a file (png) via SWORD worked ok.
Author Name: Philip Durbin (@pdurbin) Original Redmine Issue: 3991, https://redmine.hmdc.harvard.edu/issues/3991 Original Date: 2014-05-19 Original Assignee: Leonid Andreev
Given a SWORDv2 Atom Entry with a Dataverse-specific "attribute hack" for "dcterms:isReferencedBy", we need a method that will produce a dataset.
We use the extra attributes in dcterms:isReferencedBy (holdingsURI, agency, and IDNo) to link back to journal articles in Open Journal Systems (OJS) as described at http://www.mail-archive.com/sword-app-tech@lists.sourceforge.net/msg00386.html
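As a rough sketch of the first step only (pulling those three attributes out of the entry, not building a dataset), something like the following could work. The class and method names are hypothetical, it assumes the attributes are un-namespaced and that dcterms is bound to the standard http://purl.org/dc/terms/ namespace, and it reads only the first dcterms:isReferencedBy element.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical reader, for illustration only.
public class IsReferencedByReader {

    static final String DCTERMS_NS = "http://purl.org/dc/terms/";

    /**
     * Returns holdingsURI, agency, IDNo and the element text ("citation")
     * from the first dcterms:isReferencedBy element, or an empty map if the
     * entry has none. Missing attributes come back as empty strings.
     */
    public static Map<String, String> read(String atomEntryXml) {
        Map<String, String> result = new LinkedHashMap<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // Refuse DOCTYPE declarations, since the entry comes from a client.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(atomEntryXml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagNameNS(DCTERMS_NS, "isReferencedBy");
            if (nodes.getLength() == 0) {
                return result;
            }
            Element el = (Element) nodes.item(0);
            result.put("holdingsURI", el.getAttribute("holdingsURI"));
            result.put("agency", el.getAttribute("agency"));
            result.put("IDNo", el.getAttribute("IDNo"));
            result.put("citation", el.getTextContent().trim());
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse Atom entry", e);
        }
    }
}
```

How the extracted values are then stored on the dataset is left open here, as it is in the ticket.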
Here's the crosswalk that's used in DVN 3.x: https://github.com/IQSS/dvn/blob/master/working_directory/dcmi_terms2ddi.xsl
(Not that the new solution needs to be a crosswalk.)
We might need to think some more about required fields, such as "Subject" and "Contact E-mail". Should we silently fill in "Other" for "Subject"? Should we silently fill in the contact email of the parent dataverse for "Contact E-mail"? "Author" and "Description" are also required... should the method not create a dataset object if dcterms:creator and dcterms:description are not populated?
It's unclear if "dcterms:identifier" should be supported or not. In DVN 3.x, it allowed you to modify the globalId of a study under some conditions.
(For a more generic SWORDv2 Atom Entry example, see http://swordapp.github.io/SWORDv2-Profile/SWORDProfile.html#protocoloperations_editingcontent_metadata )
Redmine related issue(s): 3385