eXtensibleCatalog / test

Testing
MIT License

MAS records in Drupal appear incorrect #89

Closed patrickzurek closed 7 years ago

patrickzurek commented 7 years ago

JIRA issue created by: rcook Originally opened: 2012-12-13 02:45 PM

Issue body:

MAS test harvest to Drupal doesn't appear to be correct. When I look at the XC schema view, I seem to only see Manifestation records.

Peter, could you take a look?

This issue has an attachment associated with it (external link): t0_thru_t2.zip

patrickzurek commented 7 years ago

JIRA Comment by user: rcook JIRA Timestamp: 2012-12-13 02:48 PM

Comment body:

http://xc-zurek.carli.illinois.edu/xc_dev/?q=admin/xc/metadata/statistics

There should be:

All entities     0     280   280
Work             0     12    12
New              n/a   12    n/a
Updated          n/a   0     n/a
Deleted          n/a   0     n/a
Expression       0     12    12
New              n/a   12    n/a
Updated          n/a   0     n/a
Deleted          n/a   0     n/a
Manifestation    0     12    12
New              n/a   12    n/a
Updated          n/a   0     n/a
Deleted          n/a   0     n/a
Holdings         0     244   244
New              n/a   244   n/a
Updated          n/a   0     n/a
Deleted          n/a   0     n/a
Dublin Core      0     0     0
New              n/a   0     n/a
Updated          n/a   0     n/a
Deleted          n/a   0     n/a
Metadata         0     280   280
Nodes            0     12    12
Uplinks from     0     268   0
New              n/a   280   n/a
Updated          n/a   0     n/a
Deleted          n/a   0     n/a
Types            n/a   0     n/a
Solr documents   0     0     12

Notify to [~pzurek]

Patrick, if you set up an NCIP to point to the UB instance, would NCIP work here?

patrickzurek commented 7 years ago

JIRA Comment by user: pkiraly JIRA Timestamp: 2012-12-13 04:33 PM

Comment body:

The problem is described here: http://xc-zurek.carli.illinois.edu/xc_dev/?q=admin/reports/event/189908.

Unsuccessfull load data with LOAD DATA INFILE '/var/www/xc_dev/sites/default/files/oaiharvester_sql_cache/10/xc_entity_relationships.0.csv' INTO TABLE {xc_entity_relationships} CHARACTER SET utf8 FIELDS TERMINATED BY '\0\t' ESCAPED BY '' LINES TERMINATED BY '\0\n'

To solve this problem:
1) go to http://xc-zurek.carli.illinois.edu/xc_dev/?q=admin/xc/harvester/schedule/11/edit
2) click next
3) set the "LOAD DATA LOCAL INFILE" syntax at the bottom of the page

Then delete all records, delete the Solr documents as well, and start the harvest again from the zero point (reset the timestamp first).

If that does not solve the problem, go to the same page and use the "INSERT" syntax.
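
For illustration only, here is a minimal Python sketch of what the "INSERT" fallback amounts to: reading a cache file that uses the same terminators as the failing LOAD DATA statement ('\0\t' between fields, '\0\n' between lines, no escaping) and turning it into a parameterized INSERT. This is not the Drupal Toolkit's own code. The function names and the column names in the usage note are hypothetical, and the `%s` placeholder style is an assumption about the MySQL driver in use.

```python
from typing import List, Sequence, Tuple

FIELD_SEP = b"\x00\t"  # matches FIELDS TERMINATED BY '\0\t'
LINE_SEP = b"\x00\n"   # matches LINES TERMINATED BY '\0\n'


def parse_cache_file(data: bytes) -> List[List[str]]:
    """Split raw cache-file bytes into rows of UTF-8 decoded fields."""
    rows = []
    for line in data.split(LINE_SEP):
        if not line:
            # Skip empty chunks, such as the one after the final terminator.
            continue
        rows.append([field.decode("utf-8") for field in line.split(FIELD_SEP)])
    return rows


def build_insert(
    table: str, columns: Sequence[str], rows: Sequence[Sequence[str]]
) -> Tuple[str, List[tuple]]:
    """Return a parameterized INSERT statement and one parameter tuple per row."""
    placeholders = ", ".join(["%s"] * len(columns))
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    return sql, [tuple(row) for row in rows]
```

With a DB-API driver, the returned pair could be passed to `cursor.executemany(sql, params)`, for example `build_insert("xc_entity_relationships", ["parent_id", "child_id", "type"], rows)`, where the three column names are placeholders rather than the real table layout.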

patrickzurek commented 7 years ago

JIRA Comment by user: rcook JIRA Timestamp: 2013-01-14 03:07 PM

Comment body:

[~pzurek] and [~cdelis]

I finally found what I remembered and mentioned in the call this morning.

patrickzurek commented 7 years ago

JIRA Comment by user: rcook JIRA Timestamp: 2013-01-14 03:09 PM

Comment body:

Patrick, I would like to move along this testing of bringing MAS records into Drupal, as we talked about this morning, including using Delis' T0, T1, and T2 file sets (Chris, do you want a separate jira issue for that?).

[~pzurek] [~cdelis]

patrickzurek commented 7 years ago

JIRA Comment by user: rcook JIRA Timestamp: 2013-01-14 03:11 PM

Comment body:

Patrick, that server seems to be down.

patrickzurek commented 7 years ago

JIRA Comment by user: Chris Delis (cedelis) JIRA Timestamp: 2013-01-14 04:24 PM

Comment body:

I'm attaching the full harvests for each Tn set (where n = 0, 1, 2).

I'm not sure if these are useful for what you need to test Drupal. Would it be better if I supplied a full harvest for T0 and, instead of full harvests for T1 and T2, only the differences? E.g., instead of a full T1 harvest, only the updates made since T0; and instead of a full T2, only the updates since T1?

patrickzurek commented 7 years ago

JIRA Comment by user: rcook JIRA Timestamp: 2013-01-14 05:25 PM

Comment body:

Yes, full for t0 and then the deltas.
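
In OAI-PMH terms, a full harvest corresponds to a ListRecords request with no `from` argument, and a delta to a ListRecords request whose `from` argument is set to the time of the previous harvest. A rough Python sketch of building both kinds of request follows; the base URL, the `xc` metadata prefix, and the date are hypothetical values, not taken from this test setup.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(
    base_url: str, metadata_prefix: str, from_: Optional[str] = None
) -> str:
    """Build an OAI-PMH ListRecords URL; omit `from_` for a full harvest."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_ is not None:
        # Only records created, changed, or deleted since this datestamp.
        params["from"] = from_
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository and dates, for illustration only.
full_t0 = list_records_url("http://example.org/oai", "xc")
delta_t1 = list_records_url("http://example.org/oai", "xc", from_="2013-01-14")
```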

patrickzurek commented 7 years ago

JIRA Comment by user: Chris Delis (cedelis) JIRA Timestamp: 2013-01-15 12:05 PM

Comment body:

Attached is a zip file containing a full harvest T0, T1 deltas, and T2 deltas.

patrickzurek commented 7 years ago

JIRA Comment by user: rcook JIRA Timestamp: 2013-01-15 03:11 PM

Comment body:

Patrick, as you work with this, please bring [~mwesley] and [~pkiraly] in as needed.

patrickzurek commented 7 years ago

JIRA Comment by user: pkiraly JIRA Timestamp: 2013-01-15 03:27 PM

Comment body:

[~pzurek] I cannot access this site. Can you give us view access? The zip contains MARC records; I guess it was harvested by the MST, not Drupal, right?

[~pkiraly] The zip file contains OAI-PMH records to be consumed by Drupal. The first file is a full harvest (T0) and the next two files are deltas representing T1 and T2.

[~pzurek] The file contains MARC XML records (.../marc:record), and the Drupal Toolkit is not able to consume MARC XML. In XC it is the MST which consumes MARC XML and transforms it into XC schema records. Drupal consumes only those transformed XC schema records (and Dublin Core records as well), but not MARC XML. I don't know the history of this project, but I guess something has been mixed up.
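
One quick way to catch this kind of mix-up before harvesting is to check whether a file contains any `marc:record` elements. The following Python sketch is an assumption-laden illustration, not part of the MST or the Drupal Toolkit: the function name is hypothetical, and the only fixed value is the standard MARC21 slim namespace.

```python
import xml.etree.ElementTree as ET

# Standard namespace of MARCXML (MARC21 slim) records.
MARC_NS = "http://www.loc.gov/MARC21/slim"


def contains_marcxml(xml_text: str) -> bool:
    """Return True if any element in the document is a MARCXML record."""
    root = ET.fromstring(xml_text)
    marc_record_tag = f"{{{MARC_NS}}}record"
    return any(element.tag == marc_record_tag for element in root.iter())
```

A harvest file for which this returns True would need to go through the MST first, since Drupal only consumes XC schema and Dublin Core records.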

patrickzurek commented 7 years ago

JIRA Comment by user: Chris Delis (cedelis) JIRA Timestamp: 2013-01-15 03:48 PM

Comment body:

Sorry, [~pkiraly], the T0 set (fullharvest_t0.xml) was indeed plain MARCXML. My mistake. (The T1 and T2 diffs were correct). I uploaded a new t0_thru_t2.zip file that should now contain XC schema records.

patrickzurek commented 7 years ago

JIRA Comment by user: Patrick Zurek (patrickzurek) JIRA Timestamp: 2013-01-15 05:10 PM

Comment body:

The site was indeed down; I had Apache turned off over break. It's back up and will remain up.

patrickzurek commented 7 years ago

JIRA Comment by user: pkiraly JIRA Timestamp: 2013-02-06 04:20 PM

Comment body:

Hi, I wanted to check the site, but it seems that http://xc-zurek.carli.illinois.edu/xc_dev/ is down, or at least I get a Django error message. [~pzurek] could you give me some instructions on how I can check the site?

patrickzurek commented 7 years ago

JIRA Comment by user: Patrick Zurek (patrickzurek) JIRA Timestamp: 2013-02-07 05:53 PM

Comment body:

The site was down for the last week serving a different function. It's back up now and will remain up. I harvested T0 (but not T1 or T2).

patrickzurek commented 7 years ago

JIRA Comment by user: rcook JIRA Timestamp: 2013-04-01 12:26 PM

Comment body:

[~pzurek] and [~pkiraly] Where did this testing end up? Last I knew Peter was reporting success and getting different results than Patrick.

patrickzurek commented 7 years ago

JIRA Comment by user: rcook JIRA Timestamp: 2013-06-06 12:14 PM

Comment body:

MAS functionality will not be part of the 1.0 release, and perhaps not part of 1.1. Moving forward for now. This work depends on CARLI's involvement in using MAS output, so it should be deferred until they are able to participate, which I don't think will happen over the summer due to other work initiatives.