eXtensibleCatalog / Metadata-Services-Toolkit

Tools for processing and aggregating metadata

Do we need 'accurate' leader information in new output record? #366

Open patrickzurek opened 8 years ago

patrickzurek commented 8 years ago

JIRA issue created by: jbrand Originally opened: 2012-07-18 08:15 AM

Issue body:

One other question/observation: It seems that the Leader information for the "record of source" is copied as "static content" into the Output Record. This was handy for helping me identify which record in the set was actually the record of source. But the Output Record is a new record, and it will almost certainly contain more data than the "record of source" because it will have additional fields, like 035s for other records, added into it. So it seems odd that the Output Record would keep the Leader/00-04 byte value. If the Leader byte size is no longer accurate for the new Output Record, will that cause any problems elsewhere? What if these records are extracted or turned back into MARC?

patrickzurek commented 8 years ago

JIRA Comment by user: jbrand JIRA Timestamp: 2012-07-18 11:10 AM

Comment body:

The above was a concern raised by [~jgibson]. [~jbowen] comments? Certainly we could generate an updated byte size from the byte size of the XML string for the new output record. Would this be more appropriate?
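As a rough illustration of the proposal above, here is a minimal Python sketch that overwrites Leader/00-04 with the UTF-8 byte size of the record's XML string. The function name `refresh_leader_length` and the regex-based approach are assumptions made for this example, not MST code, and whether the XML string size is the right value for these positions is exactly the question being asked in this issue.

```python
import re

# Matches a 24-character leader element, with or without a namespace prefix
# such as <marc:leader>. This pattern is an assumption for the sketch.
LEADER_RE = re.compile(r"(<(?:\w+:)?leader>)(.{24})(</(?:\w+:)?leader>)")


def refresh_leader_length(xml_text: str) -> str:
    """Return xml_text with Leader/00-04 set to the record's UTF-8 byte size.

    Leader/00-04 is a fixed-width, zero-padded five-digit field, so rewriting
    it does not change the overall size and the size can be measured once.
    If no 24-character leader is found, the text is returned unchanged.
    """
    size = len(xml_text.encode("utf-8"))
    if size > 99999:
        raise ValueError("record size does not fit in five digits")

    def fix(match: re.Match) -> str:
        old_leader = match.group(2)
        return match.group(1) + f"{size:05d}" + old_leader[5:] + match.group(3)

    return LEADER_RE.sub(fix, xml_text, count=1)


if __name__ == "__main__":
    record = (
        "<record><leader>01234nam a2200301 a 4500</leader>"
        '<controlfield tag="001">2523</controlfield></record>'
    )
    print(refresh_leader_length(record))
```

Records larger than 99999 bytes cannot be represented in five digits, so the sketch raises an error rather than guessing what to write.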

patrickzurek commented 8 years ago

JIRA Comment by user: jbowen JIRA Timestamp: 2012-07-18 01:21 PM

Comment body:

Yes, I think the byte size in the leader in the output record should be updated so that it is accurate. Good catch, [~jgibson].

patrickzurek commented 8 years ago

JIRA Comment by user: rcook JIRA Timestamp: 2012-07-23 02:56 PM

Comment body:

Note that there are two related issues: this one covers the leader update in MAS, and issue 544 covers Norm.

Note, however, that if issue 544 updates the Norm byte size, the Norm output record will be the biggest, which is the same record John is currently picking. What you really want is for the byte size of the Norm input records to be used, not the Norm output. Right?

patrickzurek commented 8 years ago

JIRA Comment by user: jbowen JIRA Timestamp: 2012-08-24 11:01 AM

Comment body:

I think what we want is that the Leader of the OUTPUT record accurately reflects the size of that output record.

patrickzurek commented 8 years ago

JIRA Comment by user: rcook JIRA Timestamp: 2013-01-30 11:49 AM

Comment body:

[~jgibson] Please advise on the relative importance of this for you so that it can be prioritized.

patrickzurek commented 8 years ago

JIRA Comment by user: Jessica Gibson (gibsonjc) JIRA Timestamp: 2013-02-05 03:48 PM

Comment body:

I'm not sure how important this is; I just noticed it happening and wondered whether it was intended for the static content. I don't really know enough about MARCXML to know what the number should or should not be. If the Leader in a MARCXML record is supposed to reflect the XML record's current size, then it seems to me that the number should be updated in the output record generated by any service that produces MARCXML records.

As far as I know, MAS Merging uses the XML record size to determine the Record of Source; it was already written this way and we didn't change it. (I believe that is the XML record size of the record after going through Normalization, since Norm precedes Aggregation.) I believe that the XML record size is held somewhere internal to the system and not displayed in the MST, as I can never find the Size that is shown in the MAS logs anywhere in the record itself. For example, when the log says * Matchset In: { 2447, [LDR/17: ; Size: 11105] [Record of Source**] 2455, [LDR/17: ; Size: 11048] }, Out: 2523 the numbers 11105 and 11048 aren't anywhere in the record that I can browse in the MST.
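For anyone who wants to cross-check those Size values, here is a small Python sketch that pulls the record IDs and sizes out of a Matchset log line and picks the largest. The regex is inferred from the single example line above, so the real log format may vary, and "largest size wins" is only the selection rule as described in this comment, not confirmed against the MAS source. The names `parse_matchset` and `largest_record` are made up for this example.

```python
import re

# One entry looks like: 2447, [LDR/17: ; Size: 11105]
# Pattern inferred from the one log line quoted in this thread (assumption).
ENTRY_RE = re.compile(r"(\d+),\s*\[LDR/17:\s*(.*?);\s*Size:\s*(\d+)\]")


def parse_matchset(line):
    """Return a list of (record_id, size) pairs found in a Matchset log line."""
    return [(int(rid), int(size)) for rid, _ldr17, size in ENTRY_RE.findall(line)]


def largest_record(entries):
    """Return the record id with the largest size (the assumed selection rule)."""
    return max(entries, key=lambda entry: entry[1])[0]


if __name__ == "__main__":
    line = (
        "Matchset In: { 2447, [LDR/17: ; Size: 11105] [Record of Source**] "
        "2455, [LDR/17: ; Size: 11048] }, Out: 2523"
    )
    entries = parse_matchset(line)
    print(entries)
    print(largest_record(entries))
```

On the example line, the largest record is 2447, which matches the one the log marks as the Record of Source.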

(Just for kicks and my own personal interest, today I took a MARCXML record from the MST that had gone through Normalization and Aggregation and converted it to MARC again using the MarcEdit program. MarcEdit seems to be smart enough to update the Leader size for the MARC record during that conversion process.)