scriptotek / mc2skos

Command line script for converting Marc21 Classification and Authority records to SKOS/RDF
The Unlicense

Don't stop processing if captions are found in 153 #37

Closed: nichtich closed this issue 6 years ago

nichtich commented 6 years ago

In https://github.com/scriptotek/mc2skos/blob/master/mc2skos/record.py#L513, processing of field 153 stops as soon as a caption is found. I have some classification records where subfield j is followed by subfield e, so they cannot be processed. In my branch I changed this to:

if code in ['h', 'j', 'k', '6', '8']:
    # Ignore captions
    continue

elif code not in ['a', 'c', 'e', 'f', 'z', 'y']:
    # We expect everything else to be captions or notes, like in the example in
    # test_153::TestParse153::testComplexEntryWithUndocumentStuff
    break

and this passes the current tests. But why break at all? Could we change the break to continue and remove the test?
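To make the difference concrete, here is a minimal, self-contained sketch of the loop. It is not the actual mc2skos code: the function name `processed_codes`, the `stop_on_caption` flag, and the sample subfield values are all hypothetical and exist only to compare the two behaviours on a field where j is followed by e.

```python
# Hypothetical sketch, not the mc2skos implementation.
CAPTION_CODES = {'h', 'j', 'k', '6', '8'}
NUMBER_CODES = {'a', 'c', 'e', 'f', 'z', 'y'}


def processed_codes(subfields, stop_on_caption):
    """Return the subfield codes the loop actually gets to handle.

    subfields is a list of (code, value) pairs in field order.
    stop_on_caption=True mimics the behaviour described in the issue
    (stop at the first caption); False mimics the patched branch.
    """
    handled = []
    for code, value in subfields:
        if code in CAPTION_CODES:
            if stop_on_caption:
                break      # stop processing at the first caption
            continue       # skip the caption and keep going
        elif code not in NUMBER_CODES:
            break          # anything else is treated as trailing notes
        handled.append(code)
    return handled


# Made-up sample data: a caption (j) followed by a hierarchy number (e).
subfields = [('a', '003.5'), ('j', 'Some caption'), ('e', '003')]

print(processed_codes(subfields, stop_on_caption=True))
print(processed_codes(subfields, stop_on_caption=False))
```

With stopping enabled, the e subfield after the caption is never reached; with the patched logic it is handled.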

danmichaelo commented 6 years ago

Agree that the solution is not a good one. A better solution could be to only add $c if it follows $a. Then the test should still pass without any break/continue stuff.
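A rough sketch of that idea, assuming the subfields are available as (code, value) pairs in field order. The function name `parse_notation` and the sample values are hypothetical, and the real parser in record.py handles more subfields than this:

```python
# Hypothetical sketch, not the mc2skos implementation.
def parse_notation(subfields):
    """Collect $a, plus $c only when it directly follows $a.

    subfields is a list of (code, value) pairs in field order.
    No break/continue is needed: a $c in any other position is
    simply never picked up.
    """
    notation = []
    previous = None
    for code, value in subfields:
        if code == 'a':
            notation.append(value)
        elif code == 'c' and previous == 'a':
            notation.append(value)
        previous = code
    return notation


# Made-up sample data: the second $c follows a caption, not $a.
subfields = [('a', '100'), ('c', '199'), ('j', 'Some caption'), ('c', 'stray')]
print(parse_notation(subfields))
```

Here the first $c is kept because it follows $a, while the $c that comes after the caption is ignored.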

But if the example in testComplexTableEntryWithUndocumentStuff is really invalid MARC21 Classification, it would also be better to have it fixed and remove the test. I will start by asking the NO editorial team what they think about the example.

danmichaelo commented 6 years ago

The example used in the test has since been fixed, so I think it's safe to remove this. And if I find new cases, I can try a better fix like adding $c only if it follows $a.