Open dazza-codes opened 7 years ago
The history on this is that the Casalini identifiers used to be in the form "it xxxxxx" and now they are "itxxxxx" without the space.
So, yes, you'll see both forms. Does the converter have to translate the space to a _? If that's our only option, AND if it's going to be a problem if we need that to get back to the Symphony record, then we should probably (some day) clean up the Symphony records and get rid of the space.
In general, a space in a Unix file name is awkward to work with because it always needs to be escaped or quoted. At present, the conversion from MARC to XML writes out one file per record, and the 001 field was chosen as the source of unique file names. Given that choice and the Unix difficulty with spaces in file names, the only modification made to the 001 value is replacing a space with an underscore.
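For illustration only, the file naming rule described above can be sketched in a few lines of Python. This is not the converter's actual code: the function name `record_filename` and the `.xml` default are hypothetical, and the sketch assumes the 001 value is already available as a plain string.

```python
def record_filename(control_number: str, extension: str = ".xml") -> str:
    """Build an output file name from a MARC 001 value.

    Hypothetical helper: the only change made to the identifier is
    replacing each space with an underscore, as described in the thread.
    """
    return control_number.replace(" ", "_") + extension


if __name__ == "__main__":
    # Older Casalini form with a space, newer form without one.
    for control_number in ("it 9932055x", "it15112659"):
        print(control_number, "->", record_filename(control_number))
```

Because the underscore only ever stands in for a space, the original 001 value could be recovered from the file name by reversing the substitution, provided no identifier contains an underscore of its own.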
Another slight anomaly: there is a record identifier with an x at the end from the casalini0.mrc file, i.e.
/ld4p_data/Dataload/LD4P/MarcXML/it_9932055x.xml
There might be other identifier anomalies, e.g. most identifiers beginning with "it" have a space in them, which is translated to _ in the file name, but this one for it15112659 does not: