Closed misilot closed 4 years ago
Hi Tom,
There are a few ways you could do this. The clearest might be to create a custom call number type, subclassed from pycallnumber.units.LC, that doesn't use a cutter, and then add that to the list of call number types you want to detect when you call e.g. pycallnumber.callnumber.
Your custom class might look something like this, where you slice the groups from the main LC class to exclude the cutter and preceding period.
import pycallnumber as pycn

class LCNoCutter(pycn.units.LC):
    lc_groups = pycn.units.LC.template.groups
    definition = 'an LC call number without a Cutter'
    template = pycn.template.CompoundTemplate(
        separator_type=pycn.units.simple.DEFAULT_SEPARATOR_TYPE,
        # Keep the first group, then skip the period and the cutter.
        groups=lc_groups[0:1] + lc_groups[3:]
    )
Then, depending on what you're expecting in your data, you can create a list of types including LCNoCutter to pass via the optional unittypes kwarg when you call the various factory functions, like pycallnumber.callnumber. (Such as what's described here in the README.) As long as LCNoCutter appears before pycallnumber.units.Local, those call numbers should match that type instead of Local.
For instance, continuing the above code:
# Assume something is a valid LC call number, otherwise maybe an LC
# without the cutter, or treat it as a local call number if all else
# fails:
unittypes = [pycn.units.LC, LCNoCutter, pycn.units.Local]
pycn.callnumber('B355 1899', unittypes=unittypes)
pycn.callnumber('B355 1927a', unittypes=unittypes)
pycn.callnumber('B355 1960', unittypes=unittypes)
The type of the object you'll get back should be LCNoCutter, but it will behave like the LC type, just without the Cutter.
I hope that helps!
Thank you, this worked great! Now on to figuring out how to sort values.
Is it possible to improve sorting for these call numbers? They all seem to be sorted above all the LC values.
Thanks!
Hi @jthomale, it looks like sorting groups the call numbers by type instead of sorting each one individually? For example,
Type: <class 'pycallnumber.units.callnumbers.local.Local'> CN: E725.45 1st .W35 1998
Type: <class 'pycallnumber.units.callnumbers.local.Local'> CN: E725.45 10th .U53 1993
Type: <class 'pycallnumber.units.callnumbers.local.Local'> CN: E748v.T2 W5
Type: <class 'pycallnumber.units.callnumbers.local.Local'> CN: E806 .H67a
Type: <class 'pycallnumber.units.callnumbers.lc.LC'> CN: E11 .C691 no.1-2
Type: <class 'pycallnumber.units.callnumbers.lc.LC'> CN: E11 .C691 no.8-11 no.9-10
Type: <class 'pycallnumber.units.callnumbers.lc.LC'> CN: E11 .C691 no.8-11 no.11
Also, is there a way to have the class name show up in the following string instead of Local, no matter which custom class it matches on? <class 'pycallnumber.units.callnumbers.local.Local'>
Thanks!
Hi Tom,
It looks like there are a couple of things that may be going on here.
First, make sure you're including your custom unittypes list in every call to pycallnumber.callnumber. When I started testing out the example list in your last comment, I made the mistake of leaving it out and I got the exact same results you did. However, when I do include it and generate a sorted list, I get better results. (Still not perfect, and I'll talk about why in a minute.)
Assuming I'm just continuing the code from my earlier comment:
cn_strings = [
    'E725.45 1st .W35 1998',
    'E725.45 10th .U53 1993',
    'E748v.T2 W5',
    'E806 .H67a',
    'E11 .C691 no.1-2',
    'E11 .C691 no.8-11 no.9-10',
    'E11 .C691 no.8-11 no.11'
]
# Pass the custom unittypes list on every call so LCNoCutter is considered.
cn_objs = [pycn.callnumber(cn, unittypes=unittypes) for cn in cn_strings]
Then sorted(cn_objs) yields:
[
<Local 'E748v.T2 W5'>,
<LC 'E11 .C691 no.1-2'>,
<LC 'E11 .C691 no.8-11 no.9-10'>,
<LC 'E11 .C691 no.8-11 no.11'>,
<LCNoCutter 'E725.45 1st .W35 1998'>,
<LCNoCutter 'E725.45 10th .U53 1993'>,
<LCNoCutter 'E806 .H67a'>
]
So, it's an improvement but there are still a couple of oddities caused by irregularities in the data.
E748v.T2 W5 is still being interpreted as a Local call number. This is because the v.T2 immediately follows the LC class number with no space or other formatting. If it were E748.T2, it would interpret .T2 as a cutter. If it were E748 v.T2, it would interpret v.T2 as the item-specific portion. As it is, the string follows neither the LC nor the LCNoCutter pattern, so it's treated as a local call number.
The other three are being interpreted as LCNoCutter:
E725.45 1st .W35 1998: It looks like .W35 is the cutter, but the errant 1st between the class and the cutter is throwing it off, so it's treating everything after the class number as the item-specific information.
E725.45 10th .U53 1993: Exactly the same thing as the previous one.
E806 .H67a: In this case .H67a is clearly the cutter, and I think the a is throwing it off because the LC unit type isn't set to recognize work marks for its cutters, which is probably a mistake in pycallnumber. (Dewey cutters can include work marks, so not allowing that for LC cutters was probably an oversight on my part.)
The E748 and E725 call numbers make me think of what sometimes happens when LC call numbers are formatted into columns for spine labels and then recombined.
Also, yes: different types of call numbers sorted in the same list will tend to group together if the types have different rules for how their sort keys are generated. If you're curious to see exactly why things are sorting the way they are, you can call the for_sort method directly to see the sort key. For example, again continuing from above:
for cn in sorted(cn_objs):
    # The parenthesized form works under both Python 2 and Python 3.
    print(cn.for_sort())
yields:
e!0000000748!v!t!0000000002!w!0000000005
e!0011!c!691!!0000000001!0000000002
e!0011!c!691!!0000000008!0000000011!!0000000009!0000000010
e!0011!c!691!!0000000008!0000000011!!0000000011
e!0725.45!!0000000001!st!!w!0000000035!!0000001998
e!0725.45!!0000000010!th!!u!0000000053!!0000001993
e!0806!!h!0000000067!a
E.g., in this case, for Local call numbers, integers default to using a 10-digit zero-padded sort number, while LC classes are only 4 digits.
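To make the padding effect concrete, here is a minimal sketch in plain Python that doesn't use pycallnumber at all. The padded_key helper is hypothetical and only imitates the shape of the keys printed above (lowercased class letters, a "!" separator, and a zero-padded number); the widths 10 and 4 are taken from that output, not from the library's internals.

```python
def padded_key(letters, number, width):
    """Hypothetical helper: class letters plus a zero-padded number."""
    return '{}!{}'.format(letters.lower(), str(number).zfill(width))

# Widths 10 and 4 mirror the Local and LC keys shown in the output above.
local_key = padded_key('E', 748, 10)   # shaped like the Local key for E748
lc_key = padded_key('E', 11, 4)        # shaped like the LC key for E11

# Keys are compared as strings, so differing widths put 748 ahead of 11.
mixed_widths = sorted([lc_key, local_key])

# With one shared width, string order agrees with numeric order again.
same_width = sorted([padded_key('E', 748, 4), padded_key('E', 11, 4)])

print(mixed_widths)
print(same_width)
```

Because the keys are compared character by character, the longer run of leading zeros in the 10-digit key sorts first, which is why the Local call number lands ahead of the LC ones in the list above.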
Thank you! This helped a lot. I happened to not be including the unittypes when I was sorting the array of call numbers.
Thanks again!
Hello,
Is there a way to classify call numbers without cutters as LC, so I can access the classification function?
These all return Local instead of LC.
Thank you! Tom