CARLI / web-reports

Web Reports Web Based Reporting Tool

Report "Find 020's with multiple $a's" OC21 regular expression #103

Closed gibsonjc closed 5 years ago

gibsonjc commented 8 years ago

Find 020's with multiple $a's (Local Cat Maint > Bibliographic Records: Control Numbers)

gibsonjc initially reported: In KNXdb on devel, web-reports is returning zero results, but in Access this query returns 2924 results.

patrickzurek commented: I'm troubleshooting "Find 020's with multiple $a's" (Local Cat Maint > Bibliographic Records: Control Numbers). Jessica, can you please post some matches for this query from Access or (preferably) email me the Access file if it's not too large (if needed, you can right-click on the file in macOS and click compress...)? That would be a great help.

patrickzurek commented : This is about "Find 020's with multiple $a's (Local Cat Maint > Bibliographic Records: Control Numbers)."

Chris, did you craft the regexp used in the query yourself, or did you get it from somewhere else? I'm no expert on regular expressions by any means, but I think I see a few problems with it (for one, I don't think the anchors belong there?). Unfortunately, my attempt suffers from a problem too. With mine I get 2,925 results compared to Jessica's 2,924. I used ([0-9]{9}).*{2,} I don't think this one is truly correct, since it will only match on double ISBNs, not entries with 3 or more.

patrickzurek commented: Oops, I copy-pasted the wrong regexp. The one I actually used, and meant to paste, was: '([0-9]{9}).*([0-9]{9})')

patrickzurek commented : I'm making typos left and right. In my last post I left an errant parenthesis on the right end of the string that shouldn't have been there.

Anyway, I think I may have come up with one that should match 2 or more ISBNs: ([0-9]{9}).([0-9]{9}).{1,}

It still returns the same number of results for KNX: 2,925. But Jessica, could that be because there simply aren't any 020 fields that contain more than 2 ISBNs in KNXdb? I ran the query on UIU and found some results like:

8470900315 (set) 8470900323 (v.1) 847090048X (v.2) 3487041766 (v.1) 3487041774 (v.2) 3487041782 (v.3)

gibsonjc commented: Just sent you the KNXdb on devel Access output via email (Excel spreadsheet).

KNXdb does have bibs with more than two 020 $a's like this one that can be found in both the Access and the web-reports output: BIB ID 1689 0873952855 0873952863 (pbk.) 0873952871 (micro.)

patrickzurek commented: I was confused about why the first regex I posted would return bibs with more than two ISBNs when I expected it to match two and only two ISBNs. But I made a really dumb mistake in my regexp: ".*"

The error is obviously present in ([0-9]{9}).([0-9]{9}).{1,} too.

I'll work on refining it further.
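For anyone following along, here is a minimal sketch of why the ".*" version matches fields with two or more ISBNs rather than exactly two. It uses Python's `re` module purely for illustration; the thread does not say which regex engine the report query runs on, so behavior there may differ. The sample strings are taken from (or cut down from) the BIB ID 1689 example above, and `has_pair`, `count_isbn_like`, and `ISBN10_LIKE` are hypothetical names, not part of web-reports.

```python
import re

# Pattern quoted in the thread. Using Python's `re` is an assumption made
# for illustration; the report's real regex engine is not stated.
PAIR = re.compile(r"([0-9]{9}).*([0-9]{9})")

# Hypothetical helper pattern: 9 digits followed by a digit or X,
# i.e. something shaped like a 10-character ISBN.
ISBN10_LIKE = re.compile(r"[0-9]{9}[0-9Xx]")


def has_pair(field):
    """True if the field contains two 9-digit runs with anything between them."""
    return PAIR.search(field) is not None


def count_isbn_like(field):
    """Count non-overlapping ISBN-10-shaped tokens in the field."""
    return len(ISBN10_LIKE.findall(field))


one = "0873952855"
two = "0873952855 0873952863 (pbk.)"
three = "0873952855 0873952863 (pbk.) 0873952871 (micro.)"

for label, field in [("one", one), ("two", two), ("three", three)]:
    print(label, has_pair(field), count_isbn_like(field))
```

Because ".*" can span any number of characters, including further ISBNs, the pattern only requires that at least two 9-digit runs exist somewhere in the field. Telling "exactly two" apart from "three or more" needs a count, as in the hypothetical `count_isbn_like`, which itself assumes 10-character ISBNs and would need adjusting for 13-digit ones.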

[This was split out from Issue #94 ]

gibsonjc commented 5 years ago

This works correctly as is. I tested on KNX, NCC, HRT, and UIU. Web Reports Devel found a few additional cases that the Access report did not find! This is Ready for Prod.

gibsonjc commented 5 years ago

On Prod.