This PR focuses on improving the quality of records accepted by NightShift.
We ran the app locally and processed a backlog of OverDrive MarcExpress records in the BPL catalog (90k records). This allowed us to review the results and make adjustments. I tightened the queries to reject records that we can be fairly certain are of poor quality. Additionally, after full, matching MARC XML records are obtained, the program checks whether they meet certain minimum criteria. Sometimes existing MarcExpress records are simply better than what can be found in Worldcat.
Details:
added two tables, RottenApple and RottenAppleResource, to store info on organizations that provide notoriously bad records in Worldcat
query changes:
exclusion of results with encoding level 3 (MARC leader pos. 17) from all queries
exclusion of results with encoding level M for e-video
exclusion of records contributed by two particular orgs
refactored methods in tasks.py into a class, tasks.Tasks, as recommended earlier
increased size of local log files
made small changes to the manipulation of Worldcat records
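
For reviewers, the query exclusions listed above can be read as a single rule set. The snippet below is only a rough sketch of that rule set, written as a local predicate rather than as the actual Worldcat query constraints used in this PR. The function name, its parameters, and the org symbols in the usage example are hypothetical and do not come from the NightShift code.

```python
from typing import Iterable

# MARC leader position 17 (0-based) holds the encoding level.
ENCODING_LEVEL_POS = 17


def passes_exclusions(
    leader: str,
    source_org: str,
    is_evideo: bool,
    excluded_orgs: Iterable[str],
) -> bool:
    """Hypothetical helper: True if a record survives the exclusions above."""
    if len(leader) <= ENCODING_LEVEL_POS:
        # Leader too short to carry an encoding level; treat as unusable.
        return False
    level = leader[ENCODING_LEVEL_POS]
    if level == "3":
        # Encoding level 3 is excluded for all resource types.
        return False
    if is_evideo and level == "M":
        # Encoding level M is excluded for e-video only.
        return False
    if source_org in set(excluded_orgs):
        # Records contributed by the flagged organizations are excluded.
        return False
    return True
```

Example use, with placeholder org symbols:

```python
leader = "00000cam a2200000" + "3" + "a 4500"  # encoding level 3
passes_exclusions(leader, "AAA", False, ["XXX", "YYY"])  # False
```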