hathitrust / hathifiles

Generation of Hathifiles

TTO-207 investigate discrepancy in hathifiles full vs. incremental #42

Closed (moseshll closed 2 months ago)

moseshll commented 2 months ago

The cutoff variable used in generate_hathifile.rb currently excludes HTIDs that arguably should be treated as modified because of changes in the host catalog record. This patch removes that variable and treats every HTID on the record as affected by the record change, so all of them are included in the resulting hathifile regardless of their 974(d) update date.
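To illustrate the behavior change, here is a minimal sketch. It is not the code from generate_hathifile.rb: the Item struct, both method names, the sample HTIDs, and the ">=" comparison against the cutoff are assumptions made only for this example. The update_date field stands in for the 974(d) update date.

```ruby
require "date"

# Hypothetical stand-in for one HTID on a catalog record.
# update_date plays the role of the 974(d) update date.
Item = Struct.new(:htid, :update_date)

# Old behavior as described in this PR: only HTIDs whose own update date
# is at or after the cutoff are written out. The exact comparison used by
# the real script is an assumption here.
def htids_with_cutoff(items, cutoff)
  items.select { |item| item.update_date >= cutoff }.map(&:htid)
end

# New behavior: the record itself changed, so every HTID on it is included.
def htids_without_cutoff(items)
  items.map(&:htid)
end

record_items = [
  Item.new("test.001", Date.new(2020, 1, 15)), # item untouched since 2020
  Item.new("test.002", Date.new(2024, 6, 1))   # item updated recently
]
cutoff = Date.new(2024, 5, 31)

p htids_with_cutoff(record_items, cutoff)  # => ["test.002"]
p htids_without_cutoff(record_items)       # => ["test.001", "test.002"]
```

In this sketch, test.001 is dropped by the cutoff version even though a change to the host record could alter its hathifile line. The version without the cutoff picks it up.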

Tests should be refactored at some point but I consider it out of scope for a TTO issue.

Includes a completely unrelated added test for update_hathifile_listing.rb (it "removes existing files that are too old" do ...) since Coveralls got irked at the slight decrease in test coverage due to the code deletion that is the main focus of this PR. Are you happy now, Coveralls??

coveralls commented 2 months ago


coverage: 100.0% (+0.2%) from 99.791% when pulling 42eda5954d1a0fa1b85fbd9a09b95a6eacee3859 on TTO-207_remove_cutoff into 756c6e41dced371770f35fb9fb476890220f7f33 on main.

moseshll commented 2 months ago

Refactoring issue added as suggested, along with the all-important "DRY" badge for future issues!