pulibrary / bibdata

Local API for retrieving bibliographic and other useful data from Alma (Ruby 3.2.0, Rails 7.1.3.4)
BSD 2-Clause "Simplified" License

Full reindex of SCSB looks to old pathname #2287

Closed: christinach closed this issue 9 months ago

christinach commented 10 months ago

Issue

For old files that exist in /data/bibdata_files/scsb_update_files, the indexer tries to read them from /data/marc_liberation/scsb_update_files, which is the old pathname.
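As a rough illustration of the mismatch, the sketch below flags records whose stored path lies outside the current directory. `DumpFileRecord`, `CURRENT_DIR`, and `stale_path_records` are hypothetical names for this example, not bibdata's actual model or API, and the first sample path is made up.

```ruby
# Illustrative stand-in for a database record; the real model in bibdata may differ.
DumpFileRecord = Struct.new(:id, :path)

# Current directory, as named in the issue text.
CURRENT_DIR = "/data/bibdata_files/scsb_update_files"

# Returns the records whose stored path lies outside the current directory.
def stale_path_records(records, current_dir = CURRENT_DIR)
  prefix = current_dir.end_with?("/") ? current_dir : "#{current_dir}/"
  records.reject { |record| record.path.start_with?(prefix) }
end

records = [
  # Hypothetical record that already uses the current directory.
  DumpFileRecord.new(1, "/data/bibdata_files/scsb_update_files/example_update.xml.gz"),
  # Path taken from the record quoted later in this thread.
  DumpFileRecord.new(2057, "/data/marc_liberation_files/scsb_update_files/scsb_update_20210813_072300_1.xml.gz")
]

stale_path_records(records).each { |record| puts "#{record.id}: #{record.path}" }
```

Comparing against the current prefix, rather than matching one specific old prefix, catches any outdated location.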

Impact:

Sidekiq will keep retrying jobs whose path does not exist until the jobs are deleted, either by Sidekiq itself or by an admin user.
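One possible mitigation, sketched here as plain Ruby and not taken from bibdata's job code, is to check that the file exists before indexing it. Sidekiq retries a job when it raises an exception, so a guard that returns normally for a missing file would avoid the retry loop. `index_if_present` and the `missing.xml.gz` filename are hypothetical, and whether skipping quietly is better than failing loudly is a judgment call.

```ruby
require "tempfile"

# Hypothetical guard, not bibdata's actual job code.
# Returns :skipped without yielding when the file is missing, so the caller
# can finish normally instead of raising.
def index_if_present(path)
  unless File.file?(path)
    warn "Skipping #{path}: file not found"
    return :skipped
  end

  yield path
  :indexed
end

# A path under the old directory (made-up filename) is skipped.
missing = "/data/marc_liberation/scsb_update_files/missing.xml.gz"
puts index_if_present(missing) { |_path| raise "should not run" }

# A file that exists is passed to the block.
Tempfile.create("scsb_update") do |file|
  puts index_if_present(file.path) { |path| path }
end
```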

christinach commented 10 months ago
kevinreiss commented 10 months ago

Christina is triggering another import today.

christinach commented 10 months ago
  1. We see the old path because the database still contains very old events whose dump_files have that path. We should delete them from the database; I will proceed with deleting them. Example record:
    id: 2057,
    dump_id: 56,
    path: "/data/marc_liberation_files/scsb_update_files/scsb_update_20210813_072300_1.xml.gz",
    md5: "59a2bcd37b6c8da52accdac78342467e",
    created_at: Wed, 18 Aug 2021 19:46:57.145286000 UTC +00:00,
    updated_at: Thu, 23 Mar 2023 20:21:00.157730000 UTC +00:00,
    dump_file_type_id: 3,
    index_status: "enqueued">
  2. Next step: find out why a full SCSB reindex picks up old SCSB update files. It should index the latest full partner SCSB dump file plus the daily SCSB dump files created after that full dump.
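
The expected selection in step 2 can be sketched as follows. This is a minimal illustration under assumptions, not the project's implementation: the `Dump` struct, the `:full` / `:update` labels, and all sample dates are invented for the example.

```ruby
# Illustrative stand-in; field names and type labels are assumptions.
Dump = Struct.new(:id, :kind, :created_at)

# Returns the latest full dump followed by every update dump created after it,
# oldest update first. Returns an empty array when there is no full dump.
def dumps_for_full_reindex(dumps)
  latest_full = dumps.select { |d| d.kind == :full }.max_by(&:created_at)
  return [] if latest_full.nil?

  updates = dumps
    .select { |d| d.kind == :update && d.created_at > latest_full.created_at }
    .sort_by(&:created_at)
  [latest_full, *updates]
end

# Made-up sample data: two full dumps and several daily updates.
dumps = [
  Dump.new(1, :update, Time.utc(2021, 8, 13)),
  Dump.new(2, :full,   Time.utc(2023, 1, 1)),
  Dump.new(3, :update, Time.utc(2023, 1, 2)),
  Dump.new(4, :full,   Time.utc(2023, 2, 1)),
  Dump.new(5, :update, Time.utc(2023, 2, 3)),
  Dump.new(6, :update, Time.utc(2023, 2, 2))
]

p dumps_for_full_reindex(dumps).map(&:id) # => [4, 6, 5]
```

Updates older than the latest full dump (ids 1 and 3 here) are left out, which is the behavior the reindex is expected to have.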
christinach commented 9 months ago

Follow-up ticket to look into this: https://github.com/pulibrary/bibdata/issues/2294