Open ahafele opened 7 months ago
select * from sul_mod_source_record_storage.records_lb where external_id = '11bf23c4-82a8-4466-96cf-993a81310a3c';
returns 0 results. I then looked in MetaDB for the instance ID in the folio_source_record.marc__t table and it also returned 0 results. I don't think this one has a source record. Oh yeah, it's in my list of instances missing SRS on ticket https://github.com/sul-dlss/libsys-airflow/issues/838 .
As for the skipping in the log, records 356 then 359, what happened to 357 and 358? https://folio.stanford.edu/data-import/job-summary/f5a82d0b-ebe3-435f-8dae-947ebc5c882b Is this maybe where a duplicate record was in the input file? 🤷
No duplicates in the file from what I can tell.
Jeanette and Yael have reported some discrepancies in data import loads. The numbers in the logs are not consistent, and sometimes records error out but load just fine on subsequent loads. Some records are skipped altogether; you only know this by scanning the data import log, and on subsequent loads that same record will load fine. I'm including information here about one particular file in hopes that @shelleydoljack could look at the logs for some clues. Other problem loads are being tracked here.
Springer file troubleshooting
811 records in file
Profile should create new or update if match is found
Production loads (Nolana)
Initial load https://folio.stanford.edu/data-import/job-summary/51fe7eb9-1536-4501-b775-de1ddae6864f
Resent from VMA to prod
Main log - 500, 389 records found https://folio.stanford.edu/data-import/job-summary/5796cbe6-aef0-4a76-97b9-9933a2123270
Main log - 311, 264 records found https://folio.stanford.edu/data-import/job-summary/4477b856-9c7c-4faf-badd-84f18d53c169
653 total
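For anyone tallying these up, here is a rough sketch of the arithmetic for the VMA resend. It assumes the first number on each "Main log" line is the count the job reported and the second is the number of records actually listed in the log; that reading, and the names `vma_loads` and `reconcile`, are illustrative only and not from FOLIO.

```python
# Counts copied from the two VMA job summaries above.
# Assumption: each tuple is (count reported by the main log, records found in the log).
vma_loads = [(500, 389), (311, 264)]


def reconcile(loads):
    """Return (total reported, total found, shortfall) for a list of load parts."""
    reported = sum(r for r, _ in loads)
    found = sum(f for _, f in loads)
    return reported, found, reported - found


reported, found, shortfall = reconcile(vma_loads)
print(f"reported={reported} found={found} unaccounted={shortfall}")
```

Under that reading, the two parts add up to the 811 records in the file, but only 653 appear in the logs, leaving 158 unaccounted for.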
Resent direct to data import
Main log - 811, 665 records found
Scanning the log, 356 jumps to 359, indicating that 357 and 358 have been skipped. Any clues in the backend logs as to why? 357 loaded fine in the previous load.
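Rather than scanning by eye, the skipped record numbers could be listed with something like the sketch below. It assumes the record sequence numbers have already been copied out of the data import log into a list; `find_gaps` is a hypothetical helper, not part of FOLIO, and how the numbers get extracted is left open.

```python
def find_gaps(seen):
    """Return the record numbers missing between the lowest and highest number seen."""
    ordered = sorted(set(seen))
    gaps = []
    # Compare each number with its successor and collect anything in between.
    for prev, cur in zip(ordered, ordered[1:]):
        gaps.extend(range(prev + 1, cur))
    return gaps


# Illustrative input only: a short slice around the jump described above.
print(find_gaps([355, 356, 359, 360]))  # [357, 358]
```

This only finds gaps between numbers that do appear, so records missing from the very start or end of the log would still need a check against the expected total of 811.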
Errors are showing as
io.vertx.core.impl.NoStackTraceThrowable: Timeout
in the DI log, but based on a Slack convo I saw, I checked whether an error/discarded record was actually updated, and it was: #292 Practice and theory of automated timetabling V… https://folio.stanford.edu/data-import/job-summary/f5a82d0b-ebe3-435f-8dae-947ebc5c882b?errorsOnly=true The instance was last updated at 12:48 pm, which corresponds to the time of this load. When we tried to view source, the record "broke" and is now disconnected from the MARC. This corresponds to what I saw in this Slack thread from 2 years ago.
Test loads (Poppy)
Initial load https://folio-test.stanford.edu/data-import/job-summary/39ced836-cbbd-41a9-b6d1-a15be0f2856d 811 records found
Resent, no issues - https://folio-test.stanford.edu/data-import/job-summary/adf6e4d5-7e5a-4788-9ff7-c2d4bd75ba50 811 records found