Closed lydiam closed 3 months ago
On Wed, Jan 21, 2015 at 10:31 AM, Lydia Motyka notifications@github.com wrote:
So I believe that the answer is "yes" - delete the longer structMap that refers to PDFs.
At this point I don't believe that it's worth modifying the program to take into consideration the file format referenced in the structMap when evaluating structMaps.
I do test the 'use=' attribute (reference, index, archive) and the mimetype (image is weighted more heavily than text), but I first use length, which is really the number of files referenced.
So it's easy to re-arrange this, but I'd rather you file it as a GitHub issue referencing this example package (if you can attach the METS file, that would be ideal).
-Randy
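To make the ordering question concrete, here is a minimal Ruby sketch of the two rankings being discussed. It is not the code in packages.rb: `StructMapSummary`, `mimetype_score`, and both `pick_*` methods are hypothetical names, the numeric weights are assumptions, and the 'use=' test is left out for brevity.

```ruby
# Hypothetical sketch; none of these names come from offline-ingest.
# Each summary lists the mimetypes of the files one structMap references.
StructMapSummary = Struct.new(:label, :mimetypes)

# Assumed weighting: image outranks text; anything else (e.g. PDF) scores lowest.
def mimetype_score(summary)
  scores = summary.mimetypes.map do |type|
    case type
    when %r{\Aimage/} then 2
    when %r{\Atext/}  then 1
    else 0
    end
  end
  scores.max || 0
end

# Ordering as described above: number of files first, mimetype as tie-breaker.
def pick_by_length_first(summaries)
  summaries.max_by { |s| [s.mimetypes.length, mimetype_score(s)] }
end

# Re-arranged ordering: mimetype first, number of files as tie-breaker.
def pick_by_mimetype_first(summaries)
  summaries.max_by { |s| [mimetype_score(s), s.mimetypes.length] }
end

# Stand-ins for the example package: 2 JPG references versus 3 PDF references.
jpg_map = StructMapSummary.new('jpg', ['image/jpeg', 'image/jpeg'])
pdf_map = StructMapSummary.new('pdf', ['application/pdf'] * 3)

puts pick_by_length_first([jpg_map, pdf_map]).label
puts pick_by_mimetype_first([jpg_map, pdf_map]).label
```

With length compared first, the three-file PDF structMap wins; with mimetype compared first, the two-file JPG structMap wins, which is the behaviour the example package would need.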
I can’t figure out a way to attach files to GitHub. Any suggestions would be appreciated.
From: Randy Fischer [mailto:notifications@github.com] Sent: Wednesday, January 21, 2015 12:25 PM To: FLVC/offline-ingest Cc: Lydia Motyka Subject: Re: [offline-ingest] package program discards shorter JPG structMap in favor of longer PDF structMap (#20)
This is a GitHub issue – do you want another one?
See the mets.xml file from FIU, /ssa/d2i/FIU_FEOL_books_dumpB/FI06050102/
I'm not sure that this is worth modifying in code, but I think it's worth noting.
/ssa/d2i/FIU_FEOL_books_dumpB/FI06050102, when test-loaded, gives the following messages:
[lydiam@tlhlxftp01-prd FI06050102]$ package --test --server fiu7prod /ssa/d2i/FIU_FEOL_books_dumpB/FI06050102
Processing 1 package: /ssa/d2i/FIU_FEOL_books_dumpB/FI06050102
Invalid package in /ssa/d2i/FIU_FEOL_books_dumpB/FI06050102.
0.00 sec, 0.00 MB BookPackage::FI06050102 (no pid) => collection: fiu:feol, palmm:feol, "Fire Careers, Adventures For Your Life!"
Errors:
The Book package FI06050102 is missing the following 1 required file declared in the mets.xml file:
Exception TypeError - can't convert nil into String for Book package FI06050102, backtrace follows:
/usr/local/islandora/offline-ingest/lib/offin/packages.rb:964:in `+'
/usr/local/islandora/offline-ingest/lib/offin/packages.rb:964:in `reconcile_file_lists'
/usr/local/islandora/offline-ingest/lib/offin/packages.rb:964:in `map'
/usr/local/islandora/offline-ingest/lib/offin/packages.rb:964:in `reconcile_file_lists'
/usr/local/islandora/offline-ingest/lib/offin/packages.rb:844:in `initialize'
/usr/local/islandora/offline-ingest/lib/offin/packages.rb:46:in `new'
/usr/local/islandora/offline-ingest/lib/offin/packages.rb:46:in `new_package'
/usr/local/bin/package:52
/usr/local/bin/package:48:in `each'
/usr/local/bin/package:48

There are 2 structMaps in the mets, one that references 2 JPGs, and one that references 3 PDFs. I believe that the program is discarding the JPG structMap and then erroring out.
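The backtrace points at a String `+` inside a `map` block, and the error message itself is what Ruby reports when a string is concatenated with nil. The sketch below reproduces that class of failure and shows one guarded alternative. It is only an illustration under assumptions: the `expected` list, the file name, and the placeholder text are hypothetical, and the actual code at packages.rb:964 is not shown in this issue.

```ruby
# Hypothetical stand-in for a list of file names built from the METS file;
# the nil models a reference that could not be resolved to a name.
expected = ['page1.jpg', nil]

error_class = nil
begin
  # String#+ raises TypeError when its argument is nil.
  expected.map { |name| ' - ' + name }
rescue TypeError => e
  error_class = e.class
  puts "#{error_class}: #{e.message}"
end

# Guarded variant: substitute a placeholder so the report can still be printed.
lines = expected.map { |name| ' - ' + (name || '(unresolved file reference)') }
puts lines
```

The exact wording of the message depends on the Ruby version; older interpreters print "can't convert nil into String", as in the output above.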
Is the solution to delete the PDF structMap?
I tried that and got the following result:
[lydiam@tlhlxftp01-prd FI06050102]$ package --test --server fiu7prod /ssa/d2i/FIU_FEOL_books_dumpB/FI06050102
Processing 1 package: /ssa/d2i/FIU_FEOL_books_dumpB/FI06050102
0.19 sec, 0.00 MB BookPackage::FI06050102 (no pid) => collection: fiu:feol, palmm:feol, "Fire Careers, Adventures For Your Life!"
Warnings:
The Book package FI06050102 has the following 4 unexpected files that will not be processed:
So I believe that the answer is "yes" - delete the longer structMap that refers to PDFs.
At this point I don't believe that it's worth modifying the program to take into consideration the file format referenced in the structMap when evaluating structMaps.
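For anyone who wants to check which structMaps a mets.xml declares before deleting one by hand, a rough Ruby sketch using REXML follows. It assumes the standard METS namespace and the usual `fptr FILEID` to `file ID`/`MIMETYPE` linkage; the `summarize_struct_maps` helper and the inline sample document are hypothetical and are not taken from the FI06050102 package.

```ruby
require 'rexml/document'

METS_NS = { 'mets' => 'http://www.loc.gov/METS/' }.freeze

# Hypothetical helper: for each structMap, count the referenced files by mimetype.
def summarize_struct_maps(xml)
  doc = REXML::Document.new(xml)

  # Map each file ID in the fileSec to its declared MIMETYPE.
  mimetype_by_id = {}
  REXML::XPath.each(doc, '//mets:file', METS_NS) do |file|
    mimetype_by_id[file.attributes['ID']] = file.attributes['MIMETYPE']
  end

  REXML::XPath.match(doc, '//mets:structMap', METS_NS).map do |struct_map|
    counts = Hash.new(0)
    REXML::XPath.each(struct_map, './/mets:fptr', METS_NS) do |fptr|
      counts[mimetype_by_id[fptr.attributes['FILEID']]] += 1
    end
    counts
  end
end

# Made-up sample with the same shape as the case described: 2 JPGs and 3 PDFs.
sample = <<~XML
  <mets:mets xmlns:mets="http://www.loc.gov/METS/">
    <mets:fileSec>
      <mets:fileGrp USE="reference">
        <mets:file ID="J1" MIMETYPE="image/jpeg"/>
        <mets:file ID="J2" MIMETYPE="image/jpeg"/>
        <mets:file ID="P1" MIMETYPE="application/pdf"/>
        <mets:file ID="P2" MIMETYPE="application/pdf"/>
        <mets:file ID="P3" MIMETYPE="application/pdf"/>
      </mets:fileGrp>
    </mets:fileSec>
    <mets:structMap>
      <mets:div><mets:fptr FILEID="J1"/><mets:fptr FILEID="J2"/></mets:div>
    </mets:structMap>
    <mets:structMap>
      <mets:div><mets:fptr FILEID="P1"/><mets:fptr FILEID="P2"/><mets:fptr FILEID="P3"/></mets:div>
    </mets:structMap>
  </mets:mets>
XML

summary = summarize_struct_maps(sample)
summary.each_with_index { |counts, i| puts "structMap #{i + 1}: #{counts.inspect}" }
```

A summary like this makes it easy to see which structMap refers only to PDFs before removing it.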
ON HOLD