Closed by MarcusBarnes 8 years ago
Initial work on a CSV Books toolchain. Input looks like:
csvbookstestinput/
├── book1
│ ├── page-01.tif
│ ├── page-02.tif
│ ├── page-03.tif
│ ├── page-04.tif
│ ├── page-05.tif
│ ├── page-06.tif
│ ├── page-07.tif
│ └── page-08.tif
└── book2
  ├── 1884-01-24-01.tif
  ├── 1884-01-24-02.tif
  ├── 1884-01-24-03.tif
  ├── 1884-01-24-04.tif
  ├── 1884-01-24-05.tif
  ├── 1884-01-24-06.tif
  ├── 1884-01-24-07.tif
  └── 1884-01-24-08.tif
.ini file looks like:
; MIK configuration file for generating Islandora book ingest packages
; from a CSV metadata file and locally stored TIFFs.
[SYSTEM]
[CONFIG]
config_id = CSVBookssTest
last_updated_on = "2016-11-01"
last_update_by = "mjordan@example.com"
[FETCHER]
class = Csv
input_file = '/home/mark/Downloads/csvbookstest.csv'
temp_directory = "/tmp/csv_books_temp"
record_key = Identifier
[METADATA_PARSER]
class = mods\CsvToMods
mapping_csv_path = 'csv_books_test_mappings.csv'
temp_directory = "/tmp/csv_books_temp"
[FILE_GETTER]
class = CsvBooks
temp_directory = "/tmp/csv_books_temp"
input_directory = "/home/mark/Downloads/csvbookstestinput"
file_name_field = Directory
[WRITER]
class = CsvBooks
output_directory = "/tmp/csv_books_output"
metadata_filename = MODS.xml
datastreams[] = OBJ
datastreams[] = MODS
[MANIPULATORS]
metadatamanipulators[] = "FilterModsTopic|subject"
[LOGGING]
path_to_log = "/tmp/csv_books_output/mik.log"
path_to_manipulator_log = "/tmp/csv_books_output/mik_manipulator.log"
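For anyone reading the config cold, here is a rough sketch of how record_key and file_name_field appear to tie a CSV row to a book directory. It is written in Python for illustration only and is not MIK's code (MIK is PHP). The sample rows and the book_directories helper are hypothetical, since the real csvbookstest.csv is not shown in this thread, and it assumes the Directory column holds the name of a subdirectory of input_directory.

```python
import csv
import io
from pathlib import PurePosixPath

# Hypothetical metadata rows; the real csvbookstest.csv is not shown here.
SAMPLE_CSV = """Identifier,Title,Directory
B0001,First test book,book1
B0002,Second test book,book2
"""


def book_directories(csv_text, input_directory,
                     record_key="Identifier", file_name_field="Directory"):
    """Map each record's key to the directory that should hold its page TIFFs.

    Assumption: the file_name_field column names a subdirectory of
    input_directory, one per book.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row[record_key]: str(PurePosixPath(input_directory) / row[file_name_field])
        for row in reader
    }


dirs = book_directories(SAMPLE_CSV, "/home/mark/Downloads/csvbookstestinput")
for identifier, directory in dirs.items():
    print(identifier, "->", directory)
```

Under that assumption, the record key (not the directory name) is what ends up naming each book's folder in the output.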
Output looks like:
csv_books_output/
├── B0001
│ ├── 1
│ │ ├── MODS.xml
│ │ └── OBJ.tif
│ ├── 2
│ │ ├── MODS.xml
│ │ └── OBJ.tif
│ ├── 3
│ │ ├── MODS.xml
│ │ └── OBJ.tif
│ ├── 4
│ │ ├── MODS.xml
│ │ └── OBJ.tif
│ ├── 5
│ │ ├── MODS.xml
│ │ └── OBJ.tif
│ ├── 6
│ │ ├── MODS.xml
│ │ └── OBJ.tif
│ ├── 7
│ │ ├── MODS.xml
│ │ └── OBJ.tif
│ ├── 8
│ │ ├── MODS.xml
│ │ └── OBJ.tif
│ └── MODS.xml
├── B0002
│ ├── 1
│ │ ├── MODS.xml
│ │ └── OBJ.tif
│ ├── 2
│ │ ├── MODS.xml
│ │ └── OBJ.tif
│ ├── 3
│ │ ├── MODS.xml
│ │ └── OBJ.tif
│ ├── 4
│ │ ├── MODS.xml
│ │ └── OBJ.tif
│ ├── 5
│ │ ├── MODS.xml
│ │ └── OBJ.tif
│ ├── 6
│ │ ├── MODS.xml
│ │ └── OBJ.tif
│ ├── 7
│ │ ├── MODS.xml
│ │ └── OBJ.tif
│ ├── 8
│ │ ├── MODS.xml
│ │ └── OBJ.tif
│ └── MODS.xml
├── mik.log
└── problem_records.log
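To make the input-to-output mapping concrete, here is a minimal Python sketch (illustration only, not the PHP in this branch). It assumes the page sequence is the number in the last hyphen-delimited segment of each file name, which matches both sample books but is an assumption about how the writer orders pages. The page_sequence and plan_book_package helpers are hypothetical names.

```python
from pathlib import PurePosixPath


def page_sequence(filename):
    """Return the page number taken from the last hyphen-delimited segment.

    Assumption: 'page-08.tif' and '1884-01-24-08.tif' are both page 8.
    """
    stem = PurePosixPath(filename).stem
    return int(stem.rsplit("-", 1)[-1])


def plan_book_package(identifier, page_files,
                      output_directory="/tmp/csv_books_output"):
    """Map each source page file to its OBJ.tif path in the output package."""
    root = PurePosixPath(output_directory) / identifier
    plan = {}
    for name in sorted(page_files, key=page_sequence):
        plan[name] = str(root / str(page_sequence(name)) / "OBJ.tif")
    return plan


plan = plan_book_package("B0002", ["1884-01-24-02.tif", "1884-01-24-01.tif"])
for source, target in plan.items():
    print(source, "->", target)
```

Each page directory also gets its own MODS.xml, and the book-level MODS.xml sits beside the numbered page directories; the sketch covers only the OBJ paths.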
@MarcusBarnes mind if we merge this into master fairly soon? I'd like to continue with work on #238, which depends on this branch. If you want to test I can send you some input data.
@mjordan Yes - please send me some input data so I can test. Thank you.
Zip file containing input data plus .ini is at https://vault.sfu.ca/index.php/s/ZXUoZN17cUENqiN. You'll need to run composer dump-autoload. Thanks!
@mjordan Worked as expected. Please create a pull-request and then I'll merge. Thanks.
Great, thanks for testing - here's the PR: https://github.com/MarcusBarnes/mik/pull/278
I'll document the toolchain (our last remaining major one to complete!) and update README.md over the weekend.
Closed with pull-request https://github.com/MarcusBarnes/mik/pull/278 (commit https://github.com/MarcusBarnes/mik/commit/503033cff290e707921bf7e8c27f59ac12578ede). Thank you @mjordan for your contribution.
An MIK toolchain for migrating books (monographs): create a CSV book filegetter/writer.
The input that Islandora Book Batch expects is documented at https://github.com/Islandora/islandora_book_batch.
N.B.: We may need to flatten hierarchical source books before importing them into Islandora, since Islandora's Book Solution Pack currently supports only flat books.
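One way the flattening could work, sketched in Python purely as an illustration (nothing like this exists in MIK as far as this thread shows): walk the book depth-first and renumber the pages into a single flat sequence. The nested-list representation and both helper names are hypothetical.

```python
def flatten_book(node):
    """Depth-first flatten of a hierarchical book.

    A node is either a page file name (str) or a list of nodes (a section).
    """
    if isinstance(node, str):
        return [node]
    pages = []
    for child in node:
        pages.extend(flatten_book(child))
    return pages


def number_pages(node):
    """Assign flat, 1-based sequence numbers to the pages in reading order."""
    return {seq: name for seq, name in enumerate(flatten_book(node), start=1)}


# Hypothetical book with a cover page and two chapters.
book = ["cover.tif", ["ch1-01.tif", "ch1-02.tif"], ["ch2-01.tif"]]
print(number_pages(book))
```

This discards the chapter boundaries, so any structure worth keeping would have to be recorded elsewhere, for example in the page-level metadata.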