salsadigitalauorg / merlin-framework

Merlin - migration framework
GNU General Public License v3.0

Allow URL list in a separate file #53

Closed: stooit closed 5 years ago

stooit commented 5 years ago

Description: At present, URLs and general config are kept in a single file. This becomes difficult to manage on very large migrations (e.g. those with tens of thousands of URLs).

Proposed solution: Allow the URL list to be provided in one or more separate yml files, so that config and URLs can be split.

An example config may be:

---
domain: https://www.example.com

urls_file: /path/to/crawled-urls-page.yml

entity_type: basic_page
mappings:
  -
    field: alias
    type: alias
    ...

The current implementation (a urls key provided directly in the config) should still work, but the key should become optional, and its values should be overridden by any values provided in urls_file.
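The precedence described above can be sketched roughly as follows. This is only an illustration in Python, not Merlin's actual implementation: the function name `resolve_urls` and the `load_urls_file` callback are hypothetical, and accepting either a single path or a list of paths for `urls_file` is an assumption based on the "file(s)" wording in the proposal.

```python
from typing import Callable, Iterable, List


def resolve_urls(config: dict,
                 load_urls_file: Callable[[str], Iterable[str]]) -> List[str]:
    """Pick the URL list for a run (hypothetical helper, not part of Merlin).

    `load_urls_file` is an assumed callback that reads one URL file and
    returns the URLs it contains; how the file is parsed is left open here.
    """
    urls_file = config.get("urls_file")
    if urls_file:
        # Assumption: urls_file may be a single path or a list of paths.
        paths = [urls_file] if isinstance(urls_file, str) else list(urls_file)
        urls: List[str] = []
        for path in paths:
            urls.extend(load_urls_file(path))
        # Values from urls_file override any inline `urls` key.
        return urls
    # No urls_file: fall back to the inline `urls` key, which is optional.
    return list(config.get("urls", []))


# Usage with an in-memory stand-in for the file loader.
fake_files = {"/path/to/urls.yml": ["/about", "/contact"]}
inline_only = resolve_urls({"urls": ["/home"]}, fake_files.__getitem__)
overridden = resolve_urls(
    {"urls": ["/home"], "urls_file": "/path/to/urls.yml"},
    fake_files.__getitem__,
)
```

Passing the loader in as a callback keeps the precedence rule separate from file parsing, so the same check works whatever structure the URL file ends up having.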

Additional context: Related to #34, which is still relevant for all-inclusive config files. With this change, you would use the merge_into feature to create aggregate URL lists in their own standalone file.

stooit commented 5 years ago

Fixed in #69