MI-DPLA / combine

Combine /kämˌbīn/ - Metadata Aggregator Platform
MIT License

xml2kvp: regex on value, copy to target field #232

Closed ghukill closed 6 years ago

ghukill commented 6 years ago

Add ability to copy values that match regex patterns to a new field.

For example, suppose the values foo123 and bar123 are encountered. If the regex pattern .*123.* is entered with a target field of baz, both values are also copied to baz.

Could work like field copying, with an optional flag to remove the matched values from the original field.
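A rough sketch of the requested behavior, working on an already-flattened dict rather than inside XML2kvp itself. The helper name copy_values_matching and its parameters are hypothetical and are not the XML2kvp API; it also assumes patterns are anchored at the start of the value via re.match, which the issue does not specify.

```python
import re


def copy_values_matching(kvp, pattern_to_field, remove_copied=False):
    """Copy values matching a regex into a target field.

    Hypothetical helper for illustration only, not part of XML2kvp.
    kvp maps field names to a single value or a tuple/list of values.
    pattern_to_field maps regex pattern strings to target field names.
    """
    compiled = {re.compile(p): target for p, target in pattern_to_field.items()}
    result = {}
    copied = {}

    for field, value in kvp.items():
        # Normalize to a list so single and multiple values are handled alike.
        values = list(value) if isinstance(value, (list, tuple)) else [value]
        kept = []
        for v in values:
            matched = False
            for regex, target in compiled.items():
                # Assumption: match from the start of the value (re.match).
                if regex.match(str(v)):
                    copied.setdefault(target, []).append(v)
                    matched = True
            # Optionally drop matched values from the original field.
            if not (matched and remove_copied):
                kept.append(v)
        if kept:
            result[field] = kept[0] if len(kept) == 1 else tuple(kept)

    # Merge copied values into their target fields, keeping any existing values.
    for target, vals in copied.items():
        existing = result.get(target)
        if existing is None:
            merged = vals
        elif isinstance(existing, tuple):
            merged = list(existing) + vals
        else:
            merged = [existing] + vals
        result[target] = merged[0] if len(merged) == 1 else tuple(merged)

    return result


sample = {'a': 'foo123', 'b': 'bar123', 'c': 'plain'}
copied_only = copy_values_matching(sample, {'.*123.*': 'baz'})
copied_and_removed = copy_values_matching(sample, {'.*123.*': 'baz'}, remove_copied=True)
```

With remove_copied left as False, the original fields keep their values and baz receives both matches; with remove_copied=True, fields whose values were all copied are dropped entirely.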

ghukill commented 6 years ago

Implemented:

In [3]: XML2kvp.xml_to_kvp(XML2kvp.test_xml, copy_value_to_regex={'.*[0-9]+.*':'NUMBERS'}, remove_copied_value=True)
Out[3]: 
{'NUMBERS': ('42', '109', '9393943', '3489234893'),
 'root_beat_@type=3/4': 'waltz',
 'root_beat_@type=4/4': 'four on the floor',
 'root_goober_@scrog=true_@tonk=false_depths_plunder': 'Willy Wonka',
 'root_internet|url_@url=http://example.com': 'see my url',
 'root_nested_attribs_@type=first_another_@type=second': 'paydirt',
 'root_tronic': 'You may disregard',
 'root_tronic_@type=tonguetwister': ('Sally sells seashells by the seashore.',
  'Red leather, yellow leather.')}