Closed twagoo closed 6 years ago
The feature is implemented in the development branch, with some test cases. At the moment we can define a value for the origin facet. If there is a complete match (string comparison), one or more defined values are set on the cross facets (which may include the origin facet itself). In the long run it might be more convenient to replace the string comparison with support for regular expressions. Important: if the field value matches the defined value, that value is no longer processed for the origin facet (unless the facet is also defined as a cross facet).
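To make the behaviour described above concrete, here is a minimal Java sketch. It is not the importer's actual code: the class name, the `apply` method and the shape of the `rules` map are assumptions made for illustration only. The facet names and values are borrowed from the import log quoted further down in this thread.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names come from the VLO code base.
class CrossFacetMappingSketch {

    /**
     * Applies the mapping defined for one origin facet to one field value.
     * rules: defined origin value -> (cross facet name -> value to set).
     * Returns the resulting facet name -> values for this field value.
     */
    static Map<String, List<String>> apply(String originFacet, String fieldValue,
                                           Map<String, Map<String, String>> rules) {
        Map<String, List<String>> doc = new LinkedHashMap<>();
        // Map lookup uses String.equals, i.e. a complete string match.
        Map<String, String> targets = rules.get(fieldValue);
        if (targets == null) {
            // No match: the value is processed for the origin facet as usual.
            doc.computeIfAbsent(originFacet, k -> new ArrayList<>()).add(fieldValue);
            return doc;
        }
        // Match: only the cross facets receive values. The origin facet gets
        // one only if it is itself listed among the cross facets.
        for (Map.Entry<String, String> t : targets.entrySet()) {
            doc.computeIfAbsent(t.getKey(), k -> new ArrayList<>()).add(t.getValue());
        }
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> rules =
            Map.of("SourceScan", Map.of("resourceClass", "image"));
        // prints {resourceClass=[image]}
        System.out.println(apply("_componentProfile", "SourceScan", rules));
        // prints {_componentProfile=[Other]}
        System.out.println(apply("_componentProfile", "Other", rules));
    }
}
```

Under these assumptions, a matching value never reaches the origin facet, which is the effect discussed in the following comments.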
While testing the importer with cross facet mapping enabled, I noticed that if you define a cross facet mapping for a facet (i.e. set an 'origin facet'), that facet is no longer processed itself. So, for example, if I define any mapping from a collection value to an availability value (see #46), the entire collection facet effectively disappears. I'm not sure if this is how we envisioned the behaviour, but in any case, looking at it now, I don't think this is desirable. I think cross facet mapping should in principle leave the original value untouched, i.e. cross facet mapping should only add values. For cases where the original value should not be visible, we should either use 'hidden facets' or explicitly define this, e.g. through a special attribute in the cross facet mapping definition.
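A minimal Java sketch of the additive behaviour proposed here, for comparison. Everything in it is hypothetical: the class, the `apply` signature, the `removeOrigin` flag (standing in for the "special attribute" suggested above) and the example values `Some Collection` and `PUB` are invented for illustration and do not come from the VLO code or its mapping files.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the proposal; not the importer's implementation.
class AdditiveMappingSketch {

    /**
     * Maps one field value of originFacet. On a complete match with
     * definedValue, targetValue is added to targetFacet. The original value
     * is kept on the origin facet unless removeOrigin (an assumed,
     * explicit opt-in attribute) is true.
     */
    static Map<String, List<String>> apply(String originFacet, String fieldValue,
                                           String definedValue,
                                           String targetFacet, String targetValue,
                                           boolean removeOrigin) {
        Map<String, List<String>> doc = new LinkedHashMap<>();
        boolean matches = definedValue.equals(fieldValue);
        if (!matches || !removeOrigin) {
            // Default: the original value is left untouched.
            doc.computeIfAbsent(originFacet, k -> new ArrayList<>()).add(fieldValue);
        }
        if (matches) {
            // The mapping only contributes an additional value.
            doc.computeIfAbsent(targetFacet, k -> new ArrayList<>()).add(targetValue);
        }
        return doc;
    }

    public static void main(String[] args) {
        // prints {collection=[Some Collection], availability=[PUB]}
        System.out.println(apply("collection", "Some Collection",
                "Some Collection", "availability", "PUB", false));
        // prints {availability=[PUB]}
        System.out.println(apply("collection", "Some Collection",
                "Some Collection", "availability", "PUB", true));
    }
}
```

With the flag left at `false`, the collection facet keeps its value and the availability facet gains one; dropping the origin value becomes an explicit choice in the mapping definition rather than a side effect.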
Is this a new requirement, or are we going to discuss it? I ask because I understood CFM to mean that it should in any case ignore the setting for the origin facet if the value matches the condition.
Perhaps this was underspecified then. Is there a written version of the requirements for CFM somewhere?
Mapping parameter combinations - use cases (draft, PDF)
Update: there's an asciidoc version of this available on GitHub now: https://github.com/clarin-eric/VLO-mapping/blob/development/doc/ValueMapping.adoc
Potential extensions of the design/implementation to consider (see the document Potential for reimplementation of VLO post-processors as value mapping cases):
FYI, I have added some logging to the processing of the mapping and to its application during the actual import. An example of what this looks like in actual import logs at INFO level:
2018-04-04 13:30:01,357 INFO [ Importer main] [eu.clarin.cmdi.vlo.importer.mapping.ValueMappingFactoryDOMImpl#getValueMappings:41] - Parsing value mapping in file:/srv/VLO-mapping/value-maps/dist/master.xml
2018-04-04 13:30:01,423 INFO [ Importer main] [eu.clarin.cmdi.vlo.importer.mapping.ValueMappingFactoryDOMImpl#getValueMappings:47] - Found 2 origin-facet nodes
2018-04-04 13:30:01,425 INFO [ Importer main] [eu.clarin.cmdi.vlo.importer.mapping.ValueMappingFactoryDOMImpl#processOriginFacet:70] - Processing origin-facet node with name='_componentProfile'
2018-04-04 13:30:01,426 INFO [ Importer main] [eu.clarin.cmdi.vlo.importer.mapping.ValueMappingFactoryDOMImpl#processValueMap:88] - -- Processing value-map node
2018-04-04 13:30:01,427 INFO [ Importer main] [eu.clarin.cmdi.vlo.importer.mapping.ValueMappingFactoryDOMImpl#processValueMap:105] - -- Found 50 target-value-set nodes
2018-04-04 13:30:01,451 INFO [ Importer main] [eu.clarin.cmdi.vlo.importer.mapping.ValueMappingFactoryDOMImpl#processOriginFacet:70] - Processing origin-facet node with name='resourceClass'
2018-04-04 13:30:01,451 INFO [ Importer main] [eu.clarin.cmdi.vlo.importer.mapping.ValueMappingFactoryDOMImpl#processValueMap:88] - -- Processing value-map node
2018-04-04 13:30:01,452 INFO [ Importer main] [eu.clarin.cmdi.vlo.importer.mapping.ValueMappingFactoryDOMImpl#processValueMap:105] - -- Found 115 target-value-set nodes
All other related newly added logging takes place at DEBUG or TRACE level, for example:
2018-04-04 14:25:15,490 DEBUG [Pool-1-worker-2] [CMDIParserVTDXML#processValueMapping:461] - Value mapping: applying mapping [_componentProfile: 'SourceScan'] -> [resourceClass: 'image'] (override existing: false)
Tests on alpha show that value mapping works as intended. Currently, the development branch of clarin-eric/VLO-mapping has the maps and scripts supporting this. This will soon be merged into beta, then master, and hopefully also into the acdh-oeaw fork.
Also see the comments at https://github.com/clarin-eric/VLO/commit/cd15ac153f213e44e0b9a951ae59c159209d6b5e#commitcomment-25076921