clarin-eric / VLO

Virtual Language Observatory
GNU General Public License v3.0

Only one out of PUB/ACA/RES should be present in the availability field #38

twagoo closed this issue 7 years ago

twagoo commented 7 years ago

In some cases, mapping and normalisation can lead to multiple values from the PUB/ACA/RES set being present in the availability field of a single record. The post processor should reduce these to only the most restrictive one, while leaving all other availability values untouched.

E.g. if we have [PUB, ACA, BY, NC], this should be transformed into [ACA, BY, NC].
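
A minimal sketch of that reduction, for illustration only. The function name `reduce_availability` and the ordering PUB < ACA < RES are assumptions (the example above only confirms that ACA is more restrictive than PUB), and this is not the code from the VLO post processor:

```python
# Assumed ordering, from least to most restrictive.
LEVELS = ["PUB", "ACA", "RES"]


def reduce_availability(values):
    """Keep only the most restrictive PUB/ACA/RES value; leave other values as they are."""
    present = [v for v in values if v in LEVELS]
    if not present:
        # No level tag at all: nothing to reduce.
        return list(values)

    # The most restrictive level is the one with the highest index in LEVELS.
    keep = max(present, key=LEVELS.index)

    result = []
    kept = False
    for v in values:
        if v not in LEVELS:
            # Other availability values (e.g. BY, NC) pass through unchanged.
            result.append(v)
        elif v == keep and not kept:
            # Emit the surviving level once, at its original position.
            result.append(v)
            kept = True
    return result


print(reduce_availability(["PUB", "ACA", "BY", "NC"]))
```

The original order of the remaining values is preserved, so the other tags stay where they were.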

See Trac #983 for a related case.

davoros commented 7 years ago

If a record has more than one tag from PUB, ACA and RES, only the most restrictive one will be kept.

The solution is implemented in branch issue38, see: https://github.com/clarin-eric/VLO/commit/77d52de4fb8ecd5f033e7c4c5b945684457ce56f

stranak commented 7 years ago

I think somebody on Slack mentioned a situation where a record has ACA in its metadata and you derive something else from your own mapping of licenses and attributes. So let's assume the license is CC-BY-NC-ND and you decide that means RES (because it does, by the definition of RES). Do you overwrite the value ACA assigned by the centre with your RES?

twagoo commented 7 years ago

@stranak I think we should ideally not override the level specified in the metadata (or, put differently, that the metadata should be able to override our default mapping).

But currently we would, since we do not distinguish between the 'level' and other availability information (the laundry tags) in the right places. It would essentially take a new, separate field to do so. I think that would actually make sense, but we need to think it through if we agree that we want this kind of logic in our workflow.
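
To make the idea concrete, here is a rough sketch of what a separate level field could allow. Everything in it is hypothetical: `resolve_level`, `metadata_level` and `mapped_level` are illustrative names, not existing VLO fields or functions.

```python
def resolve_level(metadata_level, mapped_level):
    """Prefer the level stated in the record's metadata over the one derived by our mapping.

    Both arguments are one of "PUB", "ACA", "RES", or None when unknown.
    """
    if metadata_level is not None:
        # The centre's explicit choice overrides the default mapping.
        return metadata_level
    # Otherwise fall back to the level derived from licence and attributes.
    return mapped_level


# The centre says ACA, our mapping of CC-BY-NC-ND says RES.
print(resolve_level("ACA", "RES"))
```

With the level kept apart from the laundry tags, the override becomes a simple precedence rule instead of a "most restrictive wins" reduction.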