Open twagoo opened 7 years ago
Sounds great to me. The only problem I see is how to get the correct information for "Can I use it for research"? Practically no licenses include a "research only" provision. There are some exceptions (small custom licenses), but I am not aware of a popular public license (like CC) that would have a "research" restriction. There is, however, a Non-Commercial restriction in CC and a few others. So maybe it would make sense to include those somehow?
It is not trivial, though. So I am thinking about an option to include some tool that will help you create this licensing restriction by guiding you through it and explaining.
Something like https://ufal.github.io/public-license-selector/ in principle, but much simpler, because here the user doesn't need to choose a license directly.
"Can I use it for research" obviously is a superset of "Can I use it for anything", and both would (as far as I can see) include all CC variants. I think all records marked with availability 'ACA' should go in the former but not in the latter.
@stranak wrote
So I am thinking about an option to include some tool that will help you create this licensing restriction by guiding you through it and explaining.
For the metadata creator or the end user? Users don't need that; in my proposal they will have to make do with these 2 levels of restricting their search results. The assumption is that this is all any of our users will ever need for search. Then, when they find a record that matches their interests content-wise, they can inspect the licence and other usage conditions.
For the user. What I am trying to say is that those 2 levels can't work, in my opinion. There is no way you can tell a user "yes, you can use this for research" without knowing anything about their use case ("research" is NOT enough information) and without telling them the "but" part. I.e., they do need to understand more about the license of the data they want to use. One possibility is to ask them up front things like:
"For anything" in my view would be only data in CC0 or Public Domain. Everything else has some restrictions. "ACA" also doesn't mean you can use it for any research at all; maybe it would if we said non-commercial research. But either way I would add a clear disclaimer that this "selector" doesn't mean there are no other restrictions, and that users have to read the licenses of the data.
Simply put, "Can I use it" is a difficult question and I would be reluctant to say YES/NO without having more information.
@stranak thanks for the input - and sorry for my delayed response.
I completely agree with you that we need to communicate this kind of information to the user. However, my proposal is to do this only at the record level. In the (initial) search stage, 'Can I use it for research' should mean (and may need to be rephrased to reflect this): "Can it be used for any kind of research?" I know this is very broad, but that's how faceted search should work IMO: you narrow down step by step. As a next step, we could optionally provide a set of more detailed selection criteria that allow for a distinction between commercial and non-commercial research etc. But then you already get into the muddy territory you are describing. Assuming there is no such step for now, a user would typically look for resources that are of potential interest content-wise, and then, for that selection of records, look at the conditions of (re)use we list in the record details page and/or the information at the providing repository.
What would be easy to implement as an approximation of the second step, and could already be useful to some users, is a 'licence' facet that lists, among other things, all the CC variants, so that licence-savvy users can narrow the results down to those licences they know suit their needs.
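A minimal sketch of how such a 'licence' facet could behave, purely for illustration: the record structure, the field name `licence` and the sample values are assumptions and do not reflect the VLO's actual index or implementation.

```python
from collections import Counter

# Hypothetical records; in practice these would come from the search index.
records = [
    {"name": "corpus-a", "licence": "CC-BY"},
    {"name": "corpus-b", "licence": "CC-BY-NC"},
    {"name": "corpus-c", "licence": "CC0"},
    {"name": "corpus-d", "licence": "CC-BY"},
]


def facet_counts(records):
    """Count how many records carry each licence value (the facet display)."""
    return Counter(r["licence"] for r in records)


def filter_by_licence(records, selected):
    """Keep only the records whose licence is among the selected facet values."""
    return [r for r in records if r["licence"] in selected]


counts = facet_counts(records)
narrowed = filter_by_licence(records, {"CC0", "CC-BY"})
```

A user who knows that CC0 and CC-BY suit their needs would select those two values and continue browsing the narrowed result set.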
"Can I access it" can be partially answered by automatic url checking which is available as a part of the curation module. A separate issue should be created for integrating this information.
"Can I access it" can be partially answered by automatic url checking which is available as a part of the curation module. A separate issue should be created for integrating this information.
The information is available and integrated for display purposes (but not yet as a filter option) in VLO 4.7 (see #220, #241)
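To illustrate why URL checking can only partially answer "Can I access it", here is a rough sketch that turns an HTTP status code into a tentative access value. The function name and the status-to-value assignments are assumptions made for this example; how the curation module actually reports its results is not described in this thread.

```python
from typing import Optional


def access_hint(status_code: int) -> Optional[str]:
    """Map an HTTP status code to a tentative access value, or None if unclear."""
    if 200 <= status_code < 300:
        # The link resolved without credentials.
        return "directly online"
    if status_code in (401, 403):
        # Assumption: the server asks for (or refuses without) authentication.
        return "online after authenticating"
    # Anything else (e.g. 404, 500) tells us nothing about access conditions.
    return None
```

An "on request" value cannot be derived from a link check at all, which is one reason the answer stays partial.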
Recently there have been some discussions about the licence categories (also interpreted as 'availability levels'). There is no sign that this discussion will be settled soon, or ever. We should investigate whether we can deal with these matters in a more user-friendly way for the VLO. One approach to consider is having two facets that answer two separate questions:

"Can I access it?"
- directly online
- online after authenticating
- on request

Records that do not fit any of these boxes do not need coverage in this facet IMO.

"Can I use it?"
- for anything
- for research
- (for commercial)

I don't think there are use cases for filtering on other kinds of use, unless we also want to allow users to filter for commercial usage. Records that do not fit any of these boxes do not need coverage in this facet IMO.

We can map licences, licence categories and other usage conditions, rights statements etc. to the values from both vocabularies using the existing (cross) facet mapping facilities.
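As a sketch of what mapping onto the use vocabulary might look like: the licence labels and their assignments below are illustrative assumptions that follow the more conservative reading voiced in this thread (only CC0 and Public Domain count as "for anything"). They are not the VLO's actual facet mapping configuration.

```python
from typing import Optional

# Ordered from least to most permissive: a record usable "for anything"
# is also usable "for research".
USE_ORDER = ["for research", "for anything"]

# Hypothetical licence-to-value mapping (assumption, see above).
LICENCE_TO_USE = {
    "CC0": "for anything",
    "Public Domain": "for anything",
    "ACA": "for research",
}


def use_level(licence: str) -> Optional[str]:
    """Return the use value for a licence, or None if it fits no box."""
    return LICENCE_TO_USE.get(licence)


def matches_use_filter(licence: str, selected: str) -> bool:
    """True if a record with this licence should appear under the selected value."""
    level = use_level(licence)
    if level is None:
        # Unmapped licences get no coverage in this facet.
        return False
    return USE_ORDER.index(level) >= USE_ORDER.index(selected)
```

With this ordering, selecting "for research" also returns records mapped to "for anything", while records whose licence is unmapped stay out of the facet altogether.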