Open johnvanbreda opened 8 years ago
Crikey, I thought Indicia was part of a drive towards open access to biological records. Where has this requirement come from?
It still is. This requirement is to give greater clarity (and freedom) by aligning with Creative Commons - with a default of CC (or CC-BY). Currently, it's not clear that we can supply images to others (e.g. to Atlas of Living x,y,z).
Some comments from me:
Response to questions:
"Can we set a default license, with no warning. This is equivalent to agreeing to T&Cs?"
Licenses for surveys - don't forget that the user's default license would apply. In most cases I think it is the recorders' prerogative to choose the license (especially if we are only allowing CC variants). It's probably only in the commercial world where you are being paid by an organisation that the organisation has some right to select the appropriate license.
I've added an extra question to the list about the meaning of the term commercial with respect to LRCs.
If we use an "original license applies" approach, then we get an inconsistency. 2 scenarios:
These 2 scenarios, which are only differentiated by timing, result in the records having different licenses.
"Can we set a default license, with no warning. This is equivalent to agreeing to T&Cs?" I didn't read #8 closely enough! I meant for new registrations only but I see that you refer to 'login'. I agree about not forcing a change on recorders' existing data.
Licenses for surveys - good point. In that case, no need for Survey level?
Why does it need all this complexity to clear up what you say is a lack of clarity? Why not just improve the wording of the terms and conditions?
How would we deal with the legacy issue/manage a change in T&Cs?
Jim, there is an expectation that licensing will follow open standards - this has to be a good thing since knowing that a record is CC or CC-BY for example is far easier to communicate than a page of T&Cs, which might be different per recording website. At the moment there is no association between the T&Cs on iRecord and the resulting records on the NBN Gateway.
We could simply declare all records to be a particular variant of CC, state this in the T&Cs, then attach this to all records going to the Gateway, but this is not what many users have come to expect, having had the option to use a choice of CC variants on sites like Flickr for example.
I think in general, for nearly all Indicia projects, the scenario is fairly simple.
The complexity comes in for systems that manage commercial projects, i.e. the Consultants Portal, where the project manager might dictate the license for a project to meet client needs.
David - a change in T&Cs might not affect the license - in which case this is an existing issue but there are Drupal tools for asking people to reconfirm acceptance if this does come up. If it does affect the license and we need to force a license change on existing records, then I would argue that this is a very unusual circumstance and one which will need to be communicated to the recorders and would probably result in some manual queries as well as broadcasting of the change to other data users.
Other options to consider.
I could go on with more but my point is to suggest there are simpler possibilities.
I agree there are simpler ways to attach a CC license to the records, however any approach we take here should be sensitive to the wishes of both recorders and record centres, so a modicum of flexibility and choice may be hard to avoid. As the copyright holder of the record it seems logical that the recorder should be the primary decision maker with respect to licensing.
But of course, I'm just the developer putting the other side of the argument - simple is always good, and it's up to David and Ella to decide which way to proceed.
Ah well, you have offered what has been asked for then. Seems a lot of work to overcome some out of date T&Cs and a perceived need for individual choice, especially if the only choices are CC-BY and CC-BY-NC. (I'm not sure there is a CC without attribution.)
The recorder is always the primary decision maker, even if the licence is chosen for them, because they can always choose not to submit a record.
Jim is right to push for simplicity, but I can't see an alternative if we want to change the license to CC-BY-NC for legacy data (desirable) and give users an opt-out (essential). But if the development costs are prohibitive, we'll have to think again.
I think one of my alternatives would have achieved what you are asking for in that comment, David. Obviously your brevity has understated the requirement. I'll shut up and get on with what I am supposed to be doing.
Please don't (shut up) - it never hurts to thrash things out from another perspective.
BUT, I don't think that the difference in complexity between the 2 solutions is that huge. Ignoring the Consultants Portal specific items, here is the list of requirements again with the alternative approach described alongside:
So it's only 1, 2 and 13 that are different. 1&2 are fairly trivial (adding up to no more than 2-3 hours effort for the difference) and 13 is a would-be-nice anyway. For the full version of this, excluding Consultants Portal specifics, I think this is approx 1.5 days work.
Okay, back again. I am thinking about it from the perspective of how the database has to change along with corresponding queries. Also what the impact is on the client websites and the user.
I'm using your premise that the licence is the user's choice rather than that of a survey or website.
Additionally, when submitting new records, the currently selected licence has to be looked up and stored against the occurrence. Simpler. No change.
In both cases the licence may be a field worth adding to cache_occurrences. This would be a licence_id requiring an additional join when reporting. Simpler. A boolean with no join.
I'm not quite sure I understand. I think you are saying that a simpler version of my proposal is to allow the user to choose but that their choice would apply to all their records on the warehouse, not on a per-website basis. If so, then apart from the move of the license field from the users table to the users_websites table, I don't see this as a significant change in terms of complexity. In response to your points above:
I really don't see this as a huge change in complexity and it would be a shame not to do it right.
You understood my meaning correctly. Your proposal is that a website should first select a list of licences to offer and then a user is free to keep switching back and forth between licences on a per-website basis.
By way of illustrating the opposite end of the spectrum, my suggestion is that there is a single licence offered, that it is solely the user's choice whether to accept it, and that it is a one-way decision (optionally including legacy records).
Given you think the more complex version is only a day and a half to implement that doesn't sound bad at all. Just because the warehouse supports all manner of possibility doesn't mean it has to be employed.
Does your estimate include changes to the client side as well?
For point 2, this requirement came from Ella Vogel at the NBN, not David. For point 4, websites will only come on board when they choose to do so. They will need some new code and perhaps configuration (though we could default this). However, I don't think it's going to be considered well-mannered to change the license of records on a website without some change to that website to make it clear to the user, so I think having to change each website is inevitable.
My estimate did include changes to the client side, though I'd only do Drupal 7 (perhaps covering D8 from different funding, and ignoring D6).
Hi all, thanks for the interesting conversation. Apologies for my lack of contribution until now.
It feels important to go back to the very first comment. It seems that John has very clearly captured what needs to be done in order to assign licences to all records and I think it is important that we don’t lose sight of this.
Jim, by way of a little background to the need to assign data licenses...
Over the last 12 months the NBN Secretariat has undertaken a review of data licensing on the NBN Gateway. This was done at a number of workshops and through questionnaires completed by NBN members and Data Partners. The focus of this has been to improve options for data sharing and facilitate increased data use. As David said, having a clear suite of licenses ultimately opens the use of data, as users can clearly identify which data they can use and which they need to avoid or seek express permission to use. The current Terms and Conditions on the Gateway, with ability to add additional constraints as set by the data provider, make it nearly impossible for someone to use an aggregation of datasets as each may have a different ‘bespoke’ license.
One of the main requests from NBN Data Partners has been the ability to assign a data license to their own datasets. The NBN Gateway has now been changed to allow this and four data licenses are available. Data partners can now give their datasets a license via the dataset metadata page on the NBN Gateway. You can read more about this stream of work here: http://www.nbn.org.uk/News/Latest-news/Data-Licensing-on-the-NBN-Gateway.aspx#sthash.aniv5UBG.dpuf
The data license options currently available on the NBN Gateway are Open Government License (OGL), CC0, CC-BY, CC-BY-NC. If there is a requirement to add further licenses we can look into this in due course. These licenses will also be available, as John said, on the Atlas platforms.
For the two questions posed by John: 1) If a project admin sets the license on an existing project, then what happens to existing records that are already licensed by the recorders? The licence should stay as it was originally assigned. I don’t think we should be retrospectively changing licences.
2) Do we need to define where a record centre sits in respect to the term "commercial"? I could imagine wanting a CC-BY-NC (*but allowing access to record centres) license. Yes, this is something we will be looking at over the next few weeks. To date there has been no objection to the four chosen data licenses on the NBN Gateway and the license particulars are all available for users to read and give due consideration to before assigning a given license. However, perhaps we do need to develop a LERC license – I will discuss this further with the rest of the NBN Secretariat.
Are there any main points here that I have missed and need addressing? I’m conscious that we still need to discuss resourcing this as it sounds like there are elements that reach further than the Consultants Portal.
Hi Ella,
Thanks for the info. My initial response was one of surprise because it appeared in the issue queue without me being aware of the background. John has convinced me that his solution is reasonable because it allows full flexibility with very little development effort.
However, I am not yet clear that this full flexibility, putting the licence choice in the recorders' hands and allowing them to keep changing it, is necessary or desirable.
Ok, so I think we are now a bit clearer where we are coming from. Trying to summarise a bit - from the perspective of iRecord, I think Jim's solution could meet the NBN requirements. We could have a single flag in the user profile which the user can tick to "accept CC-BY" licensing. Once ticked it cannot be unticked. All reports that need to grab the license state of a record would join to the users table (or possibly users_websites if it is done on a per-website basis). We could do this in the cache building to minimise the effect on reporting performance. The iRecord dataset on the Gateway could then be split into a dataset with the current unclear license, plus a second dataset with CC-BY licensing. The positive of this approach is its simplicity. Negatives are:
The alternative approach is to provide a configuration table in the warehouse which lists available licenses (OGL, variants of CC etc). The user can select a license and store it in their profile. This license is then assigned to all records that they enter going forward (and optionally all prior unlicensed records). They can change license at any point but only for records going forward.
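As a rough illustration of this approach, here is a minimal Python sketch. It is not Indicia's actual schema or code: the names `AVAILABLE_LICENCES`, `User`, `Occurrence` and `licence_code` are made up for the example, and the in-memory classes simply stand in for the warehouse tables.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical stand-in for the configuration table of enabled licences.
AVAILABLE_LICENCES = {"OGL", "CC0", "CC-BY", "CC-BY-NC"}


@dataclass
class Occurrence:
    id: int
    licence_code: Optional[str] = None  # None means no licence recorded


@dataclass
class User:
    licence_code: Optional[str] = None  # licence currently held in the profile
    occurrences: List[Occurrence] = field(default_factory=list)

    def choose_licence(self, code: str, apply_to_unlicensed: bool = False) -> None:
        """Store the choice in the profile without touching licensed records."""
        if code not in AVAILABLE_LICENCES:
            raise ValueError(f"Unknown licence: {code}")
        self.licence_code = code
        if apply_to_unlicensed:
            # Optionally sweep up prior records that have no licence yet.
            for occ in self.occurrences:
                if occ.licence_code is None:
                    occ.licence_code = code

    def add_occurrence(self, occ_id: int) -> Occurrence:
        """New records pick up whatever licence is in the profile right now."""
        occ = Occurrence(occ_id, self.licence_code)
        self.occurrences.append(occ)
        return occ
```

The point of the sketch is that the licence is copied onto each record at submission time, so a later change of mind by the user only affects records going forward.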
The negatives of this approach are:
Positives of this approach are:
My opinion is that this requirement is all about opening records up. Therefore if we feel that by offering a wider choice to users than a single license option we'll get more open records, then we need to factor in choice.
Hi John. As I said, you've won me round to your way of implementing it on the warehouse since you could do it in less time than we have been talking about it.
It would then be down to different websites to decide on the offering they would make to their users with, I think, every permutation being a possibility from the user changing their licence every day to the user making a one-off choice to accept a single licence.
I'm happy to go with John's suggestion as the extra complexity is required. Thanks
From a Consultants Portal point of view, I am happy that we have decided that an option of licenses is the best way forward.
The infrastructure will then be there for other platforms, such as iRecord, to go down the same route if they wish, or to provide a simpler one-option license.
Thanks for everyone's input.
I am just reading the Atlas of Living Scotland Terms of Use (http://www1.als.scot/terms-of-use/). It says "Note in some cases Content may be in the public domain, in the sense that it is not subject to copyright protection because it does not qualify for copyright, eg individual species sightings"
That explains why, on their recording form, the licence selection only refers to uploaded images.
Do we disagree with this statement?
Thanks for raising this Jim. For the ALA there’s a licence associated with the dataset of records (Creative Commons Attribution 3.0).
http://collections.ala.org.au/public/show/dr364
We haven’t specified this in Atlas of Living Scotland yet but will be making it clearer on the page that records submitted through this route will be assigned such a license.
Each individual image associated with a record does have a separate licence.
Any more feedback on the site would be appreciated.
Having a licence on a dataset is consistent with what is being done on the NBN Gateway (http://nbn.org.uk/News/Latest-news/Data-Licensing-on-the-NBN-Gateway.aspx).
I can see that the proposal put forward here for Indicia, of allowing users to choose licences per record (assuming this is valid), could come into conflict with licensing at the dataset level.
Say I submit a ladybird record to iRecord with the Ladybird mobile app. In the new world of user selected licensing, I pick CC_BY_NC. In the meantime, let us imagine that the Ladybird Recording Scheme has taken up the new licensing options offered by the NBN and is applying CC0 to its dataset.
This effectively prohibits my record from being added to the Ladybird Recording Scheme dataset because the licence I have chosen is more restrictive than the licence chosen by the dataset administrator.
The undesirable consequences are that either my record cannot be added to the LRS data set or the LRS administrator opts for the most restrictive licence to allow inclusion of all available records.
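To make the conflict concrete, here is a minimal Python sketch. It assumes a simple ranking of the four Gateway licences from least to most restrictive, which is a simplification for illustration only (not legal advice, and not something Indicia or the NBN is known to implement); `RESTRICTIVENESS`, `can_join_dataset` and `licence_needed_for_all` are hypothetical names.

```python
from typing import Iterable

# Assumed ranking, least to most restrictive. Treating OGL and CC-BY as
# equivalent is a simplification made only for this example.
RESTRICTIVENESS = {"CC0": 0, "OGL": 1, "CC-BY": 1, "CC-BY-NC": 2}


def can_join_dataset(record_licence: str, dataset_licence: str) -> bool:
    """A record fits only if its licence is no more restrictive than the dataset's."""
    return RESTRICTIVENESS[record_licence] <= RESTRICTIVENESS[dataset_licence]


def licence_needed_for_all(record_licences: Iterable[str]) -> str:
    """The most restrictive licence among the records, which the dataset
    would have to adopt in order to include every one of them."""
    return max(record_licences, key=lambda code: RESTRICTIVENESS[code])
```

Under these assumptions, `can_join_dataset("CC-BY-NC", "CC0")` is false, which is the ladybird case above, and a dataset wanting to keep every record is pushed towards whatever `licence_needed_for_all` returns.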
Would appreciate thoughts from David and Ella before I continue on this development (which I've already started). As Jim says, having a licence selectable by the user seems at first glance to be incompatible with the dataset level licensing of the Gateway and ALA. Some possibilities:
The first option here is more or less exactly the same as the proposed development but where the dataset administrator only enables a single licence option. Therefore we could complete the current development without losing this possibility. However although this option matches the expectation of licencing on the ALA, it does possibly mean that some records will be lost if the recorder disagrees with the dataset administrator's choice.
Thanks for raising this as an issue. I think that as far as the Consultants Portal goes, the licence is set at project level by whoever sets up the project. In the large majority of cases on the Consultants Portal, the people contributing records to one project will all be from the same consultancy, so will be abiding by the rules set by their organisation.
If there is an instance where the user disagrees with the licence set for the project, I would assume that that is something that would be worked out by the consultancy and their consultants. When ad hoc records are added, the user can select a licence for the specific record, but will have to re-select each time they add individual records.
I therefore think that the best course of action here is, as has been suggested, not to give a licence option when adding to a project already set up. All records added to a project will take the licence that is assigned to the project. For all instances of records being added not under a project, the user will choose their own licence.
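Put another way, the rule for the Consultants Portal could be sketched roughly as below. Python is used just for illustration; `resolve_licence` and its parameters are hypothetical and not existing Indicia code, and passing `None` for the project licence is an assumed way of saying the record is not under a project.

```python
from typing import Optional


def resolve_licence(project_licence: Optional[str],
                    user_choice: Optional[str]) -> Optional[str]:
    """Pick the licence for a new record.

    The project licence wins whenever the record belongs to a project;
    otherwise the recorder's own choice applies (and may be None if the
    recorder has not chosen anything yet).
    """
    if project_licence is not None:
        return project_licence
    return user_choice
```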
We have not got a big budget to be spending on these developments, and it is not our intention to be developing the whole system here, but merely developing the Consultants Portal so that it can manage data licences and can subsequently allow these licences to stay with the record through to the NBN Gateway. Hopefully, what is being developed here can act as a basis for developing the iRecord system and other systems when time and budgets allow, but, unfortunately, for now we have to focus only on what will benefit the Consultants Portal.
I hope that I have interpreted this issue correctly and have provided a useful response.
John. Given Ella's sensible re-iteration of the background to this, I suggest you implement the solution for the Consultants Portal.
We can then review how we implement this for iRecord once budget becomes available. One option is:
I have now implemented enough for this to be used on the Consultants Portal. From the original requirements list, this covers:
I also have code (not yet committed) for adding a licence selection control to the user profile editing form.
Closing as now implemented.
Currently, Indicia stores no information about the license associated with a record. It is implicit in the survey dataset how the data is to be managed and licensed.
Requirements:
Questions: