nlesc-sigs / data-sig

Linked data, data & modeling SIG

Making commercial data FAIR #14

Closed dafnevk closed 6 years ago

dafnevk commented 6 years ago

For the corporate networks project, together with @lbogaardt , we make use of a commercial database. We would like to make this data more FAIR.

The current situation is as follows:

1) The data is bought from a commercial organization. I do not know the details of the contract, but only researchers at the UvA can use the raw data.
2) The data is delivered as SQL dumps, with limited metadata (some metadata is available on their online interface to the database).
3) These SQL dumps are processed, modified and put in a MySQL database on a server at the UvA.
4) The research group queries this database, and usually does further analysis on an aggregated dataset (e.g. figures per country or per city, instead of per firm).

A student assistant will soon start working on step 3) for new dumps, and we want to make a plan to do this in a way that makes the data more FAIR. This raises the following questions (and probably more):

c-martinez commented 6 years ago

Levels of data FAIRness can be found here.

c-martinez commented 6 years ago

On the topic of provenance, look at Metric R1.2

dafnevk commented 6 years ago

Additional question: How can we take care of the F part of FAIR when the data is owned by a commercial party?

PatrickAerts commented 6 years ago

I think the F can only be validly implemented through the metadata. These should be in the terms that one would use to search for data. Through the metadata one can ensure that someone can become aware of the very existence of the (data) source, even if that source itself cannot be accessed without taking extra steps (like signing an agreement, accepting a license and paying). I think for the F it does not matter much whether the owner is commercial or not, all the more if you refer to the data that you were actually given by the owner. Best regards, Patrick

From: Dafne van Kuppevelt notifications@github.com Sent: Thursday, 29 March 2018 14:37 To: NLeSC/data-sig data-sig@noreply.github.com Cc: Subscribed subscribed@noreply.github.com Subject: Re: [NLeSC/data-sig] Making commercial data FAIR (#14)

c-martinez commented 6 years ago

I think Patrick is absolutely correct: for F, metadata is enough.

Specifically thinking of FAIR metrics:

So there is no need to give access to the actual data, as long as you can point to a process to get this data (even if the process is "email this person, give them money, pick up the data on a USB stick at their offices").
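
To make that concrete, here is a minimal sketch of a metadata record that could be published even though the data itself stays closed. Everything in it is an assumption for illustration: the field names are loosely borrowed from schema.org's Dataset vocabulary, `build_metadata_record` and `missing_fields` are hypothetical helpers, and all values are placeholders rather than details of the actual database or contract.

```python
import json

# Fields this sketch treats as the minimum for findability (an assumption,
# not an official FAIR metric definition).
REQUIRED_FIELDS = ("name", "description", "keywords", "conditionsOfAccess")


def build_metadata_record(name, description, keywords, access_procedure, contact):
    """Build a public metadata record for a dataset that is not itself public."""
    return {
        "@type": "Dataset",
        "name": name,
        "description": description,
        # Search terms a researcher would plausibly use to look for this data.
        "keywords": sorted(keywords),
        "isAccessibleForFree": False,
        # Human-readable description of how to obtain the data.
        "conditionsOfAccess": access_procedure,
        "contact": contact,
    }


def missing_fields(record):
    """Return the required fields that are absent or empty in the record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Placeholder values only.
record = build_metadata_record(
    name="Example corporate network dataset",
    description="Firm-level records processed from vendor SQL dumps.",
    keywords={"firms", "corporate networks", "ownership"},
    access_procedure="Licensed from a commercial vendor; contact the data steward.",
    contact="data-steward@example.org",
)

print(json.dumps(record, indent=2))
print("Missing fields:", missing_fields(record))
```

The point is only that the record describes the data and the route to it; whether the route is a download link or a license agreement does not change what goes into the searchable part.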

c-martinez commented 6 years ago

@dafnevk , @lbogaardt -- did this answer your question? Can we close this issue?

dafnevk commented 6 years ago

Yes, for now we have enough information!