OCDX / OCDX-Specification

Specification to describe the minimum information standard for online community data. Guidelines for describing data about online communities.

Minimal and Maximal Sets for Privacy and Ethics #26

Open sgoggins opened 8 years ago

sgoggins commented 8 years ago

Working with @moduloone and @katieshilton on this part of the OCDX. Correspondence and document attached.

Hi all,

I've attached a document that I hope can serve as a starting point for the max-set for the privacyEthics() portion of the manifest.

Essentially, I'd like us to consider including two additional fields (highlighted in yellow in the document):

oversightProvenance(): A field that can contain IRB approval numbers. (Should be used in conjunction with the oversight() field.)

tosCompliance(): Includes three sub-fields: complianceAssertion(), tosVersionInformation(), and tosArchive(). The idea is to give a researcher a way to indicate that yes, their collection was compliant at the time of data collection (or, in a rare case, that no, it was not, but they collected it anyway), to provide info on which version of a TOS the data was collected under, and to include a pointer to an archived copy of the TOS if available.

Cardinality info should be in the attachment.
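
To make the proposal above easier to picture, here is a minimal Python sketch of the two fields as they might appear in a manifest fragment. It is only an illustration: the field names come from this comment, but the types, the default of "no_assertion", the JSON layout, and the helper to_manifest_fragment() are assumptions, and the real cardinalities are in the attachment rather than here.

```python
from dataclasses import dataclass, asdict, field
from typing import List, Optional
import json


@dataclass
class TosCompliance:
    # Assumed values: "compliant", "not_compliant", or "no_assertion".
    complianceAssertion: str = "no_assertion"
    # Which version of the TOS applied at collection time (free text here).
    tosVersionInformation: Optional[str] = None
    # Pointer to an archived copy of the TOS, if one is available.
    tosArchive: Optional[str] = None


@dataclass
class PrivacyEthics:
    # Existing oversight() field; its content model is assumed here.
    oversight: Optional[str] = None
    # For example, IRB approval numbers (a list is an assumed cardinality).
    oversightProvenance: List[str] = field(default_factory=list)
    tosCompliance: TosCompliance = field(default_factory=TosCompliance)


def to_manifest_fragment(pe: PrivacyEthics) -> str:
    """Serialize to JSON, dropping fields that were left empty."""
    def prune(value):
        if isinstance(value, dict):
            cleaned = {k: prune(v) for k, v in value.items()}
            return {k: v for k, v in cleaned.items() if v not in (None, [], {})}
        return value

    return json.dumps({"privacyEthics": prune(asdict(pe))}, indent=2, sort_keys=True)
```

Calling to_manifest_fragment(PrivacyEthics()) yields only the default complianceAssertion, which shows how a contributor could leave everything else blank without the fragment becoming invalid.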

Finally, while I think this is a good starting point for the max set, I'm hoping that we may have the opportunity to actually talk to some of the early OCDX manifest users (or manifest creators) to better understand whether these privacy and ethics fields are meeting their needs, so we can revise as appropriate.

Thoughts?

AniKarenina commented 8 years ago

The only thing I can think of that might need to be checked is whether it’s stated/defined broadly enough to accommodate non-US oversight models. I suspect so, but it's worth verifying if it hasn’t been already.

tosArchive() seems like a highly rarified detail for anyone to be recording proactively, so the odds anyone can answer it may be extremely low. And with this kind of thing, the more you ask for, the less you get.

One of my concerns is that asking for things that people didn’t know to record and/or can’t reconstruct could frustrate/discourage participation. Instead of skipping an item or using “no assertion” they may decide not to do any of it at all, and making items optional doesn’t necessarily solve that problem.

Anyway, just $0.02 based on years of hearing people say that asking for too much is counterproductive when it comes to voluntary contributions of data and metadata!

(In reply to Sean P. Goggins's opening comment of August 9, 2016, on https://github.com/OCDX/OCDX-Specification/issues/26.)

moduloone commented 8 years ago

Hi Ani,

I think you raise some really good points. I think it's reasonable to drop tosArchive(), particularly if you are concerned about frustrating participation, which I obviously do not want. I do think it's still worth keeping tosVersionInformation() so there is an option to create a trail that basically says, "Yes, this was ok to pull/scrape at the time of collection because we used these rules."

I think this is useful because of the problems that have cropped up in the wake of a number of Twitter's TOS revisions, where datasets collected under one set of rules are no longer seen as compliant under the current ones.

Edit: And I'm looking into the non-US question.

AniKarenina commented 8 years ago

Generally agree on tosVersionInformation(), aside from the part where there’s no standard for “versioning” TOS (that I’ve seen, but I may have missed it), and even if there were, not everyone would use it, so we need good examples to keep it from getting confusing. I would have no idea how to answer that question myself other than TOS-as-of-date-says-it’s-OK.
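
One way the "TOS as of date" answer could be written down as a concrete value is sketched below in Python. This is a hypothetical illustration, not part of the OCDX specification: the function name, the key names asOfDate, tosUrl, and sha256, and the idea of fingerprinting the TOS text with a hash are all assumptions made for the example.

```python
import hashlib
from datetime import date
from typing import Dict, Optional


def tos_version_information(collected_on: date, tos_url: str,
                            tos_text: Optional[str] = None) -> Dict[str, str]:
    """Describe the TOS in force by date, since no version number exists."""
    info = {
        "asOfDate": collected_on.isoformat(),  # hypothetical key name
        "tosUrl": tos_url,                     # hypothetical key name
    }
    if tos_text is not None:
        # Optional fingerprint of the exact wording that was read.
        info["sha256"] = hashlib.sha256(tos_text.encode("utf-8")).hexdigest()
    return info
```

A contributor who only knows the collection date and the TOS URL can still fill this in, and the hash is there only for those who happened to save the text.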

(In reply to Nicholas P's comment of August 11, 2016: https://github.com/OCDX/OCDX-Specification/issues/26#issuecomment-239277813)