Closed billsacks closed 6 months ago
For privacy reasons we would want an opt-out option (and this may be required legally, e.g., in the EU). The way I envision this working is as follows, but others may have different / better ideas:
You should get some legal advice, but from what I can tell, the GDPR requires an "opt-in" approach before data that can be tied to an individual can be collected. There are "legitimate interest" exceptions, but even then you must provide a mechanism for a challenge. Then, if the data is to be transferred to the US, it has to meet the requirements of the EU-US Data Privacy Framework (DPF). IANAL but you should consult one.
We don't plan on collecting 'personally identifying' info, and the data gets anonymized, which seemed to be fine to UCAR legal at first glance (so long as the CESM webpage has a notice about it). But this is a good reminder to go back to them with a more detailed write-up and ensure we're fine.
Cheers,
Thanks @gold2718. Yeah, we will definitely want to get a sign-off on the data collection approach. I have renamed the issue to "opt-in/out"... the way I have envisioned this (and tried to describe it) actually feels more like an opt-in approach already: not collecting any data unless the user gives permission when this message appears.
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days.
This issue was closed because it has been stalled for 5 days with no activity.
I have been talking with @briandobbins @jedwards4b and others this week about building out our capabilities to collect statistics on CESM usage. I am opening this issue to record some initial ideas and start a discussion on how we might want to implement this.
The basic idea is that we want a feature – that users could opt into or out of – that sends some information back to us to let us collect data on how many people are using CESM and – at least eventually – details of the kinds of configurations they are running. This could be done in create_newcase or case.submit; the latter could be better in terms of allowing us to collect information on what's actually being run. The initial implementation could collect minimal information (even a simple ping where we can record the IP address would be useful information, so we can get an overall sense of usage); we could then build this out over time.
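To make the "simple ping" idea concrete, here is a rough sketch of what a minimal payload could look like. Everything named here is an assumption for illustration only: the endpoint URL, the field names, and the functions `build_ping` and `send_ping` are hypothetical and are not existing CIME code.

```python
import json
import urllib.request

# Hypothetical endpoint; no such server is named in this issue.
STATS_URL = "https://example.invalid/cesm-usage"


def build_ping(model_version, event):
    """Return the JSON body for a minimal usage ping.

    Deliberately carries no user name, host name, or paths; the server
    would only see the sender's IP address from the connection itself.
    """
    if event not in ("create_newcase", "case.submit"):
        raise ValueError(f"unknown event: {event}")
    payload = {"model_version": model_version, "event": event}
    return json.dumps(payload, sort_keys=True).encode("utf-8")


def send_ping(body, url=STATS_URL, timeout=5.0):
    """POST the ping; a network failure must never break the user's workflow."""
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, ValueError):
        # URLError and HTTPError are OSError subclasses; treat any failure as "not sent".
        return False
```

Keeping the payload construction separate from the sending step would let the payload grow over time (for example, configuration details) without touching the network code.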
For privacy reasons we would want an opt-out option (and this may be required legally, e.g., in the EU). The way I envision this working is as follows, but others may have different / better ideas:
.cime directory?). This would allow bypassing this message in the future.
--collect-statistics or --no-collect-statistics, which would populate an xml variable like COLLECT_STATISTICS for the given case; this would override whatever default the user set previously.
Depending on what statistics we want to capture, it might also be helpful for (hidden) files to be created in SRCROOT and/or CASEROOT the first time this is run: By putting a file in SRCROOT, we can see if the given clone has already been counted: if this file is not yet present, then we haven't counted this clone yet. Similarly, by putting a file in CASEROOT, we can see if the given case has already been counted. (This would facilitate counting the number of times CESM/E3SM is cloned, and the number of unique cases created.)
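As a rough sketch of how the precedence and the "already counted" markers described above might fit together: the function names and the marker file name below are hypothetical, and how the stored per-user choice is read is left out entirely.

```python
import tempfile
from pathlib import Path

MARKER_NAME = ".usage_stats_counted"  # hypothetical hidden file name


def resolve_collect_statistics(cli_flag, user_default):
    """Decide whether to collect statistics for one case.

    cli_flag:     True/False if --collect-statistics / --no-collect-statistics
                  was given, None if neither was.
    user_default: True/False from the stored per-user choice, None if the
                  user has never been asked.
    Returns (collect, must_prompt).
    """
    if cli_flag is not None:
        return cli_flag, False   # explicit flag overrides the stored default
    if user_default is not None:
        return user_default, False
    return False, True           # no permission yet: collect nothing and ask


def mark_if_new(root):
    """Create the marker file in root; return True only the first time."""
    marker = Path(root) / MARKER_NAME
    try:
        marker.touch(exist_ok=False)
    except FileExistsError:
        return False
    return True


if __name__ == "__main__":
    # Stand-in for SRCROOT or CASEROOT.
    with tempfile.TemporaryDirectory() as srcroot:
        print(mark_if_new(srcroot))  # first run for this clone
        print(mark_if_new(srcroot))  # any later run
```

The same `mark_if_new` call would work for both SRCROOT (counting clones) and CASEROOT (counting unique cases). Defaulting to "collect nothing" until the user has answered keeps the behavior opt-in.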
I'll note that, for simply collecting data on the number of clones of CESM, we considered implementing something in manage_externals (e.g., hitting an NCAR server whenever manage_externals is run so that we can collect the kind of data that used to be collected when we hosted CESM via svn), but since the long-term goal is to collect additional statistics, we thought we might as well put this functionality in the right long-term place, which is CIME.
Thoughts?