BOINC / boinc

Open-source software for volunteer computing and grid computing.
https://boinc.berkeley.edu
GNU Lesser General Public License v3.0

shouldn't export credit stats if hosts hidden? #3766

Closed davidpanderson closed 1 year ago

davidpanderson commented 4 years ago

A user email:

If "Do you consent to exporting your data to BOINC statistics aggregation Web sites?" is ticked but "Should BOINC@TACC show your computers on its web site?" is not ticked, you should be exporting only the user details. Host details for that user should be exported in the stats only if both boxes have been ticked.

Your own site's credit statistics pages correctly do not show hosts for a user who has not ticked "Should BOINC@TACC show your computers on its web site?".

An example, other than mine, would be user 4606 who shows "Computers hidden" but I can see they have 3 hosts:

Host: 1654 Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz [Family 6 Model 60 Stepping 3] Linux LinuxMint - Linux Mint 19 Tara [4.15.0-54-generic|libc 2.27 (Ubuntu GLIBC 2.27-3ubuntu1)

Host: 2379 Intel(R) Core(TM) i3-2100 CPU @ 3.10GHz [Family 6 Model 42 Stepping 7] Microsoft Windows 10 - Professional x64 Edition, (10.00.18363.00)

Host: 3738 Intel(R) Celeron(R) CPU N2840 @ 2.16GHz [Family 6 Model 55 Stepping 8] Linux LinuxMint - Linux Mint 19.3 Tricia [5.3.0-46-generic|libc 2.27 (Ubuntu GLIBC 2.27-3ubuntu1)]

These details, along with others, for each host are being shown in the stats file on your web site despite a user requesting in their preferences that they should not be shown.

Ageless93 commented 4 years ago

The option "Should BOINC@TACC show your computers on its web site?" only hides your computers from view via your account; it doesn't hide them from public view or from that user's view. Otherwise these anonymous computers couldn't be used when validating a task they ran against the same task run on other computers.

Perhaps it's better to rename this option to "Should {project} stop showing your computers to the public via your account?" or something similar that says what this option does.

RichardHaselgrove commented 4 years ago

No, it's not a wording problem: it's a real breach of privacy and hence a breach of the GDPR.

On the project's website, I can see user 4606 (initials HB), and his computers are hidden. I can see host 2379 (actually, that's the i7), but its owner is anonymous. The BOINC web software breaks the connection between user and host when requested.

On BOINCstats, I can see that user HB owns an i7 and an N2840 (don't know what happened to the i3) - the connection between user and host has been restored. Since he's opted for privacy, that's illegal.

I can only assume at this stage that the 'userID' field is populated in the HOSTS stats export file, even when privacy is requested - and I assume that's the case for all projects using the BOINC server software. TACC seems to have very few users and hosts at this stage, so I should be able to download and examine the stats export files at home.

RichardHaselgrove commented 4 years ago

Yup, the public host record starts

<host>
  <id>2379</id>
  <userid>4606</userid>
  <total_credit>3369.5779</total_credit>
  <expavg_credit>2.9163</expavg_credit>
  <expavg_time>1590476409.1730</expavg_time>
  <p_vendor>GenuineIntel</p_vendor>
  <p_model>Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz [Family 6 Model 60 Stepping 3]</p_model>

I'll try an older established project, like Albert or SETI Beta, if they're small enough.
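
The check described here can be scripted rather than done by eye. A minimal sketch in Python, assuming the host export has been downloaded and decompressed into a stream whose records look like the excerpt above. The function name `hosts_with_owner` and the `<hosts>` wrapper element in the sample are illustrative assumptions, not taken from the BOINC export tooling:

```python
import io
import xml.etree.ElementTree as ET


def hosts_with_owner(xml_stream):
    """Yield (host_id, user_id) for every <host> record that carries a <userid>."""
    # iterparse streams the file, so a large export need not fit in memory.
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        if elem.tag != "host":
            continue
        userid = elem.findtext("userid")
        if userid:  # element present and non-empty
            yield int(elem.findtext("id")), int(userid)
        elem.clear()  # free the children of the record just processed


# Hypothetical two-record sample: one host exposes its owner, one does not.
sample = io.BytesIO(
    b"<hosts>"
    b"<host><id>2379</id><userid>4606</userid></host>"
    b"<host><id>1</id></host>"
    b"</hosts>"
)
for host_id, user_id in hosts_with_owner(sample):
    print(f"host {host_id} is linked to user {user_id}")
```

Any host that shows up here for a user whose profile says "Computers hidden" would confirm the leak.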

RichardHaselgrove commented 4 years ago

Going back to the issue title, the solution is NOT to suppress the export of credit stats - apart from anything else, that would deprive David of the FLOPs statistics that he uses in his NSF applications. The solution would be to blank (or omit) the <userid> field in the host export record when the privacy flag is set.

And check the other export tables for similar bugs.
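
A minimal sketch of that fix, written in Python rather than the C++ of the server code, purely to illustrate the logic. The function name, the dictionary layout and the `owner_shows_hosts` flag are hypothetical stand-ins for whatever the export code actually uses:

```python
from xml.sax.saxutils import escape


def host_export_record(host: dict, owner_shows_hosts: bool) -> str:
    """Build a <host> export record, omitting <userid> when the owner hides hosts."""
    lines = ["<host>", f"  <id>{host['id']}</id>"]
    if owner_shows_hosts:
        # Only link the host to its owner when the owner has allowed it.
        lines.append(f"  <userid>{host['userid']}</userid>")
    # Credit and hardware fields are always exported, so aggregate
    # statistics are unaffected by the privacy setting.
    lines.append(f"  <total_credit>{host['total_credit']:.4f}</total_credit>")
    lines.append(f"  <p_model>{escape(host['p_model'])}</p_model>")
    lines.append("</host>")
    return "\n".join(lines)


host = {
    "id": 2379,
    "userid": 4606,
    "total_credit": 3369.5779,
    "p_model": "Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz",
}
print(host_export_record(host, owner_shows_hosts=False))
```

With the flag off, the record still carries the credit and hardware fields but no longer links the host to user 4606.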

Ageless93 commented 4 years ago

I still think that if we offer to show or hide the computers on the website, we should either follow through on that, so that when the user chooses to hide them they're not visible to anyone (maybe including the user?), or rename the option to make clear that they aren't really hidden from everyone's view.

Yes, we should completely anonymize them, especially before storing them in publicly available stats files. But that's not what the option at this moment implies.

RichardHaselgrove commented 4 years ago

Einstein@Home already handles this properly - Gary Roberts' computers are (correctly) shown as hidden on BOINCstats.

https://www.boincstats.com/stats/5/user/detail/12521

So does SETI@Home - Mr. Kevvy's are hidden too.

https://www.boincstats.com/stats/0/user/detail/9652

RichardHaselgrove commented 4 years ago

https://github.com/BOINC/boinc/blob/master/sched/db_dump.cpp#L465 ff looks right. Why isn't TACC using it?

TheAspens commented 4 years ago

IMHO - this feature works differently than people think it does and I think the text around it is misleading.

This feature to "not show hosts" isn't about preventing anyone from seeing the hosts but rather it prevents seeing who owns the hosts. It might be useful to change the wording on the feature so that it is clearer about what it does.

AenBleidd commented 1 year ago

Closed in favor of #5398