IQSS / dataverse

Open source research data repository software

How many datasets do I have? Are these 163 more things really My Data? #5880

Closed: pdurbin closed this issue 1 month ago

pdurbin commented 5 years ago

From the discussion in #3038, I know we don't plan to fix anything in "My Data" any time soon, but I was just talking to @TaniaSchlatter and @djbrooke about #5874, and I'd like to point out what I consider to be a usability problem, or at least a point of confusion.

When I first encounter the term "My Data" after clicking my name in the top right corner like this...

[Screenshot: Harvard_Dataverse_-_2019-05-23_11 39 40]

... I'm expecting to see my stuff, my data. However, I see a bunch of stuff I've never heard of. Here's how it looks:

[Screenshot: Account_-_Harvard_Dataverse_-_2019-05-23_11 39 57]

I find this very confusing and disorienting. I thought this was supposed to be my data. Because I'm familiar with this feature, I know that if I simply uncheck "File Downloader" under "Roles", I get a list of my actual data. This is all stuff I uploaded to Harvard Dataverse:

[Screenshot: Account_-_Harvard_Dataverse_-_2019-05-23_11 40 29]

I can see at a glance, for example, that I have two dataverses and that one of them is published and one of them is unpublished. This makes me happy. It's my data. 😄

All of this is a long visual way of saying that perhaps when someone visits My Data the "File Downloader" role should be unchecked by default. Otherwise the experience is confusing and perhaps even a little frustrating.

Alternatively we could rename the feature from "My Data" to "Stuff That Someone You May Or May Not Know Has Given You Access To Mixed In With Your Data". 😄

TaniaSchlatter commented 5 years ago

I ran into this yesterday when testing a different feature. I created a new account, and as soon as it was created, the first thing I saw as a new account holder was the same list of unfamiliar results in the "My Data" view. I assume there were reasons behind the decision about what is shown by default. We should find out more about what users expect and, depending on what we learn, possibly revisit the current functionality and rationale.

dlmurphy commented 5 years ago

I plan to do some research to figure out which factors influence what appears on the "My Data" page. I discussed this with @pdurbin today, and our current hypothesis is that creating a new account via Harvard Shibboleth automatically gives you the "File Downloader" role on any dataset that grants that role to all Harvard Shibboleth users, thus filling your "My Data" page with them.

Regardless of whatever my research uncovers, I do agree with Phil that it's self-evidently helpful to not auto-select the "File Downloader" role on the My Data page, as these datasets could in no way be construed as "your data".

dlmurphy commented 5 years ago

The hypothesis above is confirmed. I created an account using Harvard Shibboleth and reproduced Phil's situation: my "My Data" page was full of datasets I had the "File Downloader" role on. I checked these datasets, and they had this role assigned to ALL Harvard Shibboleth users. Here's a screenshot showing this in the Roles table of one of these datasets:


So: anyone using the Harvard University institutional login has 169 (and counting) mostly irrelevant dataverses and datasets showing up on their My Data page just because someone set those dataverses and datasets to allow file downloads for all users affiliated with Harvard.
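To make the mechanism concrete, here is a minimal sketch of the behavior described above. It is not Dataverse's actual code: `RoleAssignment`, `my_data`, the `@jdoe` user, the `institution-login-users` group, and the role subset are all hypothetical names chosen for illustration. It only shows how a role granted to a whole login group ends up in one user's My Data list, and how leaving "File Downloader" out of the default role filter would change the result.

```python
from dataclasses import dataclass

FILE_DOWNLOADER = "File Downloader"
# Illustrative subset of role names; the real list may differ (assumption).
ALL_ROLES = {"Admin", "Contributor", "Curator", FILE_DOWNLOADER}


@dataclass(frozen=True)
class RoleAssignment:
    item: str      # title of a dataverse or dataset
    role: str      # role name granted on that item
    assignee: str  # a user identifier or a group identifier


def my_data(user, groups, assignments, selected_roles):
    """Return titles of items where the user, directly or through a group,
    holds one of the roles currently checked in the filter."""
    identities = {user, *groups}
    return sorted({
        a.item
        for a in assignments
        if a.assignee in identities and a.role in selected_roles
    })


assignments = [
    RoleAssignment("My Dataverse", "Admin", "@jdoe"),
    RoleAssignment("My Dataset", "Contributor", "@jdoe"),
    # Granted to everyone who logs in through the institution (hypothetical group).
    RoleAssignment("Someone Else's Dataset", FILE_DOWNLOADER, "institution-login-users"),
]

# Behavior reported in this issue: every role is checked by default.
current_default = my_data("@jdoe", ["institution-login-users"], assignments, ALL_ROLES)

# Proposed behavior: "File Downloader" starts unchecked.
proposed_default = my_data(
    "@jdoe", ["institution-login-users"], assignments, ALL_ROLES - {FILE_DOWNLOADER}
)

print(current_default)
print(proposed_default)
```

With every role checked, the group-wide assignment pulls in the unfamiliar dataset; with "File Downloader" unchecked, only the items the user actually works on remain.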

dlmurphy commented 5 years ago

I've created a document here that sums up the research for this issue.

dlmurphy commented 5 years ago

#3185 is relevant to this issue and includes a request for one possible way of addressing the problem.

dlmurphy commented 5 years ago

In #5965 I added a bit to the "My Data" section of the user guide to explain this phenomenon:

If you see unexpected dataverses or datasets in your My Data page, it might be because someone has assigned your account a role on those dataverses or datasets. For example, some institutions automatically assign the "File Downloader" role on their datasets to all accounts using their institutional login.

This doesn't solve the problem by any means, but at least there's an explanation of it in our documentation now.

cmbz commented 1 month ago

To focus on the most important features and bugs, we are closing issues created before 2020 (version 5.0) that are not new feature requests with the label 'Type: Feature'.

If you created this issue and you feel the team should revisit this decision, please reopen the issue and leave a comment.

pdurbin commented 1 month ago

This is still a usability problem, but I'll simply refer to this closed issue when someone else brings it up. 🤷