2i2c-org / features

Temporary location for feature requests sent to 2i2c
BSD 3-Clause "New" or "Revised" License

Naming conventions for shared folders #4

Open aculich opened 3 years ago

aculich commented 3 years ago

This is a feature request for future hubs.

I suggest the naming convention for shared folders should follow this pattern:

Hope that helps!

choldgraf commented 3 years ago

This makes sense to me, though with one exception, which is that (I think?) it is not currently possible for non-admins to write to any shared folder. If I recall, that was an intentional decision from @yuvipanda but maybe it was instead just a temporary decision? We should get clarity there.

aculich commented 3 years ago

As currently configured on the mills hub, it seems I can write to shared-readwrite but not to shared. However, I'm an admin, so maybe that's why I can at least write to one of them? In which case, who has write access to the shared folder if not the admins?

And weirder still... I just noticed that even though it told me the shared directory is a Read-only file system it actually created my test file in it! That seems like a bug?!

jovyan@jupyter-aculich-40berkeley-2eedu:~$ pwd
/home/jovyan
jovyan@jupyter-aculich-40berkeley-2eedu:~$ ls -la shared*
total 4
drwxr-xr-x 2 jovyan jovyan    6 Jan 21 17:21 shared
drwxr-xr-x 2 jovyan jovyan    6 Jan 21 17:21 shared-readwrite
jovyan@jupyter-aculich-40berkeley-2eedu:~$ ls -la shared
total 0
drwxr-xr-x 2 jovyan jovyan  6 Jan 21 17:21 .
drwxr-xr-x 6 jovyan jovyan 72 Jan 21 18:38 ..
jovyan@jupyter-aculich-40berkeley-2eedu:~$ touch shared/test
touch: cannot touch 'shared/test': Read-only file system
jovyan@jupyter-aculich-40berkeley-2eedu:~$ touch shared-readwrite/test
jovyan@jupyter-aculich-40berkeley-2eedu:~$ find shared*
shared
shared/test
shared-readwrite
shared-readwrite/test
jovyan@jupyter-aculich-40berkeley-2eedu:~$ ls -lah shared-readwrite/test
-rw-r--r-- 1 jovyan jovyan 0 Feb 22 18:45 shared-readwrite/test
jovyan@jupyter-aculich-40berkeley-2eedu:~$ ls -lah shared/test
-rw-r--r-- 1 jovyan jovyan 0 Feb 22 18:45 shared/test

yuvipanda commented 3 years ago

So currently, admins can write to 'shared-readwrite', and it'll show up as 'shared' to everyone else. In other systems, where there was no separate path, users often stepped on each other's toes by accidentally deleting everything in shared. Hence the different naming conventions. I don't think this has actually been communicated to anyone, nor is anyone actively using it right now, so we can definitely re-engineer this as we wish.

aculich commented 3 years ago

Aha! So they really are the same folder, just presented to admins as two different folders. That makes sense. So end users all see only shared.
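To make the "same folder, two names" behaviour concrete, here is a minimal sketch in Python. It is only a model of what the transcript above shows, not how the hub is actually configured: `SharedStore` and `View` are hypothetical names, and the real setup presumably uses two mounts of one directory rather than Python objects.

```python
import errno


class SharedStore:
    """Hypothetical stand-in for the single underlying shared directory."""

    def __init__(self):
        self.files = set()


class View:
    """One path under which the store is presented, with its own write flag."""

    def __init__(self, store, writable):
        self.store = store
        self.writable = writable

    def touch(self, name):
        if not self.writable:
            # Mirrors the "Read-only file system" error seen in the transcript.
            raise OSError(errno.EROFS, "Read-only file system", name)
        self.store.files.add(name)

    def listing(self):
        return sorted(self.store.files)


store = SharedStore()
shared = View(store, writable=False)           # the name everyone sees
shared_readwrite = View(store, writable=True)  # the extra name admins see

try:
    shared.touch("test")
except OSError as exc:
    print("shared:", exc.strerror)

# Writing through the read-write name succeeds, and the file then
# appears under both names because they share one store.
shared_readwrite.touch("test")
print(shared.listing())
```

Under this model the file appearing in shared is expected rather than a bug: the failed touch created nothing, and the later touch through shared-readwrite is what made test show up in both listings.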

To finalize setting up the Mills hub in 2i2c-org/infrastructure#178 I've requested that we also set up a private folder that admins can use to share files with other admins only.

yuvipanda commented 3 years ago

How about this sequence:

  1. shared/mine will be read/write where you can put in whatever you want, and it'll be visible to everyone else
  2. shared/others/<username> will be readonly, and show the shared folders of every other user!
  3. shared/public will be readonly for everyone, and read-write for admins.
  4. (much later) we can do shared/group/<group-name> for read-write access to members of a group

The particular names are up for change, but what do you think of this?
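As a rough illustration of items 1 to 3, the rules could be written down as a small lookup like the one below. This is a sketch under assumptions, not a proposal for the real implementation: `access_for` is a hypothetical helper, paths are taken relative to the user's home directory, and the group folders from item 4 are left out because nothing above says what non-members would see.

```python
def access_for(path, is_admin=False):
    """Return "rw" or "ro" for a path as seen by the current user.

    Returns None for anything outside the three folders sketched above.
    """
    parts = path.strip("/").split("/")
    if len(parts) < 2 or parts[0] != "shared":
        return None
    area = parts[1]
    if area == "mine":
        # Item 1: your own space, writable by you and visible to others.
        return "rw"
    if area == "others":
        # Item 2: every other user's shared folder, read-only.
        return "ro"
    if area == "public":
        # Item 3: read-only for everyone, read-write for admins.
        return "rw" if is_admin else "ro"
    return None


print(access_for("shared/mine/notes.txt"))
print(access_for("shared/others/alice/notes.txt"))
print(access_for("shared/public/data.csv"))
print(access_for("shared/public/data.csv", is_admin=True))
```

The usernames and file names in the example calls are made up for illustration.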

aculich commented 3 years ago

I think we should understand this on a per-use-case basis:

Education hub for many official courses

  1. shared/<course-name> will be readonly for everyone, and read-write for admins (instructors and GSIs).
  2. private/<course-name> only visible to admins (instructors and GSIs) with write-access

Mills is currently this kind of hub: a single hub that serves multiple courses. In this case, all instructors would share a common private area that is visible, readable, and writable only to those instructors. Any instructor could write in any course folder or even create a new course folder themselves. There are no technical safeguards, which makes this easy to configure; instructors police themselves and resolve issues through some social (not technical) mechanism.

This works fine for a small institution like Mills with 10s of instructors and GSIs, but might need to be adapted for larger institutions with 100s of instructors and 1000s of GSIs.

This use case is also the generic datahub use case where multiple courses and multiple instructors/GSIs all share the same datahub and filename spaces? Isolation

Education hub for a single official course

  1. shared/<course-name> will be readonly for everyone, and read-write for admins (instructors).
  2. private/<course-name> only visible to admins (instructors).

As I understand it, this is how some official UCB courses such as cs194 operate? They have their own hub (is the underlying cluster shared or is it separate?) with their own image and their own set of instructors/GSIs that is isolated from the rest of the datahub.

Training hub for workshops

  1. shared/<workshop-name> will be readonly for everyone, and read-write for admins (instructors).
  2. private/<workshop-name> only visible to admins (instructors).
  3. datasets/<dataset-name> will be readonly for everyone, and read-write for admins (instructors). Data set names are independent from the name of workshops which may reference the data sets.

This is the D-Lab use case. It is similar to the generic datahub/Mills use case in which many D-Lab workshops run on a single hub, with all users having access to the shared spaces of any workshop (whether or not they are taking the workshop). We add a special datasets directory separate from the individual workshops, as we may have some (especially large) datasets that live in a global namespace and may be referenced by multiple workshops.
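The three top-level folders for this use case can be summarized as a small table. The sketch below is only an illustration: `TRAINING_HUB_LAYOUT` and `training_hub_access` are hypothetical names, the workshop and dataset names in the calls are invented, and giving admins read-write access to private/ is an assumption, since the list above only says it is visible to admins.

```python
# top-level folder -> (access for regular users, access for admins)
# None means the folder is not visible at all.
TRAINING_HUB_LAYOUT = {
    "shared": ("ro", "rw"),
    "private": (None, "rw"),  # admin "rw" is an assumption, see above
    "datasets": ("ro", "rw"),
}


def training_hub_access(path, is_admin=False):
    """Look up access by the first path component only."""
    top = path.strip("/").split("/")[0]
    if top not in TRAINING_HUB_LAYOUT:
        return None
    user_mode, admin_mode = TRAINING_HUB_LAYOUT[top]
    return admin_mode if is_admin else user_mode


print(training_hub_access("shared/intro-workshop"))
print(training_hub_access("private/intro-workshop"))
print(training_hub_access("private/intro-workshop", is_admin=True))
print(training_hub_access("datasets/big-dataset"))
```

Because access depends only on the top-level folder, a dataset under datasets/ stays independent of whichever workshops reference it.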

Research hub for multiple group-based research projects

  1. (much later) we can do shared/group/<group-name>/<project-name> for read-write access to members of a group
  2. (much later) we can do shared/group/<group-name>/<dataset-name> for read-only access to specific datasets for members of a group
  3. private/datasets/<dataset-name> this will only be visible and read-write for admins (data curators). Data set names are independent from the name of projects which may reference the data sets. These are private datasets that get mounted to one or more groups such as
  4. datasets/<dataset-name> will be readonly for everyone, and read-write for admins (project maintainers). Data set names are independent from the name of projects which may reference the data sets. These are global data sets, whereas individual projects may have their own private datasets in their group-project directory from 1 above.

This is another D-Lab use case. We have a single hub (possibly combined with the Training hub use case).

Research hub for single group with multiple projects

  1. shared/<project-name> will be readonly for everyone, and read-write for admins (group members).
  2. private/<project-name> only visible to admins (group admins).
  3. datasets/<dataset-name> will be readonly for everyone, and read-write for admins (group admins). Data set names are independent from the name of projects which may reference the data sets.

This is another D-Lab use case. This would also work for a Discovery or URAP use case, or a faculty research project/group. It is also much simpler than the previous use case from a per-hub perspective; there is just more overhead in setting up a new hub for each new group that needs one. The one group in this case has multiple projects, and the assumption is that all people in the group have full access to all the shared/global spaces. If a single group needs differential access for its members, then we would create a separate hub for a different group to keep this a very simple model.

yuvipanda commented 7 months ago

Tracked in this feature: https://2i2c.productboard.com/feature-board/7803674-product-ideas/features/25937913/detail

aculich commented 7 months ago

@yuvipanda when I try that link I get:

This board can't be found. It was either deleted or the link you have might be broken.

So maybe the productboard is internal to 2i2c only?

yuvipanda commented 7 months ago

@aculich yeah, we're working on figuring out how to make sure it's publicly visible! Hold on :)