Open guillaumeeb opened 2 years ago
Thank you @guillaumeeb
- Dataset should be only visible to some users? yes.
- Dataset should be only writable by some users, but can be viewed and read by anyone? no.
- What do you mean between owner and group?
On a Linux system in an HPC center, we create a Unix group and add the users who want to share data to that group. We then control access with chmod g+r,o-r toto.nc. In our case, 'other' is the internet. 'Group' can be all the people who have EGI-authorised access to the Pangeo cloud, or the Pangeo cloud admin group. My question is: how do we create a bucket that is accessible only to this group of people, and not from the internet, using our EGI authentication system?
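For reference, here is a minimal Python sketch of the HPC-style permission model described above, i.e. the equivalent of chmod g+r,o-r toto.nc. The file name toto.nc comes from the comment; the temporary directory and the helper name share_with_group_only are illustrative assumptions, and the snippet assumes a POSIX file system.

```python
import os
import stat
import tempfile


def share_with_group_only(path: str) -> int:
    """Apply the equivalent of `chmod g+r,o-r` and return the resulting mode bits."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    mode |= stat.S_IRGRP    # g+r: the Unix group may read the file
    mode &= ~stat.S_IROTH   # o-r: everybody else may not
    os.chmod(path, mode)
    return stat.S_IMODE(os.stat(path).st_mode)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toto.nc")
    with open(path, "wb") as f:
        f.write(b"placeholder")
    os.chmod(path, 0o644)  # start from rw-r--r--
    new_mode = share_with_group_only(path)
    print(oct(new_mode))
```

The question in this thread is how to get the same "group yes, others no" behaviour on an object store, where these mode bits do not exist.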
OK, so I think we'll need @sebastian-luna-valero's help on this one, and probably some of CESNET staff also. I can still try to answer some points.
There is no user:group concept in cloud and object storage; things are different. You've got user accounts (EGI here), projects or tenants (Pangeo VO I guess), and you can usually define roles and policies on top of those. These policies are a kind of ACL (Access Control List): they define who can perform which operation on a project or on a bucket/container. I'm not sure how this is implemented at CESNET, but I checked in the docs that something like this is possible on OpenStack.
By default, with the Horizon interface or during bucket/container creation, we can only specify whether a container is public (visible on the internet) or not. So I think the situation is as below:
Be careful: if you create an S3 access/secret key pair and give it to another person, it will by default be an admin key pair.
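To make the ACL idea concrete, here is a toy Python model of "who can perform which operation on a container". It is only a conceptual sketch: the class ContainerACL, the identity names, and the "*" wildcard are assumptions made for illustration, not the OpenStack or CESNET API.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ContainerACL:
    """Toy access control list: maps an operation to the identities allowed to perform it."""
    read: set[str] = field(default_factory=set)
    write: set[str] = field(default_factory=set)

    def allows(self, identity: str, operation: str) -> bool:
        allowed = {"read": self.read, "write": self.write}[operation]
        # "*" stands for "anyone", i.e. a public container in this toy model.
        return identity in allowed or "*" in allowed


# Hypothetical identities: VO members may read, only VO admins may write.
acl = ContainerACL(read={"vo-member", "vo-admin"}, write={"vo-admin"})
print(acl.allows("vo-member", "read"))
print(acl.allows("vo-member", "write"))
print(acl.allows("anonymous", "read"))
```

Unlike Unix mode bits, nothing here is tied to a file owner or a single group: each operation simply carries its own list of allowed identities.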
So to know if we can set more precise rules, we'll need help from other people to know which Openstack command we could type, and if this is compatible with S3 or only Swift credentials.
Hello,
Here is the current situation:
Please note that currently:
If we need something intermediate, we will need to explore options in: https://docs.openstack.org/swift/latest/overview_acl.html
Please let me know your thoughts.
Best regards, Sebastian
Hi Sebastian,
The use case I have in mind requires 'something intermediate'.
I'll have some users who do not require OpenStack dashboard access but do require DaskHub, and who need 'private' buckets restricted to these users. It is OK for Pangeo VO admins to access this data, as they are admins.
I have a related question for @sebastian-luna-valero. If we use S3 access through the MinIO server proposed in the IM Dashboard, do we have different types of user groups? Or, since user control will be backed by EGI Check-in anyway, is it the same as using OpenStack object storage directly from CESNET?
Hi,
To address this issue I have opened: https://github.com/pangeo-data/pangeo-eosc/pull/23
Here is the status after merging that PR:
- [...] DaskHub)? Members of the pangeo.admins VO group in aai.egi.eu.
- [...] DaskHub? Members of the vo.pangeo.eu VO in aai-dev.egi.eu. Ideally we want this to be moved to aai.egi.eu as well.
- [...] vo.pangeo.eu VO in aai.egi.eu.
Now, following instructions to configure awscli, users that want private buckets should be able to do that using --acl private with aws s3 commands.
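As an illustration of the --acl private option mentioned above, here is a small Python sketch that only builds (and prints) an aws s3 cp command line; it does not contact any object store. The bucket name, object key, and endpoint URL are hypothetical placeholders, and whether the Swift backend honours the ACL is exactly what still needs testing.

```python
import shlex


def private_upload_command(local_path, bucket, key, endpoint_url):
    """Build an `aws s3 cp` command that uploads one object with the canned ACL `private`."""
    return [
        "aws", "s3", "cp", local_path, f"s3://{bucket}/{key}",
        "--acl", "private",
        "--endpoint-url", endpoint_url,
    ]


cmd = private_upload_command(
    "toto.nc",
    "example-bucket",                    # hypothetical bucket name
    "data/toto.nc",                      # hypothetical object key
    "https://object-store.example.org",  # hypothetical S3 endpoint
)
# Print the command so it can be reviewed before being run by hand.
print(shlex.join(cmd))
```

Passing the list to subprocess.run(cmd, check=True) would execute it, assuming awscli is installed and configured with the S3 credentials.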
All of the above should address the comments from @tinaok
The use case I have in mind requires 'something intermediate'. I'll have some users who do not require OpenStack dashboard access but do require DaskHub, and who need 'private' buckets restricted to these users. It is OK for Pangeo VO admins to access this data, as they are admins.
Regarding the question about MinIO: if you deploy it with the IM Dashboard, you have full control over it (i.e. you can decide to configure EGI Check-in or any other user accounts/groups). However, please bear in mind that it's not only about deploying and configuring MinIO; it will also be another service for us to maintain. Therefore, I would leave this as a last resort, and use the object storage at CESNET that is already managed.
Now, following instructions to configure awscli users that want private buckets should be able to do that using --acl private with aws s3 commands.
So what you are saying here is that once we've set up our AWS S3 credentials, we can use aws s3 commands, following https://docs.aws.amazon.com/cli/latest/reference/s3api/put-object-acl.html, to set specific ACLs on any storage bucket/container?
I'll try that later on this week or the next.
However, please bear in mind that it's not only about deploying and configuring MinIO; it will also be another service for us to maintain. Therefore, I would leave this as a last resort, and use the object storage at CESNET that is already managed.
:+1: about this, handling our own object store would certainly be some work. And we'll also probably run into performance concerns.
So what you are saying here is that once we've set up our AWS S3 credentials, we can use aws s3 commands, following https://docs.aws.amazon.com/cli/latest/reference/s3api/put-object-acl.html, to set specific ACLs on any storage bucket/container?
I have only tested the --acl private option. Since it is OpenStack Swift underneath, I am not sure whether all the AWS S3 options will be supported. Please test and let us know.
Could you just clarify a bit how you see the storage permissions using S3 interface after #23, so with containers/buckets created in another Openstack project?
following https://github.com/pangeo-data/pangeo-eosc/issues/39#issuecomment-1277671176
What shall we tell students to do so that one student does not delete another student's data?
I'll add all students as members of vo.pangeo.eu in aai.eu, so that they can read/write in the private bucket that I'll create for each working group.
But if I understood correctly, unlike in HPC centres, if one user creates a Zarr file, other users can delete it by mistake?
Until we find a solution, I'll tell them to 'check the path' so that they do not touch other people's files, but it would be nice if we could find a better solution. I wonder how the Pangeo US cloud deals with this...
Hi,
The problem is with the translation of the federated identity from Check-In into the local identity at CESNET. This issue is very specific to the federated AAI infrastructure that we are using for this deployment. If other deployments use other authentication/authorization methods, they won't have the same issue.
Indeed, the recommendation until the issue is solved is to be careful with the path. As long as everybody writes on their own bucket/path, everything should be fine. Maybe they can use their own user ID as a prefix? Hopefully that's unique to everyone.
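A minimal Python sketch of the "user ID as a prefix" convention could look like this. The helper names user_key and owns_key and the example IDs are hypothetical, and this is only a client-side safeguard against mistakes: it does not enforce anything on the object store itself.

```python
import posixpath


def user_key(user_id: str, relative_path: str) -> str:
    """Return an object key placed under the user's own prefix."""
    key = posixpath.normpath(posixpath.join(user_id, relative_path))
    # Reject paths such as "../other-user/data" that would leave the prefix.
    if not key.startswith(user_id + "/"):
        raise ValueError(f"{relative_path!r} escapes the prefix of {user_id!r}")
    return key


def owns_key(user_id: str, key: str) -> bool:
    """Check, before deleting or overwriting, that a key sits under the user's prefix."""
    return posixpath.normpath(key).startswith(user_id + "/")


print(user_key("alice", "results/sst.zarr"))
print(owns_key("alice", "bob/results/sst.zarr"))
```

Students could call owns_key before any delete or overwrite, so that a mistyped path fails locally instead of removing someone else's Zarr store.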
Apologies, CESNET has been looking into the issue, but it's not an easy one to solve.
I believe this has been fixed with MinIO. Do you want to test or should we directly close this?
Thank you @sebastian-luna-valero. Yes, I would like to test it to understand the procedure. Which documentation should I follow? Thank you for your help.
Hi @tinaok
This is the starting point: https://github.com/pangeo-data/pangeo-eosc/blob/main/users/users-getting-started.md#access-minio
Please give it a go and let us know how it goes.
Best regards, Sebastian
Thank you @sebastian-luna-valero. I couldn't create a bucket, maybe because I'm not connected as an administrator?
Tina
Could you try following these steps?
https://github.com/pangeo-data/pangeo-eosc/blob/main/users/how-to/TestMinIO.ipynb
I think we should link the example from the getting started guide: https://github.com/pangeo-data/pangeo-eosc/pull/56
Question by @tinaok:
@tinaok could you clarify your need a bit?