kubernetes / kubernetes

Production-Grade Container Scheduling and Management
https://kubernetes.io
Apache License 2.0
110.19k stars 39.43k forks

Storage plugins (was PreStart pod hook required) #5388

Closed cnaize closed 9 years ago

cnaize commented 9 years ago

Hello! We want to mount pod volumes on a remote filesystem, and we need to mount that filesystem on the node before a pod is started. There are postStart and preStop hooks, but we need preStart.

satnam6502 commented 9 years ago

@davidopp ? Are you the right person to answer this?

dchen1107 commented 9 years ago

cc/ @vishh

satnam6502 commented 9 years ago

Please re-assign reviewer as appropriate. Thank you.

vmarmol commented 9 years ago

Hi @cnaize, we've talked about having a preStart hook for pods. This would run before your pod starts, but within the container it is about to start. Would this work for your use case? My only concern is that you may not have the privileges needed to mount the filesystem; that would have to be done in the Kubelet through the volume drivers.

(re-assigning to me)

vishh commented 9 years ago

@vmarmol: We support PostStart and PreStop currently. From what I understand, @cnaize's use case is to run a command on the node before a pod starts, which translates to adding a PreStart hook alongside the existing ones.

vmarmol commented 9 years ago

@vishh yes, the PreStart hook would run in the container, which may not work today for mounting filesystems. The right way to do this seems to be through volumes on the pod, similar to how we handle PD today.

bliss commented 9 years ago

Exactly. We need to mount a pod's shareable volume on a remote filesystem, and this hook is required to learn which node to mount that (e.g. rbd) device on.

vmarmol commented 9 years ago

The pre-start part, I think, is covered by #140. For the storage part: which filesystem are you looking to mount?

eparis commented 9 years ago

So we shouldn't be using a PreStart hook for filesystems. We should be designing it into #4055 ....

bliss commented 9 years ago

Basically, we need to keep a pod's data on a remote filesystem and give the user access to their data even if the pod (or the node entirely) is gone. A remote filesystem is the most obvious way to achieve this.

dchen1107 commented 9 years ago

cc/ @thockin

We talked about supporting a new volume source type, but this one is definitely not on our v1 roadmap. Do we plan to include it? I would say no. If we come up with a design for supporting add-on volume source types, there is no need for a PreStart hook, especially a new one for Pod.

vmarmol commented 9 years ago

@bliss if the data is read-only, you can have a data-only volume. If it is read/write, then today, if you're on GCE, we have PD. Otherwise, adding support for more volume types is still in the works (as @dchen1107 mentioned).

Thanks @eparis! I tried looking for an issue dealing with new volume types but couldn't find any.

bliss commented 9 years ago

@vmarmol Is there any possibility of using volume containers with Kubernetes? As far as I know, 'volumes-from' is not supported by Kubernetes. In any case, when a node is gone, a data-volume container is gone too. Basically, all we want is to know beforehand which node a pod is assigned to, and to have a little time to prepare a mountpoint for it.

vmarmol commented 9 years ago

@bliss yeah, unfortunately the support we have right now is PD and the git volume. You can alternatively fetch data over remote HTTP, but that's not great.

You could also use a hack where you specify an emptyDir and share it between two containers: the running one and a data one. Have the data one make its data available to the other container through the emptyDir (copy, maybe symlink). But hacky...
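A file-level sketch of that hack in Python (purely illustrative: plain directories stand in for the emptyDir volume and for the two containers, and the file names are made up):

```python
import pathlib
import shutil
import tempfile

shared = pathlib.Path(tempfile.mkdtemp())    # stands in for the shared emptyDir volume

# "Data" container: ships a payload and copies it into the shared directory.
data_src = pathlib.Path(tempfile.mkdtemp())  # stands in for the data image's contents
(data_src / "config.txt").write_text("hello from the data container\n")
for f in data_src.iterdir():
    shutil.copy(f, shared / f.name)          # the "copy, maybe symlink" step

# "App" container: sees the payload through the shared volume.
print((shared / "config.txt").read_text().strip())  # prints "hello from the data container"
```

In a real pod both sides would simply mount the same emptyDir volume, so no code is involved; the sketch only shows the data flow.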

bliss commented 9 years ago

@vmarmol thank you for your replies! But unfortunately, our case requires access to pod data even when all nodes are dead, and that is achievable only by placing pod volumes on a remote filesystem. So the question is: how can we know, before a pod starts, which node it is assigned to? :)

vmarmol commented 9 years ago

You can probably query the API server. Once the scheduler makes a decision, it will be reflected in the pod's host field, even though the pod will not yet be in a running state.

I think what you're looking for is expanded volume support though.
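A rough sketch of that polling approach (hedged: `get_pod` is a hypothetical stub standing in for an API-server GET, and `spec.nodeName` is how the binding appears in later API versions; the exact field depends on your API version):

```python
import time

def get_pod(name):
    """Stand-in stub for an HTTP GET of the pod object from the API server."""
    # Hypothetical behavior: the scheduler binds the pod on the third poll.
    get_pod.calls = getattr(get_pod, "calls", 0) + 1
    node = "node-1" if get_pod.calls >= 3 else None
    return {"metadata": {"name": name},
            "spec": {"nodeName": node},   # set once the scheduler binds the pod
            "status": {"phase": "Pending"}}

def wait_for_node(name, timeout=5.0, interval=0.01):
    """Poll until the pod is bound to a node; return the node name."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        node = get_pod(name)["spec"].get("nodeName")
        if node:
            return node
        time.sleep(interval)
    raise TimeoutError(f"pod {name} was not scheduled within {timeout}s")

print(wait_for_node("my-pod"))  # prints "node-1"
```

The window between observing the binding and the pod actually starting is what makes this racy; a proper volume type closes that window by doing the mount in the kubelet's pod-startup path.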

vishh commented 9 years ago

Watching the API server is racy. Having a volume type that handles your use case is the way to go.


bgrant0607 commented 9 years ago

Volume plugin work is ongoing (e.g., #5167). Take a look at: https://github.com/GoogleCloudPlatform/kubernetes/blob/master/pkg/kubelet/volume/plugins.go

PRs to push this along would be welcome.

You could also run powerstrip.

thockin commented 9 years ago

Volume plugins provide the abstraction you want, and they are very easy to implement. The hard part is the API, which is not really plugin-enabled yet.

We don't really want to support every random technology in the core of Kubernetes, but maybe you can tell me more about your flavor of rbd? How is it mounted and addressed? If it is really generic, we could consider supporting it in the core.

Tim

(Replying by email to "Nikita", who had asked: "@bgrant0607 Can you provide some details about plugins? Main design and how to use them.")
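To make the volume-plugin idea concrete, here is a purely hypothetical Python sketch of what such a plugin contract looks like; the real interface lives in pkg/kubelet/volume/plugins.go, is written in Go, and uses different names:

```python
import abc
import pathlib
import tempfile

class VolumePlugin(abc.ABC):
    """Hypothetical plugin contract: the kubelet asks the plugin to attach
    the backing storage and mount it at a pod-local path, then to undo it."""

    @abc.abstractmethod
    def set_up(self, pod_dir: pathlib.Path) -> pathlib.Path:
        """Attach/mount the volume under pod_dir; return the mount path."""

    @abc.abstractmethod
    def tear_down(self, mount_path: pathlib.Path) -> None:
        """Unmount and detach the volume."""

class EmptyDirPlugin(VolumePlugin):
    """Trivial example plugin: a scratch directory, like the emptyDir volume."""

    def set_up(self, pod_dir: pathlib.Path) -> pathlib.Path:
        path = pod_dir / "empty-dir"
        path.mkdir(parents=True, exist_ok=True)
        return path

    def tear_down(self, mount_path: pathlib.Path) -> None:
        mount_path.rmdir()

pod_dir = pathlib.Path(tempfile.mkdtemp())
plugin = EmptyDirPlugin()
mount = plugin.set_up(pod_dir)
print(mount.is_dir())  # prints "True"
plugin.tear_down(mount)
```

An rbd plugin would implement the same two operations, with set_up mapping the image and mounting the resulting block device.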

markturansky commented 9 years ago

I believe Persistent Volumes (#4055) solve much of what is being asked, except for the requirement of "having access to pod data even when all nodes are dead."

Manual workarounds exist for that requirement. The volume is a pet, so visiting with it is possible. It breaks the abstraction between the administrator and user, but that is moot if they are the same person.

Accessing data in a PV through Kubernetes is not possible without a running pod that has mounted that volume.

cnaize commented 9 years ago

@thockin In the future it would be very good to have a plugin-enabled API, so we could just write plugins for the different types of storage. But rbd support right now would be great. http://ceph.com/docs/master/man/8/rbd/
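For reference, attaching an rbd image on a node boils down to an `rbd map` followed by a `mount`. This sketch only builds the command lines (the pool, image, and mountpoint are placeholders, and `/dev/rbd0` is an assumption: the real device path is printed by `rbd map`):

```python
def rbd_mount_plan(pool, image, mountpoint, device="/dev/rbd0"):
    """Return the shell commands (as argv lists) to map an rbd image and
    mount it. `device` is an assumption; `rbd map` prints the real path."""
    return [
        ["rbd", "map", f"{pool}/{image}"],  # attach the image as a block device
        ["mount", device, mountpoint],      # mount it where the pod expects it
    ]

for cmd in rbd_mount_plan("rbd", "pod-data", "/mnt/pod-data"):
    print(" ".join(cmd))
# prints:
# rbd map rbd/pod-data
# mount /dev/rbd0 /mnt/pod-data
```

This is exactly the kind of node-side step a PreStart hook was being asked for, and what a proper rbd volume plugin would do inside the kubelet instead.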

cnaize commented 9 years ago

If I develop this, will you merge my pull request?

thockin commented 9 years ago

Maybe. I have spoken with Ceph people before, and I think it is pretty desirable. That said, I have focused on file storage, not block; I will read the docs on it. On a plane now.

Tim

thockin commented 9 years ago

I think Ceph is an important enough system that we should be willing to support it. Now, ideally we could support it without it being linked in, and when we get there we can consider that. But if someone wrote a volume driver for Ceph, it should integrate very easily with Mark's work (which is P0 for me now that I am back :)).


goltermann commented 9 years ago

We're going through old support issues and asking everyone to direct questions to Stack Overflow.

We are trying to consolidate the channels to which questions for help and support are posted, so that we can improve our efficiency in responding to your requests and make it easier for you to find answers to frequently asked questions and common use cases.

We regularly see messages posted in multiple forums, with the full response thread in only one place or, worse, spread across several forums. The large volume of support issues on GitHub is also making it difficult for us to use issues to identify real bugs.

The Kubernetes team scans Stack Overflow on a regular basis and will try to ensure your questions don't go unanswered.