GoogleContainerTools / skaffold

Easy and Repeatable Kubernetes Development
https://skaffold.dev/
Apache License 2.0

Two-way sync #2492

Open balopat opened 5 years ago

balopat commented 5 years ago

Syncing in Skaffold to the container works well. However, applications sometimes change files on the filesystem themselves. If this happens inside the container, it is hard to get that change quickly during development.

There might be two types of file changes:
1) inside the code repository, e.g. under the project files in an IDE
2) outside of the code repo, e.g. in /var/log/, or somewhere else

Mounting directories is tricky and slow, and it can also collide with Skaffold's one-directional file sync. Maybe Skaffold should be able to help with syncing back from the container in dev mode.

I would like to collect feedback from users who face this and see what use cases are there.

wstrange commented 5 years ago

Our use case: We have a complex application with 100+ JSON configuration files. These files are updated by the application UI running in the container. To save them in git we need to get them back out to the developer's environment.

We have tried using minikube volume mounts for this use case. It works, but has some drawbacks: it adds complexity (we want to keep things simple for the developer), it doesn't work on non-minikube environments, and you can get into a sync loop race with Skaffold as files are updated.

Today we use kubectl cp to pull the files out of the container. It is low tech - but is working well enough. A little bit of chrome / ease of use around this workflow would be nice.

antl3x commented 5 years ago

> Our use case: We have a complex application with 100+ JSON configuration files. These files are updated by the application UI running in the container. To save them in git we need to get them back out to the developer's environment.
>
> We have tried using minikube volume mounts for this use case. It works, but has some drawbacks: it adds complexity (we want to keep things simple for the developer), it doesn't work on non-minikube environments, and you can get into a sync loop race with Skaffold as files are updated.
>
> Today we use kubectl cp to pull the files out of the container. It is low tech - but is working well enough. A little bit of chrome / ease of use around this workflow would be nice.

Great to hear from you. We are facing the same challenge. The problem with the kubectl cp is that we need to give the pod name, which is dynamic.

So kubectl cp works only manually.

wstrange commented 5 years ago

> The problem with the kubectl cp is that we need to give the pod name, which is dynamic.

We use selector labels on kubectl cp (-l app=myapp)- so the shell script is reasonably generic.

strikeout commented 5 years ago

Can we take some inspiration from how DevSpace solved bi-directional host<>pod sync? They inject a small binary into the container, which in turn scans and passes a list of files not present on the host.

https://github.com/devspace-cloud/devspace/tree/master/sync/inject

DevSpace establishes a bi-directional code synchronization between the specified local folders and the remote container folders. It automatically recognizes any changes within the specified folders during the session and will update the corresponding files locally and remotely in the background. It uses a small helper binary that is injected into the target container to accomplish this. The algorithm roughly works like this:

1) Inject a small helper binary via kubectl cp into the target container
2) Download all files that are not found locally
3) Upload all files that are not found remotely
4) Watches locally and remotely for changes and uploads or downloads them
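Editor's sketch: the initial "download what is missing locally, upload what is missing remotely" steps can be illustrated roughly as below. This is not DevSpace's implementation. It assumes two plain directories stand in for the host folder and the container folder, and it leaves out the helper-binary injection, the change watching, and any conflict handling for files that exist on both sides. The function names (list_files, copy_missing, initial_sync) are hypothetical.

```shell
# Sketch only: two local directories stand in for "host" and "container".

# List regular files under a directory as sorted paths relative to it.
list_files() {
  (cd "$1" && find . -type f | sort)
}

# Copy every file that exists under src but not under dst,
# preserving the relative path. Files present on both sides are left alone.
copy_missing() {
  local src="$1" dst="$2" rel
  list_files "$src" | while IFS= read -r rel; do
    if [ ! -e "$dst/$rel" ]; then
      mkdir -p "$dst/$(dirname "$rel")"
      cp "$src/$rel" "$dst/$rel"
    fi
  done
}

# Initial reconciliation in the order described above:
# first download what is missing locally, then upload what is missing remotely.
initial_sync() {
  local local_dir="$1" remote_dir="$2"
  copy_missing "$remote_dir" "$local_dir"   # "download"
  copy_missing "$local_dir" "$remote_dir"   # "upload"
}
```

For example, `initial_sync ./src /path/to/container/copy` would leave both directories with the union of their files. A real tool would also need to decide which side wins when a file differs on both ends, which is where the sync loop races mentioned earlier in the thread come from.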

dsebastien commented 4 years ago

I don't know if my use case is valid as I'm just starting to use K8S, but I currently have a multi-stage docker build for two micro-services which need to be deployed to K8S.

I'm using Minikube and plan to deploy the "dev" stage of my dockerfiles to K8S. That "dev" stage's goal is to provide me with an environment in which I have all the necessary dependencies and tools to build (node, npm, etc). As such my goal is to develop, build and run from within K8S.

This setup was working fine with docker + docker-compose, where I could bind-mount my local source code into the containers and have it sync both ways.

At the moment I'm looking at solutions for this with K8S, but have found quite a lot of different things so far.. :)

Skaffold looks really appealing for multiple reasons, but at the moment I'm thinking about combining skaffold and devspace to reach my goals. If Skaffold could do it all, then all the better :)

paolomainardi commented 4 years ago

Maybe I am missing something, but shouldn't it be easy to just use a hostPath volume, as Docker does for local development?

maluio commented 4 years ago

@paolomainardi Mounting the volume using a hostpath works. However, I run into a lot of problems due to user id mapping between the user in the container and the user on the host system. There is a way to handle this mapping properly, but it is IMHO non-trivial to set up https://docs.docker.com/engine/security/userns-remap/

In case I am missing something, a nudge in the right direction is highly appreciated.

EnziinSystem commented 4 years ago

Two-way file synchronization is really necessary. In development mode, we often have to generate files such as models, migration files, and controllers with scaffolding tools.

tstromberg commented 4 years ago

Two-way sync can be especially tricky with remote clusters, where reverse mounts are not easily done.

I haven't heard about anyone actively planning on working on this, so I'm going to downgrade the priority. If someone wants to propose a design document, I'd love to take a look.

EnziinSystem commented 4 years ago

@tstromberg

Two-way sync can be enabled locally for development mode, not a remote cluster.

For example:

I am a Ruby on Rails developer and I need to generate a model.

rails generate scaffold Post name:string title:string content:text

Many files are generated in the pod/container, and copying them out manually is not practical.

pierreyves-lebrun commented 4 years ago

Though tricky, two-way sync is essential as people developing in containers simply can’t use Skaffold at the moment.

Solutions such as Okteto or Devspace do have that feature so developers don’t have to pollute their local machine with dev dependencies.

simonoff commented 4 years ago

Guys! This feature is a must-have for full local development inside Kubernetes. I have reviewed the ways to implement it, and only one looks good to me: inject a sidecar container that does the sync, similar to how DevSpace does it. What do you think?

tstromberg commented 4 years ago

@simonoff - Personally speaking, sidecars seem like a very reasonable approach.

MattShirley commented 4 years ago

This feature would also be helpful to me.

My scenario is similar to the other posts. I have a Python/Django application and want to generate database migrations, which requires an active connection to the database.

Similar to other requests, this is only important for local development on minikube and is not needed for remote clusters.

ncri commented 3 years ago

> > The problem with the kubectl cp is that we need to give the pod name, which is dynamic.
>
> We use selector labels on kubectl cp (-l app=myapp)- so the shell script is reasonably generic.

@wstrange How would you do this? Do you have a full example? I don't see the option to use a label in the cp examples (kubectl cp --help)

wstrange commented 3 years ago

> > > The problem with the kubectl cp is that we need to give the pod name, which is dynamic.
>
> > We use selector labels on kubectl cp (-l app=myapp)- so the shell script is reasonably generic.
>
> @wstrange How would you do this? Do you have a full example? I don't see the option to use a label in the cp examples (kubectl cp --help)

Apologies - our shell script gets the pod name first using the label selector (in our case there is only one pod in dev) - then we invoke cp
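Editor's sketch of the two-step approach described above: look up the pod name with a label selector, then pass it to kubectl cp. The actual script was not posted in the thread, so the function names, the label, and the paths here are hypothetical. It assumes the default namespace and that the first matching pod is the one wanted.

```shell
# Sketch only: label selector, remote path, and local path are placeholders.

# Print the name of the first pod matching the given label selector.
first_pod_for_label() {
  local selector="$1"
  kubectl get pods -l "$selector" -o jsonpath='{.items[0].metadata.name}'
}

# Copy remote_path out of that pod into local_path on the host.
pull_from_pod() {
  local selector="$1" remote_path="$2" local_path="$3"
  local pod
  pod="$(first_pod_for_label "$selector")" || return 1
  if [ -z "$pod" ]; then
    echo "no pod matches selector: $selector" >&2
    return 1
  fi
  kubectl cp "${pod}:${remote_path}" "$local_path"
}

# Example with a hypothetical label and paths:
# pull_from_pod app=myapp /app/config ./config
```

Because the pod name is resolved on every call, the script keeps working after Skaffold redeploys and the pod gets a new name.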

ncri commented 3 years ago

@wstrange Ah, got it, thanks.

MarlonGamez commented 3 years ago

@tejal29 I noticed this is assigned to you. Do you think we'll get to this in an upcoming milestone? If not I think we should bump the priority down

Rotendahl commented 3 years ago

+1 for the feature request. It would make for a nicer development experience not having to extract migrations, package manifests and other files generated by code from the container.

tejal29 commented 3 years ago

Sorry folks, reducing the priority for this as we don't have plans to work on this next quarter.

nkubala commented 2 years ago

just want to leave another comment from the team - this FR would clearly be a good addition to skaffold, but at the moment we unfortunately don't have the bandwidth to prioritize this on our end.

if anyone would like to take a shot at designing and building this, please reach out and someone from the team can help provide guidance and design/code review!

mecampbellsoup commented 2 years ago

> @wstrange How would you do this? Do you have a full example? I don't see the option to use a label in the cp examples (kubectl cp --help)

You'd just have to compose shell commands e.g.:

for podname in $(kubectl get pods -l name=myapp -o json | jq -r '.items[].metadata.name'); do
  # Copy /tmp from each matching pod into a local directory named after the pod.
  kubectl cp "${podname}:/tmp" "${podname}"
done

mecampbellsoup commented 2 years ago

@tejal29 any update on timing yet?

gsquared94 commented 2 years ago

@mecampbellsoup we are not able to prioritize this feature immediately. Will update again in a few weeks. In the meantime if anyone from the community wants to work on this, let us know.

mecampbellsoup commented 2 years ago

> @mecampbellsoup we are not able to prioritize this feature immediately. Will update again in a few weeks. In the meantime if anyone from the community wants to work on this, let us know.

Sure - how would you all suggest it be implemented?

ericzzzzzzz commented 2 years ago

Hi @mecampbellsoup Please start with a design doc. However, we're not able to provide guidance on the implementation at the moment.

JoseMiralles commented 2 years ago

This feature made me really like docker-compose. But being able to accomplish this with Skaffold as seamlessly as it is with docker-compose would be a dream. Having to juggle kubernetes yml files alongside docker-compose files feels really hacky.

aaron-prindle commented 1 year ago

This has recently been put on the Skaffold team's roadmap to be addressed in the first half of 2023. @ericzzzzzzz is currently investigating the requirements and potential solutions. If anyone in the thread has ideas for requirements or potential solutions, please post your insights/concerns here.

strikeout commented 1 year ago

> This has recently been put on the Skaffold team's roadmap to be addressed in the first half of 2023. @ericzzzzzzz is currently investigating the requirements and potential solutions. If anyone in the thread has ideas for requirements or potential solutions, please post your insights/concerns here.

Looking at how a competitor solved this: (re-quoting my post in this thread)

> DevSpace establishes a bi-directional code synchronization between the specified local folders and the remote container folders. It automatically recognizes any changes within the specified folders during the session and will update the corresponding files locally and remotely in the background. It uses a small helper binary that is injected into the target container to accomplish this. The algorithm roughly works like this:
>
> 1) Inject a small helper binary via kubectl cp into the target container
> 2) Download all files that are not found locally
> 3) Upload all files that are not found remotely
> 4) Watches locally and remotely for changes and uploads or downloads them

pierreyves-lebrun commented 1 year ago

> > This has recently been put on the Skaffold team's roadmap to be addressed in the first half of 2023. @ericzzzzzzz is currently investigating the requirements and potential solutions. If anyone in the thread has ideas for requirements or potential solutions, please post your insights/concerns here.
>
> Looking at how a competitor solved this: (re-quoting my post in this thread)
>
> > DevSpace establishes a bi-directional code synchronization between the specified local folders and the remote container folders. It automatically recognizes any changes within the specified folders during the session and will update the corresponding files locally and remotely in the background. It uses a small helper binary that is injected into the target container to accomplish this. The algorithm roughly works like this:
> >
> > 1) Inject a small helper binary via kubectl cp into the target container
> > 2) Download all files that are not found locally
> > 3) Upload all files that are not found remotely
> > 4) Watches locally and remotely for changes and uploads or downloads them

Also worth mentioning Okteto: https://www.okteto.com/docs/reference/file-synchronization/

renzodavid9 commented 1 year ago

Update about this issue: we have had to deprioritize the work on this feature for now, since it implies a good amount of changes. We are aiming to resume work on this as soon as we have more cycles available (hopefully at some point in H2 2023). Thanks everyone for your input.

bhack commented 1 year ago

@renzodavid9 I made a recap here on why it is hard to develop on Vscode + Kubernetes pod also with Skaffold at https://github.com/devcontainers/spec/issues/10#issuecomment-1559239926

What do you think?

samjurriaans commented 1 year ago

Is there any chance that this feature will be worked on in H2 2023 by the core team?

Cannot wait to have this feature in skaffold!

Are there any other Laravel developers using kubernetes and skaffold for dev environments who deal with this issue?

euven commented 1 year ago

Any update here? This is the only feature preventing us from using skaffold...

shadiramadan commented 10 months ago

I’m surprised this isn’t a higher priority issue.

shadiramadan commented 10 months ago

@samjurriaans this is an issue for any workflow that has configuration data generated by some process CLI/UI that runs on the container.

That data is necessary to sync in multi env environments so it is not wiped when restarting skaffold…

ericzzzzzzz commented 7 months ago

Hi friends, I created a PoC for downstream sync: https://github.com/ericzzzzzzz/skaffold/tree/downstream-sync . We'd like some feedback to formalize the design, so if you can help by trying it out, that would be great! Thanks!

To use this, you need to clone https://github.com/ericzzzzzzz/skaffold/tree/downstream-sync and build from source (you may want to back up your existing skaffold binary first :) )

shadiramadan commented 5 months ago

Hello @ericzzzzzzz I will test this out! Beyond testing what does the path look like for getting this feature in?

shadiramadan commented 5 months ago

I have developers that get confused by having to remember copying changes back to their filesystem due to this issue so I'm hoping to improve DX through this change!

ericzzzzzzz commented 5 months ago

Hi @shadiramadan , Our team has placed the Skaffold project in a maintenance state (KTLO). This means we won't be actively adding new features ourselves. However, we wholeheartedly welcome community contributions in the form of pull requests. To facilitate this, I've completed a design document for this downstream file-sync feature. I'll share this document next week and create issues outlining the feature's MVP, making it easier for the community to get involved

shadiramadan commented 5 months ago

@ericzzzzzzz would Google consider donating the project to the CNCF if that is the case? I'm not sure what that process is like but it could give the project the boost it deserves. Skaffold has come a long way and there are definitely some quality of life changes that would make the DX more friendly and increase adoption!

bendory commented 5 months ago

Hello @shadiramadan , I'm the Engineering Manager for the Skaffold team at Google.

We would very much consider donating Skaffold to the CNCF -- especially if doing so means more community involvement from folks like you! Any interest in partnering with us on working through that process?

bhack commented 5 months ago

What are the next steps?

bhack commented 5 months ago

I suppose you need to open a ticket at https://github.com/cncf/sandbox right?

bhack commented 5 months ago

P.S. Google's Kaniko has recently been trying to do the same: https://github.com/cncf/sandbox/issues/88

bendory commented 5 months ago

@bhack Correct on both counts. We would like to hear from Skaffold users and contributors interested in working on and supporting such an application!

bhack commented 5 months ago

Can you open and pin a new ticket on this?

ericzzzzzzz commented 5 months ago

Here is the draft design proposal for downstream sync, https://github.com/ericzzzzzzz/skaffold/blob/downstream-sync/docs-v2/design_proposals/downstream-sync.md

shadiramadan commented 5 months ago

Thank you @ericzzzzzzz I'll review the design proposal and your existing work. No promises on my own timelines but this has been a TODO on my end for a while!