To start VS Code, use `post-start` rather than overriding the entrypoint

l0rd commented 1 year ago

Is your enhancement related to a problem? Please describe

VS Code che-code-runtime-description component contributes the container command to the tooling container:

(...)
  - schemaVersion: 2.1.0
    metadata:
      name: che-incubator/che-code/insiders
    (...)
    components:
      - name: che-code-runtime-description
        attributes:
          controller.devfile.io/container-contribution: true 
        container:
          (...)
          command:
            - /checode/entrypoint-volume.sh

As a result, the script /checode/entrypoint-volume.sh is executed at workspace startup, but the tooling container entrypoint (for example UDI's) is not.

This is problematic as the tooling container entrypoint may be used to initialise development tools and runtimes.

Describe the solution you'd like

We should use a post-start event rather than the command to start VS Code.

That should be tested though and may require some tweaks to the VS Code startup script (or not).

This is currently VS Code specific but will JetBrains and Theia may be affected too if this feature is implemented in the meantime.

l0rd commented 1 year ago

@RomanNikitenko I have been testing:

1. With this editor definition that uses args instead of command (link to start it on dogfooding instance) and, after I set secure: false and protocol: https in the main endpoint, I was able to start the workspace

2. With this editor definition that uses a poststart event (link to start it on dogfooding instance) but it's failing with message

Error processing devfile: failed to process postStart event inject-and-start-vscode: container component with name che-code-runtime-description not found

and I noticed that the DWT doesn't have the inject-and-start-vscode component indeed (the dashboard filters it out)

RomanNikitenko commented 1 year ago

thank you, Mario, for sharing your experience related to the problem!

Unfortunately using args instead of command doesn't work for me. Please see the video.

https://user-images.githubusercontent.com/5676062/215450605-76090dcd-19e3-4505-80cc-8f577f9d53e4.mp4

RomanNikitenko commented 1 year ago

About the second option.

Yes, before our meeting I tried to set the command for the che-code-runtime-description component and faced with the same error.

Then I went forward and tried to set the command for the tools container (just for testing). That time there was no the error container component with name che-code-runtime-description not found, but, as I mentioned on our meeting, the workspace hung on the Waiting for workspace to start step. After the meeting I noticed that it was because of

I continue to investigate the problem...

RomanNikitenko commented 1 year ago

Update for https://github.com/eclipse/che/issues/21879#issuecomment-1408392327

It looks like the entrypoint is not executed when args is used in the yaml file. For such use case the workspace has started successfully when I manually run the entrypoint from the terminal:

https://user-images.githubusercontent.com/5676062/215497802-0acc140b-44e7-49ea-82f5-cbee17040cc5.mp4

RomanNikitenko commented 1 year ago

depends on https://github.com/eclipse/che/issues/21971

RomanNikitenko commented 1 year ago

Today I've tested on the dogfooding instance creation of the workspace using Create DevWorkspace form of the openshift console.

the content of my DevWorkspace is placed here
it uses a sample with merge-contribution attribute
there is corresponding command bound to the postStart event
container-contribution attribute is defined for the che-code-runtime-description component

The result is: the error Error processing devfile: failed to process postStart event init-che-code-command: container component with name che-code-runtime-description not found

So, I guess, some changes are required on the devworkspace-operator side to have possibility to define a command for the che-code-runtime-description component and use it for the postStart event.

amisevsk commented 1 year ago

DWO issue https://github.com/devfile/devworkspace-operator/issues/1029 has an impact on this. I've opened a PR to fix this in upstream DWO: https://github.com/devfile/devworkspace-operator/pull/1033

RomanNikitenko commented 1 year ago

I tested an ability to use a post-start event for starting VS Code editor after merging the corresponding changes to the devworkspace-operator (see https://github.com/eclipse/che/issues/21879#issuecomment-1416308513).

The result is:

the workspace creation hangs on the Starting phase, the message is: Waiting for workspace deployment
the tooling container is in Waiting state
then it fails with the error: devworkspace failed to progress past phase 'Starting' for longer than timeout
the corresponding pod hangs for hours in the Terminating state
the next attempt to create the workspace - leads to Shared PVC is in terminating state error

@amisevsk helped me to find the cause of the problem:

Kubernetes won't mark containers as running until the post-start hook terminates
the PVC deleting issue is related to https://github.com/kubernetes/kubernetes/issues/30042 (I had to exec to the corresponding pod and kill che-code processes manually)

So:

the entrypoint should be run in the background if we run it as the post-start event
there is a downside - with the post-start event logs are only included in pod logs while the command is running

I'm working on it ^

@amisevsk thanks a lot for your investigation/help!

amisevsk commented 1 year ago

there is a downside - with the post-start event logs are only included in pod logs while the command is running

In addition -- I found that I had to redirect the entrypoint-volume.sh logs to a file (and not use stdout), as closing stdout on completion of the script caused the entrypoint-volume.sh script to fail prematurely. We can maybe work around this somehow.

cgruver commented 1 year ago

FWIW, this issue is also preventing users from using an alternative shell since the entrypoint-volume.sh script is also creating the user's entries in /etc/passwd and /etc/group

A user can change the shell to, for example, zsh, after the workspace starts, but it's an added step every time the workspace is started.

amisevsk commented 1 year ago

Historically, it was (maybe still is) necessary to create entries in /etc/passwd and /etc/group, as OpenShift clusters run pods with semi-random UIDs. This can result in various things/tools breaking -- see e.g. this old issue

If I remember right when I configured a web-terminal-tooling image to use zsh, it was possible to set the shell by exporting the SHELL env var in the dockerfile, but it has been a while so I could be wrong here.

l0rd commented 1 year ago

If I remember right when I configured a web-terminal-tooling image to use zsh, it was possible to set the shell by exporting the SHELL env var in the dockerfile, but it has been a while so I could be wrong here.

This would be great. The last time I looked I found that cri-o is patching automatically /etc/passwd and /etc/group ( honoring the image User and the env variable HOME) but the shell is always set to /sbin/nologin. Forcing us to do the patch anyway.

RomanNikitenko commented 1 year ago

Update:

I prepared a UDI image with a custom entrypoint
tested creation of workspace for the VS Code editor without Dashboard usage
I can confirm that the UDI entrypoint is not overridden anymore(and it's executed) if I use the following changes for the description of the editor.

At the same time there is still a problem when I create a workspace from the dashboard. So - I guess:

logic on the DWO side works correctly
we still need some changes on the dashboard side (or required changes for the dashboard didn't come to the dogfooding instance yet...)

I'm going to test similar changes for the JetBrains editor.

RomanNikitenko commented 1 year ago

About changes on the dashboard side (please see my previous comment) - I think resolving this issue should fix the problem.

amisevsk commented 1 year ago

Looking into using alternative shells, che-code, (current latest version) respects the $SHELL environment variable, even with existing /etc/passwd functionality. I've built the following images that are useful for testing:

quay.io/amisevsk/web-terminal-tooling:zsh: Improved WTO tooling image, with zsh installed (other shells are not available in default repos, so just bash and zsh for now)
quay.io/amisevsk/che-code:test: Version of che-code init container that does not modify /etc/passwd on start to add an entry for the current user (in my testing, this does not result in an entry being added by cri-o, but that's maybe just the dogfooding cluster)

To test setting SHELL, you can use this DevWorkspace. It should be applied directly to the cluster and assumes Che/Dev Spaces is installed (i.e. not through the dashboard):

kubectl apply -f https://gist.githubusercontent.com/amisevsk/986b628d26a61427fd2f679ddd414ee1/raw/8ee302f8ee4cece8787809e82229b289257c4521/shell-test.devworkspace.yaml

To summarize, as far as I can tell: If $SHELL env var is set in "dev" container, value of $SHELL is used for in-editor terminal. Otherwise, login shell from /etc/passwd is used. If neither are set, fall back to sh.

In addition, $SHELL is ignored by both the OpenShift console terminal tab (in pod details) and the Web Terminal.

l0rd commented 1 year ago

Thank you @amisevsk this is great news. We should get rid of the patches to the UDI /etc/passwd now! I will create a separate issue.

amisevsk commented 1 year ago

When I tested it against OpenShift 4.11/4.12, I didn't get the cri-o injected entry in /etc/passwd, so before removing the functionality entirely, it may be worth verifying that we don't run into the other class of issues with not having the entry (configured $HOME directory in containers that do not set $HOME env var, issues with not having a whoami entry, etc.).

However, if UDI images installed alternate shells (e.g. zsh), we could enable switching shells by just setting an environment variable in the devfile, which would be a nice bonus. I'll create an issue to follow up here, as we're getting off-topic for this issue.

Edit: Created issues to move discussion out of this thread:

RomanNikitenko commented 1 year ago

I've faced with a new problem that the plugin-registry is ignoring post-start event for an editor, please see details here. I'm investigating that problem, will update my PR ASAP...

The issue: https://github.com/eclipse/che/issues/22071

RomanNikitenko commented 1 year ago

PRs:

are merged!

azatsarynnyy commented 1 year ago

I noticed one small side-effect of adding the command of exec type in the editor definition:

It's listed in the Tasks of the devfile type.

The best option for hiding it - filter in Che-Code by the well-known ID init-che-code-command.

@RomanNikitenko WDYT about ^^?

I used the test project provided in the PR https://github.com/eclipse-che/che-plugin-registry/pull/1566.

RomanNikitenko commented 1 year ago

@azatsarynnyy thank you for reporting the problem!

About filtering by ID - I don't like hardcoding in general:

filtering by init-che-code-command will work for our yaml files
but we never know what ID user defines in the .che/che-editor.yaml file
even within our devfiles/yaml files there were differences, like: all editors had a command with init-container-command ID, but one of them had a command with init-remote-runtime-injector ID: https://github.com/eclipse-che/che-plugin-registry/pull/1613/commits/619e7e3412415dee6feebf10093d046a2bb61b28

From architecture point of view, I would say, we should have some mechanism/approach how to filter out devfile commands which are not expected to be converted to VS Code tasks.

In this case the command was defined in the .che/che-editor.yaml file, not in the devfile.yaml of the project. My question is: is it expected to convert commands from .che/che-editor.yaml file to VS Code tasks?

azatsarynnyy commented 1 year ago

@RomanNikitenko Theoretically, the user may want to provide an exec command with its custom VS Code editor Devfile. However, I have never seen that in practice. Moreover, I don't see any value in such an approach. So, I'd hide all such editor-specific exec commands.

We can filter them out by the controller.devfile.io/imported-by: editor attribute. See my suggestion in the PR https://github.com/che-incubator/che-code/pull/192. Once the bot builds the images, we can test it on our dogfooding cluster where the plugin registry already contains your latest changes.

azatsarynnyy commented 1 year ago

The editor-contributed Devfile Commands are filtered out now - https://github.com/che-incubator/che-code/pull/192

amisevsk commented 1 year ago

I'm concerned the approach to filtering editor-contributed commands is potentially fragile, as the editor name comes from the devfile registry instead of some other source (as far as I can tell, at least). I left a comment on the PR here

azatsarynnyy commented 1 year ago

Thanks @amisevsk! I implemented your suggestion in https://github.com/che-incubator/che-code/pull/194. Now, it doesn't depend on the controller.devfile.io/imported-by attribute's value.

devstudio-release commented 1 year ago

sync'd to Red Hat JIRA https://issues.redhat.com/browse/CRW-4158

https://github.com/eclipse-che/che-plugin-registry/pull/1566

eclipse-che / che

To start VS Code, use `post-start` rather than overriding the entrypoint #21879

Is your enhancement related to a problem? Please describe

Describe the solution you'd like