microsoft / mindaro

Bridge to Kubernetes - for Visual Studio and Visual Studio Code
MIT License
307 stars 106 forks source link

Unhandled exception. System.ArgumentOutOfRangeException: Length cannot be less than zero #198

Closed dsavchenko closed 3 years ago

dsavchenko commented 3 years ago

Describe the bug Not connecting to k8s, failing with mentioned exception

To Reproduce Steps to follow to reproduce this issue. Nothing special just trying to use bridge to k8s

**Expected behavior** It should connect to the service **Logs** [mindaro-vscode-2021-07-07T13-16-55.388Z.txt](https://github.com/microsoft/mindaro/files/6777438/mindaro-vscode-2021-07-07T13-16-55.388Z.txt) **Environment Details** Client used (VS Code/Visual Studio): vscode Client's version: 1.57.1 Operating System: Arch Linux Extension version: mindaro.mindaro@1.0.120210702
richardsondx commented 3 years ago

+1 Having the same issue

Environment Details Client used (VS Code/Visual Studio): vscode

Extension version: v1.0.120210702

Operating System: MacOS BIg Sur 11.2.3 – M1 chip

Unhandled exception. System.ArgumentOutOfRangeException: Length cannot be less than zero. (Parameter 'length')
   at System.String.Substring(Int32 startIndex, Int32 length)
   at Microsoft.BridgeToKubernetes.Common.Kubernetes.KubernetesClient.<>c__DisplayClass73_0.<GetContainerEnvironment>b__0(String output)
   at Microsoft.BridgeToKubernetes.Common.Kubernetes.KubectlImpl.<>c__DisplayClass9_1.<RunShortRunningCommand>b__1(Object sender, DataReceivedEventArgs e)
   at Microsoft.BridgeToKubernetes.Common.IO.ProcessEx.<.ctor>b__1_0(Object o, DataReceivedEventArgs e)
   at System.Diagnostics.Process.OutputReadNotifyUser(String data)
   at System.Diagnostics.AsyncStreamReader.FlushMessageQueue(Boolean rethrowInNewThread)
--- End of stack trace from previous location where exception was thrown ---
   at System.Diagnostics.AsyncStreamReader.<>c.<FlushMessageQueue>b__18_0(Object edi)
   at System.Threading.QueueUserWorkItemCallback.<>c.<.cctor>b__6_0(QueueUserWorkItemCallback quwi)
   at System.Threading.ExecutionContext.RunForThreadPoolUnsafe[TState](ExecutionContext executionContext, Action`1 callback, TState& state)
   at System.Threading.QueueUserWorkItemCallback.Execute()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
Failed to establish a connection. Error: Unhandled exception. System.ArgumentOutOfRangeException: Length cannot be less than zero. (Parameter 'length')
   at System.String.Substring(Int32 startIndex, Int32 length)
   at Microsoft.BridgeToKubernetes.Common.Kubernetes.KubernetesClient.<>c__DisplayClass73_0.<GetContainerEnvironment>b__0(String output)
   at Microsoft.BridgeToKubernetes.Common.Kubernetes.KubectlImpl.<>c__DisplayClass9_1.<RunShortRunningCommand>b__1(Object sender, DataReceivedEventArgs e)
   at Microsoft.BridgeToKubernetes.Common.IO.ProcessEx.<.ctor>b__1_0(Object o, DataReceivedEventArgs e)
   at System.Diagnostics.Process.OutputReadNotifyUser(String data)
   at System.Diagnostics.AsyncStreamReader.FlushMessageQueue(Boolean rethrowInNewThread)
--- End of stack trace from previous location where exception was thrown ---
   at System.Diagnostics.AsyncStreamReader.<>c.<FlushMessageQueue>b__18_0(Object edi)
   at System.Threading.QueueUserWorkItemCallback.<>c.<.cctor>b__6_0(QueueUserWorkItemCallback quwi)
   at System.Threading.ExecutionContext.RunForThreadPoolUnsafe[TState](ExecutionContext executionContext, Action`1 callback, TState& state)
   at System.Threading.QueueUserWorkItemCallback.Execute()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()

Edit: I updated the stacktrace with the full stacktrace.

Also noticed this error from one of the dependencies ( not sure if its related to the issue above)

Incorrect type. Expected "string". 

Complaining about a null value when it's trying to read the pod yaml file.

pod-123.yml

status:
  conditions:
  - lastProbeTime: null
daniv-msft commented 3 years ago

Thank you @dsavchenko and @richardsondx for reporting this issue! This is a bad bug, sorry about that.

Looking at our code, the ArgumentOutOfRangeException is due to us assuming that the environment variables we read from the container will always have a value, and that we could split the string on character "=". It seems that this is not the case for you, and that for at least one of the environment variables in your container, there is no value.

Could you please confirm this by running the command kubectl exec <NAME_OF_YOUR_POD> -c <NAME_OF_YOUR_CONTAINER> -- env, and validate that not all of the returned environment variables contain "="? I'm working on a fix in parallel, and we'll ship it in our next release. If you're willing to try beta bits of our VS Code extension, I can share them with you as well to unblock you quicker.

@richardsondx I believe the second issue you mention is independent. It doesn't seem familiar to me, but I'm adding @pragyamehta who worked on probes support and might have more details to share. @pragyamehta, does this ring a bell for you? Do we expect this field to have a specific value? I couldn't find references to it in our codebase.

richardsondx commented 3 years ago

Thanks for the prompt reply @daniv-msft ! 🙏

And yes I'm willing to test the beta versions with a fix if that can unblock me quicker. That would be great thanks! Please provide the instructions for installing the beta when ready.

daniv-msft commented 3 years ago

Thanks @richardsondx, that's very helpful! From the code, we already handle the case of an environment variable containing multiple "=" characters, so I'm surprised you get this issue with no env variables missing values. One possibility is that if you have some environment variables with a long value (might be the case of the one mentioned above), we don't parse them correctly, resulting in this bug.

I sent a fix to ignore environment variables that we cannot parse, and I expect I'll be able to share beta bits with you in the coming days.

richardsondx commented 3 years ago

I forgot to mention that I see this message in a popup when the error happens: The preLaunchTask 'bridge-to-kubernetes.resource' terminated with exit code 1.

and the error happens after these steps:

Using kubernetes service environment variables: false

Retrieving the current context and credentials...
Validating the credentials to access the cluster...
Validating the requirements to replicate resources locally...
Redirecting traffic from the cluster to your machine...
Unhandled exception. System.ArgumentOutOfRangeException: Length cannot be less than zero. (Parameter 'length')
...
dsavchenko commented 3 years ago

Thanks for the reply @daniv-msft I did changes to the deployment so now I can't replicate the issue. What I can confirm is that there are no suspicious env vars now and bridge connects properly I also tried to manually add empty variable (looks FOO= in the env output) and it still works with it

henposs commented 3 years ago

Hi! I have a similar problem when I use bridge now. Bridging same service did work fine a week ago with same setup.

Error: Unhandled exception. System.ArgumentOutOfRangeException: Length cannot be less than zero. (Parameter 'length') at System.String.Substring(Int32 startIndex, Int32 length) at Microsoft.BridgeToKubernetes.Common.Kubernetes.KubernetesClient.<>cDisplayClass73_0.b0(String output) at Microsoft.BridgeToKubernetes.Common.Kubernetes.KubectlImpl.<>cDisplayClass9_1.b1(Object sender, DataReceivedEventArgs e) at Microsoft.BridgeToKubernetes.Common.IO.ProcessEx.<.ctor>b1_0(Object o, DataReceivedEventArgs e) at System.Diagnostics.Process.OutputReadNotifyUser(String data) at System.Diagnostics.AsyncStreamReader.FlushMessageQueue(Boolean rethrowInNewThread) --- End of stack trace from previous location where exception was thrown --- at System.Diagnostics.AsyncStreamReader.<>c.b__18_0(Object edi) at System.Threading.QueueUserWorkItemCallback.<>c.<.cctor>b6_0(QueueUserWorkItemCallback quwi) at System.Threading.ExecutionContext.RunForThreadPoolUnsafe[TState](ExecutionContext executionContext, Action`1 callback, TState& state) at System.Threading.QueueUserWorkItemCallback.Execute() at System.Threading.ThreadPoolWorkQueue.Dispatch() at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback()


When I running kubectl exec -c -- env I have some variables that looks like this that perhaps cause the problem for me:

GENERAL_CONFIG=production_environment: false default_authprofile: https://... BRIDGE_COLLECT_TELEMETRY=True

GENERAL_CONFIG is deployed in a configmap like this:

general-config.yaml: |-
    production_environment: false
    default_authprofile: https://...

Besides from that I don't have any variables that are empty.

Omkar-Shetkar commented 3 years ago

I see the same issue in my environment based on k3s. I do have one big JSON as an environment variable value. Happy to test the fix. Thanks.

daniv-msft commented 3 years ago

@richardsondx, @dsavchenko, @henposs, @Omkar-Shetkar Thanks everyone for your replies. I have the mitigation fix ready in dev (basically, ignoring environment variables if we can't parse them), and we're going to fast track it to production so that we can unblock you quicker without asking you to test staging bits. We would like to release this today or tomorrow, and we will let you know once we're done.

daniv-msft commented 3 years ago

@richardsondx, @dsavchenko, @henposs, @Omkar-Shetkar We just released a new version of our VS Code extension with the hotfix for this. Could you please give it a try and validate it fixes the issue for you?

richardsondx commented 3 years ago

@daniv-msft I upgraded to the newer versions and I can confirm that the new version fixes the bug. Thank you for such a quick turnaround!

I'm still having a separate issue running the debugger but that's more command related on how I'm starting the rails server.

👍 Thanks!

ajaffie commented 3 years ago

I first noticed this issue today (though I have not used it for several days) and I have the latest version installed:

Bridge to Kubernetes initialization (VS Code v1.58.0 - Extension v1.0.120210708)
Logs: c:\Users\AndrewJ\AppData\Roaming\Code\logs\20210708T165422\exthost5\mindaro.mindaro\mindaro-vscode-2021-07-08T21-40-26.444Z.txt
Unhandled exception. System.ArgumentOutOfRangeException: Length cannot be less than zero. (Parameter 'length')
   at System.String.Substring(Int32 startIndex, Int32 length)
   at Microsoft.BridgeToKubernetes.Common.Kubernetes.KubernetesClient.<>c__DisplayClass73_0.<GetContainerEnvironment>b__0(String output)
   at Microsoft.BridgeToKubernetes.Common.Kubernetes.KubectlImpl.<>c__DisplayClass9_1.<RunShortRunningCommand>b__1(Object sender, DataReceivedEventArgs e)
   at Microsoft.BridgeToKubernetes.Common.IO.ProcessEx.<.ctor>b__1_0(Object o, DataReceivedEventArgs e)
   at System.Diagnostics.Process.OutputReadNotifyUser(String data)
   at System.Diagnostics.AsyncStreamReader.FlushMessageQueue(Boolean rethrowInNewThread)
--- End of stack trace from previous location where exception was thrown ---
   at System.Diagnostics.AsyncStreamReader.<>c.<FlushMessageQueue>b__18_0(Object edi)
   at System.Threading.QueueUserWorkItemCallback.<>c.<.cctor>b__6_0(QueueUserWorkItemCallback quwi)
   at System.Threading.ExecutionContext.RunForThreadPoolUnsafe[TState](ExecutionContext executionContext, Action`1 callback, TState& state)
   at System.Threading.QueueUserWorkItemCallback.Execute()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback()
Failed to establish a connection. Error: Unhandled exception. System.ArgumentOutOfRangeException: Length cannot be less than zero. (Parameter 'length')
   at System.String.Substring(Int32 startIndex, Int32 length)
   at Microsoft.BridgeToKubernetes.Common.Kubernetes.KubernetesClient.<>c__DisplayClass73_0.<GetContainerEnvironment>b__0(String output)
   at Microsoft.BridgeToKubernetes.Common.Kubernetes.KubectlImpl.<>c__DisplayClass9_1.<RunShortRunningCommand>b__1(Object sender, DataReceivedEventArgs e)
   at Microsoft.BridgeToKubernetes.Common.IO.ProcessEx.<.ctor>b__1_0(Object o, DataReceivedEventArgs e)
   at System.Diagnostics.Process.OutputReadNotifyUser(String data)
   at System.Diagnostics.AsyncStreamReader.FlushMessageQueue(Boolean rethrowInNewThread)
--- End of stack trace from previous location where exception was thrown ---
   at System.Diagnostics.AsyncStreamReader.<>c.<FlushMessageQueue>b__18_0(Object edi)
   at System.Threading.QueueUserWorkItemCallback.<>c.<.cctor>b__6_0(QueueUserWorkItemCallback quwi)
   at System.Threading.ExecutionContext.RunForThreadPoolUnsafe[TState](ExecutionContext executionContext, Action`1 callback, TState& state)
   at System.Threading.QueueUserWorkItemCallback.Execute()
   at System.Threading.ThreadPoolWorkQueue.Dispatch()
   at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback()
daniv-msft commented 3 years ago

@richardsondx Thanks for validating the fix worked for you!

@ajaffie It seems that your extension is still using the previous binaries that don't contain my hotfix for this issue. Would it be possible for you to send the logs in c:\Users\AndrewJ\AppData\Roaming\Code\logs\20210708T165422\exthost5\mindaro.mindaro\mindaro-vscode-2021-07-08T21-40-26.444Z.txt? In the logs, we will have the information of what version was assigned to your machine by our ExP A/B service (should be 1.0.20210708.15), and what version you have locally.

ajaffie commented 3 years ago

Here are the first few lines which seem to have the version:

2021-07-08T21:40:26.454Z | Common Extension Root    | TRACE | Call to ExP returned version 'undefined' <json>{}</json>
2021-07-08T21:40:26.454Z | Common Extension Root    | TRACE | Resolved expected CLI version '1.0.20210615.1' <json>{}</json>
2021-07-08T21:40:26.454Z | Common Extension Root    | TRACE | Event: binaries-utility-version-v1 <json>{}</json>
2021-07-08T21:40:26.455Z | Common Extension Root    | TRACE | Trying to initialize the workspace folder Job for Connect <json>{}</json>
2021-07-08T21:40:26.455Z | Common Extension Root    | TRACE | Connect initialization started on Job <json>{}</json>
2021-07-08T21:40:26.455Z | Common Extension Root    | TRACE | Event: connect-initialization-success <json>{"workspacesCommonId":"ad08becc-86db-59c5-cad0-52fdfcdef41d"}</json>
2021-07-08T21:40:26.455Z | Common Extension Root    | TRACE | Making sure that the CLI is present locally, by downloading it if needed <json>{}</json>
2021-07-08T21:40:26.456Z | Common Extension Root    | TRACE | Making sure that the CLI is present locally, by downloading it if needed <json>{}</json>
2021-07-08T21:40:26.457Z | Common Extension Root    | TRACE | Successfully determined the CLI binary path: c:\Users\AndrewJ\AppData\Roaming\Code\User\globalStorage\mindaro.mindaro\file-downloader-downloads\binaries\dsc.exe <json>{}</json>
2021-07-08T21:40:26.458Z | Common Extension Root    | TRACE | Successfully determined the kubectl binary path: c:\Users\AndrewJ\AppData\Roaming\Code\User\globalStorage\mindaro.mindaro\file-downloader-downloads\binaries\kubectl\win\kubectl.exe <json>{}</json>
2021-07-08T21:40:27.582Z | Common Extension Root    | TRACE | Event: cli-client-get-version-success <json>{}</json>
2021-07-08T21:40:27.582Z | Common Extension Root    | TRACE | Found local CLI version: '1.0.20210615.1'. Expected version: '1.0.20210615.1' <json>{}</json>
2021-07-08T21:40:27.582Z | Common Extension Root    | TRACE | Local CLI has version number '1.0.20210615.1', which is equal to the required '1.0.20210615.1' <json>{}</json>
daniv-msft commented 3 years ago

Thanks @ajaffie! The first line is where the issue happens: Call to ExP returned version 'undefined' <json>{}</json>. You are not getting any results when we call our ExP A/B testing service, and so you don't download our newer version with the hotfix. It is possible that restarting all your VS Code instances is enough to retrigger the call and get the right version. I'm working on pushing an updated VS Code version forcing the new version. Thanks for your patience!

daniv-msft commented 3 years ago

@ajaffie We just released the version 1.0.120210709 of our VS Code extension that makes sure the binaries will be downloaded properly. Could you please update your extension and let us know how it looks?

ajaffie commented 3 years ago

@daniv-msft thanks for the help! It still failed to get a version from ExP, but it did download the correct binaries and I can run services again.

daniv-msft commented 3 years ago

Thanks @ajaffie for confirming! Everyone > I'm closing the issue as it should now be fixed for all users. If you can still reproduce it, please don't hesitate to add a comment below.

Omkar-Shetkar commented 3 years ago

With this fix, I am not getting earlier issue of System.ArgumentOutOfRangeException: Length cannot be less than zero.. However, when the tool tries to start the service, it fails to parse the environment variables. When I debugged, it seems an environment variable having big JSON value is not parsed correctly.

JSON environment variable in helm chart:

 - name: ENV1
          value: >
            {
               "key1" : {},
                ...
            }
  - name: ENV2
    value: value1

Command used to launch service:

$  /usr/bin/env GRPC_DNS_RESOLVER=native KUBERNETES_SERVICE_HOST=10.43.0.1 KUBERNETES_PORT_443_TCP=tcp://10.43.0.1:443 KUBERNETES_POR
T_443_TCP_PROTO=tcp KUBERNETES_PORT_443_TCP_PORT=443 KUBERNETES_PORT_443_TCP_ADDR=10.43.0.1 KUBERNETES_SERVICE_PORT_HTTPS=443 KUBERNE
TES_PORT=tcp://10.43.0.1:443 KUBERNETES_SERVICE_PORT=443 'JAVA_OPTS=-Dfile.encoding=UTF-8 -Dsun.jnu.encoding=UTF-8'  'ENV1={' ENV2=value1

As you can see, ENV1 seems to have skipped its value and considering next environment variables as its value.

Please let me know if you need anymore information. Thanks.

daniv-msft commented 3 years ago

@Omkar-Shetkar Thanks, that confirms the assumption we had about this issue being related to big env variables values. We'll push a fix for that, but it will likely only come with our next release in a couple of weeks. Is this a blocker for you?

Omkar-Shetkar commented 3 years ago

Thanks, @daniv-msft . I wanted to introduce this tool to the team so that we can optimize development time. Unfortunately, most of the services use big JSON like above. Hence, may not be able to use this tool until it's fixed. Thanks.

daniv-msft commented 3 years ago

Thanks @Omkar-Shetkar for clarifying. We're working on a fix, and hope to be able to unblock you soon!

Omkar-Shetkar commented 3 years ago

Any updates on the fix, please? Thanks.

daniv-msft commented 3 years ago

Thanks @Omkar-Shetkar for following up. We didn't make as much progress as we wanted on this because we're focusing on our current release. Once this release is out, we should be able to provide beta bits to you with a potential fix.

In the meantime, would declaring the environment variables you want to use in our KubernetesLocalProcessConfig.yaml file be an option? https://docs.microsoft.com/visualstudio/containers/configure-bridge-to-kubernetes#create-an-environment-variable-with-a-constant-value

By setting the environment variable here, we will get it when starting the connection and inject it in the debugged process. Please note however that you might need to adapt slightly your code for this option, as if I remember correctly the env variables coming from the cluster take precedence over the ones declared this way.

Omkar-Shetkar commented 3 years ago

Thanks, @daniv-msft. In my case, large environment variable values are a bit involved with dependency over other components. I think will wait for a beta fix for this issue. Thanks.