MicrosoftPremier / VstsExtensions

Documentation and issue tracking for Microsoft Premier Services Visual Studio Team Services Extensions
MIT License

Recent builds using BQC task in Azure pipelines failing with [ERROR] read ECONNRESET #131

Closed AnushaCha closed 3 years ago

AnushaCha commented 3 years ago

Hi Team,

We have been using version 7.6.2 of the Build Quality Checks task in our Azure pipelines. For the past 15 days, builds have been failing in the BQC task with the error "read ECONNRESET". The behaviour is inconsistent: sometimes the build passes with no changes. Kindly help, and let me know if you need any further information to debug the issue.

Thanks, Anusha Challagali

ReneSchumacher commented 3 years ago

Hi Anusha,

sorry for the late response. Can you tell me if this is on Azure DevOps Services (cloud) or Server (on-premises)? And if it's cloud, are you using a private agent or one of our cloud-hosted agents?

AnushaCha commented 3 years ago

Hi Rene,

We are on Azure DevOps Server and are using a self-hosted agent. I have set the proxy environment variables. The build failure is inconsistent.

ReneSchumacher commented 3 years ago

Hi again,

this might be a bit harder to diagnose, as the issue does not come from our task but from the underlying azure-devops-node-api library, and most probably from the typed-rest-client library that azure-devops-node-api uses.

Could you please add the variable NODE_DEBUG to your pipeline and set its value to http? This enables lower-level logging in the typed-rest-client lib and should give us an idea about what is happening during the communication between agent and server. If you hit the issue again, please send me the log file from the BQC task to PSGerExtSupport@microsoft.com so I can take a look.

AnushaCha commented 3 years ago

Hi Rene,

I have added the variable as you suggested. Below is a screenshot from the log of a failed pipeline run. [image]

Thanks, Anusha Challagali

ReneSchumacher commented 3 years ago

Hi Anusha,

the screenshot does not provide enough context for the issue. Could you send the full log file to PSGerExtSupport@microsoft.com? If you don't want to share the host information, just replace it with some masking string (e.g., XXX).
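A minimal sketch of the masking step, assuming the log has been saved as plain text and you know which host names you want to hide. The function name `mask_hosts` and the example host are hypothetical and not part of the task or any library:

```python
import re
from typing import Sequence


def mask_hosts(log_text: str, hosts: Sequence[str], mask: str = "XXX") -> str:
    """Replace every occurrence of the given host names with a mask string.

    Matching is case-insensitive because host names may appear in mixed
    case in URLs and in Host headers.
    """
    if not hosts:
        return log_text
    # Try longer names first so a full name is masked before a shorter
    # name that happens to be a prefix of it.
    ordered = sorted(hosts, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(h) for h in ordered), re.IGNORECASE)
    return pattern.sub(mask, log_text)


if __name__ == "__main__":
    # Hypothetical log excerpt; the host name is made up for illustration.
    excerpt = "GET https://tfs.corp.example:8080/tfs/_apis/build HTTP/1.1"
    print(mask_hosts(excerpt, ["tfs.corp.example"]))
```

Only the listed names are replaced, so the rest of the request and response details stay available for diagnosis.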

ksstott commented 3 years ago

I've started seeing this on some of our builds. Attached below is the full log with debug messages: 23.txt. We're using Microsoft-hosted agents rather than a self-hosted agent.

AnushaCha commented 3 years ago

Hi Rene,

Please find the attached log for debugging. Kevin's issue looks similar to mine, but we are using a self-hosted agent on our side.

Thanks, Anusha Challagali
Log.docx

AnushaCha commented 3 years ago

Hi Rene,

Any update on this?

ReneSchumacher commented 3 years ago

Hi @AnushaCha,

sorry for the delay, I'm currently pretty busy delivering customer workshops and didn't have enough time to go through all the details. So far, I'm not sure what might be causing the connection issue, esp. since in your case an on-prem server is affected, while @ksstott seems to have the same issue in the cloud. I can only guess that it might have to do with the version of the azure-devops-node-api we're using in the task.

Could you perhaps try updating to the latest version of the task and check if this is also affected? I haven't seen this happening internally, which is strange. I'll have more time on Friday to dig into this deeper and will update this issue accordingly.

AnushaCha commented 3 years ago

Hi Rene,

We have updated our BQC task to the latest version, 8.0.2, but the build still fails with the ECONNRESET error.

Thanks, Anusha Challagali

ReneSchumacher commented 3 years ago

Hi again,

I haven't been able to reproduce the issue so far and, to be honest, don't have a clue what might be causing it. We recently released v8.0.3 of the Build Quality Checks task, which uses an updated version of the azure-devops-node-api that is used to connect back to Azure DevOps. Could you please check if the issue still occurs with this version?

If you are both using proxy servers, would you be able to create one agent that can directly connect to Azure DevOps? I can only think of something related to the proxy that might sporadically block communication.

ksstott commented 3 years ago

I've updated the task to version 8 in our repo and will see if the issue goes away. We haven't seen the issue in quite some time now anyway, so perhaps it was something happening on the DevOps end of things rather than in the code for this task.

We are not using proxy servers as far as I am aware. Thanks for looking into this.

AnushaCha commented 3 years ago

Hi Rene,

I upgraded the task to the latest version when you asked me to, but I still observed build failures after the upgrade. However, I haven't monitored the build since last week. Let me monitor the build for a couple more days and come back to you.

Hi Kelvin,

Have you observed the task failure after upgrading the task to the latest version?

Thanks, Anusha

AnushaCha commented 3 years ago

Hi @ReneSchumacher,

Any update on this?

Thanks, Anusha Challagali

ReneSchumacher commented 3 years ago

Hi @AnushaCha,

I was waiting for your feedback since you mentioned that you didn't monitor your builds for a while. To be honest, I currently don't know where the connection issue might come from, and I haven't heard of similar issues from anyone else, either internally or externally.

I can only suspect that the cause of the error might be in your infrastructure. As you're using Team Foundation Server, I would probably start further analysis by looking at the IIS logs. Perhaps they contain some information about the error. In addition, I would look at the event log and check if TFS logged any issues (Application log) or http.sys logged any issues (System log).
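A first pass over the IIS logs could be sketched as follows. This assumes the server writes the default W3C extended log format, where a "#Fields:" line names the columns, and that the cs-uri-stem, sc-status and sc-win32-status fields are enabled. The function names and the filter (HTTP status 500 or above, or a non-zero Win32 status) are assumptions for illustration, not an established rule for spotting reset connections:

```python
from typing import Dict, Iterable, Iterator, List


def parse_w3c_log(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per request line, keyed by the names in '#Fields:'."""
    fields: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#Fields:"):
            # The directive can reappear mid-file; always use the latest one.
            fields = line[len("#Fields:"):].split()
            continue
        if line.startswith("#") or not fields:
            continue  # other directives, or data before any field list
        values = line.split()
        if len(values) == len(fields):
            yield dict(zip(fields, values))


def suspicious_requests(lines: Iterable[str], uri_prefix: str = "") -> Iterator[Dict[str, str]]:
    """Yield requests under uri_prefix that ended with a server error
    or a non-zero Win32 status (assumed filter, adjust as needed)."""
    for entry in parse_w3c_log(lines):
        if not entry.get("cs-uri-stem", "").startswith(uri_prefix):
            continue
        status = entry.get("sc-status", "")
        win32 = entry.get("sc-win32-status", "0")
        server_error = status.isdigit() and int(status) >= 500
        if server_error or win32 not in ("0", "-"):
            yield entry
```

Feeding it the log file for the day of a failed build, for example `suspicious_requests(open(path, encoding="utf-8"), "/tfs")` with your own path and URL prefix, narrows the log down to entries worth comparing against the time of the ECONNRESET error.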

The final option for diagnosing the issue would be to collect a network trace with some tool like Wireshark and analyze the communication between agent and server.

I really don't think the issue is directly caused by our task, because the connection reset is initiated by the server.

ReneSchumacher commented 3 years ago

Hi again!

Do you have any additional information/update regarding this issue?

ReneSchumacher commented 3 years ago

Closing due to lack of information; the issue is probably not caused by the task itself. If the issue persists, please add further comments.

AnushaCha commented 3 years ago

Hi Rene,

Hope you are doing well. We are still struggling with the BQC task failure in our ADO pipeline. The BQC task now fails with the error "Cannot read property 'records' of null". Could you please help me out? Let me know if any other information is needed.

Thanks, Anusha Challagali