microsoft / react-native-code-push

React Native module for CodePush
http://appcenter.ms

CodePush.updateCheck returns 503 #1952

Closed (colarlady closed this issue 4 months ago)

colarlady commented 4 years ago

Hi guys. When I call the update check endpoint, it returns 503. Who can help me?

juanmaotegui commented 4 years ago

Same here:

Service Unavailable

HTTP Error 503. The service is unavailable.

OlivierFreyssinet-old commented 4 years ago

Same here. This is what I'm seeing on Sentry:

POST https://codepush.azurewebsites.net/reportStatus/deploy [503]
GET https://codepush.azurewebsites.net/updateCheck [503]

zafiredevelopment commented 4 years ago

Same here: 502 and 503 errors. Can someone look into this? It's affecting production apps.

ruslan-bikkinin commented 4 years ago

Hello everybody, and sorry for the late response.

Recently we had an incident on the Azure side that led to 503 errors when requesting the CodePush server. Are you still experiencing this problem?

decadX commented 4 years ago

@ruslan-bikkinin It's still happening for us today (I have sent the curl call to anvesh kasani from the App Center support team). See the screenshot below: this is the same XHR that the CodePush Cordova plugin sends, and at random it gets back a 503.

image

snowpardx commented 4 years ago

Thanks for your comment. Looking further, it may be related not only to the underlying Azure issue but also to some of the backend services being recycled. Our team is actively looking for the root cause and possible mitigation steps.

andrei-m-code commented 4 years ago

We're getting this issue now across all our apps. It happens way too often. We may have to think about moving away from code push or investigate ways to host our own code-push server. It's very frustrating :(

snowpardx commented 4 years ago

Apologies for the inconvenience this issue is causing. Unfortunately no root cause has been found yet, but we are continuing our investigation.

A couple of hours ago we deployed a change which may mitigate (not fully fix, unfortunately) the impact by reducing how frequently this happens. We need to gather some statistics to tell whether it does.

For now I recommend that you stay tuned to this issue. We will definitely post an update here when we expect full remediation, and we will ask you to help us confirm it.

andrei-m-code commented 4 years ago

@snowpardx absolutely. I'll let you know once it's resolved on my end. Are you guys seeing the issue on your end? And is it under investigation?

snowpardx commented 4 years ago

Yes, we are still investigating this. One change which may mitigate it was deployed recently, and in addition we are investigating the possible root cause. Additional telemetry changes that could help us confirm one of the suspected causes should be ready for deployment, probably tomorrow (CET).

Unfortunately, right now we don't understand why this is happening, so we have no ETA. Once we learn more, we will definitely come back to this issue.

andrei-m-code commented 4 years ago

@snowpardx we've been having issues with code-push throughout the day. We've checked recently and seems like all the issues are gone now. The code-push is now working as expected for us. We're happy that everything is resolved, would be good to know the root cause to avoid it in the future. Thanks!

VrajSolanki commented 4 years ago

@snowpardx and Andrei, we are still badly affected by this code push issue and can't send builds to our production application. Is there any workaround so that the builds can be sent? Please advise.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">

Service Unavailable

HTTP Error 503. The service is unavailable.

snowpardx commented 4 years ago

Unfortunately we don't see any indication that the deployed change helped much. I can confirm that the error rate is about the same as before, so the issue persists for some portion of requests.

image

But it definitely appears for only a small portion of requests, so subsequent retries would probably succeed.

Right now we don't understand the root cause of the issue (it just started happening at some point, without any directly related code changes). So there is no workaround available right now (except retrying after some time).

Thanks for bearing with us while we investigate what's going on.
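
The retry workaround mentioned above could look roughly like the following TypeScript sketch. It assumes the update check is wrapped in a function that returns a promise and rejects on a 503. `backoffDelayMs` and `withRetry` are hypothetical helpers, not part of the CodePush SDK, and the delay values are arbitrary.

```typescript
// Hypothetical helpers for illustration; none of these names come from the CodePush SDK.

/** Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs. */
function backoffDelayMs(attempt: number, baseMs: number = 1000, maxMs: number = 30000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

/** Runs `operation` up to `maxAttempts` times, sleeping with exponential backoff between failures. */
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts: number = 5,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(() => resolve(), ms)),
): Promise<T> {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // Only wait if another attempt is still to come.
      if (attempt < maxAttempts - 1) {
        await sleep(backoffDelayMs(attempt));
      }
    }
  }
  // Every attempt failed: surface the most recent error to the caller.
  throw lastError;
}
```

In a React Native app, `operation` could be something like `() => codePush.checkForUpdate()`, assuming that call rejects when the server answers with a 503. Backing off between attempts avoids adding load to a service that is already struggling.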

VrajSolanki commented 4 years ago

Thank you @snowpardx , I understand, just hope the team is actively working on this one. Facing too much heat from customers about the failing code updates. Will wait for an update from you. 👍

andrei-m-code commented 4 years ago

@snowpardx @VrajSolanki I can confirm that the issue is back this morning. It was working well last night, which makes me wonder whether it's simply a high-load issue.

Are you guys able to scale out code-push services?

decadX commented 4 years ago

At the moment it's an 80-90% fail rate. Maybe try restarting the blades on Azure? If that fixes it, maybe set up Azure Autoheal to detect 503s and restart the instance?

tedkimzikto commented 4 years ago

We are running two apps in production. However, we see the issue in only one of them; the other works just fine. The affected app depends heavily on the update check code, which makes the whole app unusable. We just deployed a new version of the application that can run despite the 503 error. Hope the issue gets fixed soon. Let me know if you need more details about packages or settings.
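
One way an app can keep running when the update check fails, as described above, is to treat any error or slow response as "no update available" and carry on with the bundled version. The sketch below is only an illustration of that idea, not the code this team shipped. `failOpen` is a hypothetical helper and the timeout value is up to the caller.

```typescript
// Hypothetical helper for illustration; not part of react-native-code-push.

/**
 * Resolves with the result of `work`, or with `fallback` if `work` rejects
 * (for example on an HTTP 503) or does not settle within `timeoutMs`.
 */
function failOpen<T>(work: Promise<T>, fallback: T, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve) => {
    // If the timer fires first, the app continues with the fallback value.
    const timer = setTimeout(() => resolve(fallback), timeoutMs);
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      () => {
        // Swallow the error on purpose: a failed update check must not block startup.
        clearTimeout(timer);
        resolve(fallback);
      },
    );
  });
}
```

A possible use, assuming `codePush.checkForUpdate()` resolves to `null` when there is no update, would be `await failOpen(codePush.checkForUpdate(), null, 5000)` during startup, continuing with the installed bundle whenever the result is `null`.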

VrajSolanki commented 4 years ago

@tedkimzikto I am currently sending a code push release to check. The same issues persist:

  1. the 503 error
  2. the code push update check completes too late for the app, because the API is not responding. I would say it has the same fail rate even now.

andrei-m-code commented 4 years ago

@VrajSolanki yes, it's not working at all. It's hurting our business badly, especially considering it's been down for a couple of days now. Customers are seeing an older app version because updates are not coming in...

tedkimzikto commented 4 years ago

Deploy a new version to the Play Store and App Store for now, and hope for the best :0

andrei-m-code commented 4 years ago

@tedkimzikto :) easy to say, we have about 100 apps that we'd need to deploy.

tedkimzikto commented 4 years ago

https://status.appcenter.ms went back to normal. Is that related to the issue?

andrei-m-code commented 4 years ago

@tedkimzikto it was green all day yesterday. It doesn't reflect the real status of the service as far as I can see

decadX commented 4 years ago

On whether https://status.appcenter.ms going back to normal is related to the issue: that's easy to check by requesting the endpoint directly, for example:

https://codepush.azurewebsites.net/updateCheck?deploymentKey=&appVersion=9.2.10&packageHash=&isCompanion=false&label=v145&clientUniqueId=1123412341234
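
A check like this can be scripted to estimate a failure rate instead of eyeballing single requests. The sketch below only builds the URL and summarises the status codes that were collected; how the requests are sent is left out. The query parameter names are taken from the URL in this comment, while `buildUpdateCheckUrl` and `serverErrorRate` are hypothetical names.

```typescript
// Parameter names mirror the URL quoted in this comment; the helper names are hypothetical.
interface UpdateCheckParams {
  deploymentKey: string;
  appVersion: string;
  packageHash: string;
  isCompanion: boolean;
  label: string;
  clientUniqueId: string;
}

/** Builds an updateCheck URL with the parameters in the same order as the example. */
function buildUpdateCheckUrl(baseUrl: string, params: UpdateCheckParams): string {
  const pairs: Array<[string, string]> = [
    ["deploymentKey", params.deploymentKey],
    ["appVersion", params.appVersion],
    ["packageHash", params.packageHash],
    ["isCompanion", String(params.isCompanion)],
    ["label", params.label],
    ["clientUniqueId", params.clientUniqueId],
  ];
  const query = pairs
    .map(([key, value]) => key + "=" + encodeURIComponent(value))
    .join("&");
  // Strip trailing slashes so the path is not doubled up.
  return baseUrl.replace(/\/+$/, "") + "/updateCheck?" + query;
}

/** Fraction of responses with a 5xx status, such as the 502 and 503 errors in this thread. */
function serverErrorRate(statusCodes: number[]): number {
  if (statusCodes.length === 0) {
    return 0;
  }
  const failures = statusCodes.filter((code) => code >= 500 && code <= 599).length;
  return failures / statusCodes.length;
}
```

Collecting the status codes of, say, a few dozen requests to the built URL and passing them to `serverErrorRate` gives a number that can be compared with the rates reported here.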

VrajSolanki commented 4 years ago

@tedkimzikto @andrei-m-code even if we push the app to the App Store, it is going to take a couple of days for the apps to go live, and until then all the applications would be broken. Can we have a workaround so that we can ship a hotfix and then freeze our use of code push until it is fixed? @tedkimzikto you mentioned that one of your apps is working; wouldn't the same setup work for the other one too? @snowpardx I'd like to know your thoughts here.

snowpardx commented 4 years ago

@VrajSolanki to be honest, I don't think it is going to be fixed in a short time frame like a couple of days, given that we don't understand the underlying issue. So there is no ETA so far.

The interesting thing here is that with the URL you provided I'm able to reproduce the issue far more frequently than the "around 5% of total requests" we are seeing in the telemetry, while it doesn't reproduce at all for an app I created on my own.

VrajSolanki commented 4 years ago

@snowpardx can we get the details of your test app? Maybe we missed something in our setup? That could be a good starting point for us to try and fix it.

filipemarruda commented 4 years ago

Same issue here, tested using @decadX's suggestion. Any progress from the App Center team?

andrei-m-code commented 4 years ago

@snowpardx The issue is clearly periodic. Right now code-push works very well for me. If the issues start to happen again tomorrow morning, it would support this theory. Therefore I believe the periodicity of the issue happening could only depend on the load, unless you guys do something funky with the services every day during business hours :) ?

Possible solution: scale the services out and see if it fixes the issue. As you're hosted on Azure, you must have AppInsights connected to this app service? What is memory and CPU use? Is the service running hot between 8am-6pm EST? I'm sure you guys have already checked it. Regardless I'd still try to scale the services and see if it fixes the issue.

We heavily rely on code-push and really need it to be fixed. I'm happy to jump on a call and show you what we experience or run some tests from my app as needed. Please feel free to reach out.

manjubasha commented 4 years ago

Hi, Same issue here. All our production updatechecks are failing and no one is able to install the build. Does anyone have any update on when the issue will be resolved?

ajaykpaul commented 4 years ago

@snowpardx We rely heavily on code push and need it to be fixed. How can we help here?

parintorn1902 commented 4 years ago

Same issue here, but not for all of our apps; some apps are still working normally.

snowpardx commented 4 years ago

We found that the issue happens only for the old SDK, which uses the https://codepush.azurewebsites.net/ service. The new SDK (which uses https://codepush.appcenter.ms) has no issues, at least at the moment. The backend services themselves are healthy and able to process the traffic.

Right now we are looking into why https://codepush.azurewebsites.net/ is unable to work with the backend services, which effectively means that we unfortunately still have no ETA for a fix.

So for now I advise those who are experiencing the issue and can deploy a new version to the store to switch to the most recent SDK, which works with the new gateway and API that are not affected.

Having written that, I would also like to clarify that this doesn't mean we are going to stop supporting previous SDKs right now (they will be retired at some point, as normally happens while software evolves). Our plan now is to get https://codepush.azurewebsites.net/ back.

sanjaypojo commented 4 years ago

@snowpardx if we're using react-native-code-push@6.3.0, does this mean we're on the latest SDK? If we are, we are continuing to experience issues. Specifically, we had about 21k+ errors yesterday and 39k+ errors today. Please let us know if there's anything we can do about this; we're losing a lot of users because of it.

snowpardx commented 4 years ago

@sanjaypojo thanks for the additional info. Starting with react-native-code-push v5.7.0, requests should go to the other URL, which means both frontends are now affected 😞
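
To work out which frontend a given app talks to, the version cutoff mentioned here can be turned into a small check. This is a sketch based only on the statements in this thread (v5.7.0 as the cutoff, and the two host names); it has not been verified against the SDK source, and the function names are hypothetical.

```typescript
// Assumption: the 5.7.0 cutoff and both hosts come only from the maintainers'
// comments in this thread; they have not been checked against the SDK source.
const LEGACY_HOST = "https://codepush.azurewebsites.net/";
const CURRENT_HOST = "https://codepush.appcenter.ms";

/** Splits "v6.3.0-beta" into [6, 3, 0]; non-numeric parts count as 0. */
function parseVersion(version: string): number[] {
  const [core = ""] = version.replace(/^v/, "").split("-"); // drop "v" prefix and pre-release suffix
  return core.split(".").map((part) => parseInt(part, 10) || 0);
}

/** Negative if a < b, positive if a > b, zero if equal (major.minor.patch only). */
function compareVersions(a: string, b: string): number {
  const left = parseVersion(a);
  const right = parseVersion(b);
  for (let i = 0; i < 3; i++) {
    const diff = (left[i] ?? 0) - (right[i] ?? 0);
    if (diff !== 0) {
      return diff;
    }
  }
  return 0;
}

/** Host that a react-native-code-push version is expected to use by default. */
function defaultServerFor(sdkVersion: string): string {
  return compareVersions(sdkVersion, "5.7.0") >= 0 ? CURRENT_HOST : LEGACY_HOST;
}
```

Comparing the parts numerically matters here: a plain string comparison would rank "5.10.0" below "5.7.0".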

danut-t commented 4 years ago

hello,

I'm seeing this for the new URL as well, as you can see in the screenshot below: image

libraries used:

"appcenter": "3.1.1",
"appcenter-analytics": "3.1.1",
"appcenter-crashes": "3.1.1",
"react-native-code-push": "6.3.0"

Note that this does not happen as often as the failures mentioned above, which target the older URL, but it still occurs about once in every 4-5 requests. Unfortunately that severely impacts our users as well. I also think that there are certain periods of time when it occurs more often, probably when there is more intense traffic on the servers.

Also getting long running requests that take either more than a minute or end up being timed out: image

sanjaypojo commented 4 years ago

@snowpardx we've implemented retries and we're getting users retrying the API up to 70 times and receiving 503 errors every time.

This is independent of users who time out (we have a 30-second timeout on the request). There has to be something you can do about this outage? And since we're getting responses rapidly from MS, I'm assuming that this has to be logged somewhere server side? We're really struggling here and don't have a workaround right now.
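
When retries keep failing like this, one client-side option is to stop hammering the endpoint after a few consecutive failures and try again only after a cool-down, often called a circuit breaker. The sketch below is a generic illustration of that pattern, not something the SDK provides. The class name, threshold, and cool-down are hypothetical, and the caller passes in the current time.

```typescript
// Generic circuit-breaker sketch; names and numbers are illustrative only.
class CircuitBreaker {
  private readonly failureThreshold: number;
  private readonly coolDownMs: number;
  private consecutiveFailures = 0;
  private openedAtMs: number | null = null;

  constructor(failureThreshold: number, coolDownMs: number) {
    this.failureThreshold = failureThreshold;
    this.coolDownMs = coolDownMs;
  }

  /** True if a request may be sent at time `nowMs` (breaker closed or cool-down elapsed). */
  canRequest(nowMs: number): boolean {
    if (this.openedAtMs === null) {
      return true;
    }
    return nowMs - this.openedAtMs >= this.coolDownMs;
  }

  /** A successful response closes the breaker and resets the failure count. */
  recordSuccess(): void {
    this.consecutiveFailures = 0;
    this.openedAtMs = null;
  }

  /** A failure (for example a 503) opens the breaker once the threshold is reached. */
  recordFailure(nowMs: number): void {
    this.consecutiveFailures += 1;
    if (this.consecutiveFailures >= this.failureThreshold) {
      this.openedAtMs = nowMs;
    }
  }
}
```

With a threshold of 3 and a cool-down of a minute, a client would send at most three failing requests in a row and then one probe per minute, instead of dozens of immediate retries.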

sanjaypojo commented 4 years ago

The status page says that the issue is only with the legacy API. Could this be updated so that it's confirmed that the outage is on the latest API too? And please let me know if you want any specific details, we're happy to share and help you solve this problem ASAP.

https://status.appcenter.ms

psaikali commented 4 years ago

Hey folks, any ETA on this one? Or is there anything we can do on our side to help fix this?

jphenow commented 4 years ago

Hi folks. Sincerest apologies for the delay here. After a lot of investigation, I think we've narrowed down our issue to a recent infrastructure change. I believe this has been resolved by a revert of that change in the last hour or so. Please verify this is the case for you, and let us know if the problem persists.

andrei-m-code commented 4 years ago

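Hallelujah! @jphenow @snowpardx thank you guys! It seems to be working well for us now. A couple questions:

  • Would this "legacy" endpoint be available? No plans to shut it down just yet?
  • Is there any SLA associated with code-push endpoints? Any service level difference between free and paid accounts?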

Again, thanks for keeping us posted and fingers crossed it will all work well from now on :).

sanjaypojo commented 4 years ago

Seems to be working for us as well 😄

And great point @andrei-m-code. I'm also interested in hearing about this! It would give us a lot of confidence in using AppCenter as stability is critical for us:

  • Is there any SLA associated with code-push endpoints? Any service level difference between free and paid accounts?

strong0588 commented 4 years ago

Still getting error 503

tackanoway35 commented 4 years ago

Now still getting error 503

andrei-m-code commented 4 years ago

Yep, 503 again. @snowpardx @jphenow

jphenow commented 4 years ago

Hey folks, sorry about this. We're working on getting better alarms on this kind of thing since we've been investigating this. I'm tuning some things that should resolve the worst of this right now, though we're monitoring to be sure of that.

One thing I meant to note earlier is if folks reach out via support ("?" in top right of appcenter.ms, then "contact support") we tend to be more able to triage and find a pattern of issues on our end.

To your questions earlier:

Would this "legacy" endpoint be available? No plans to shut it down just yet?

I'm not aware of any near-term plans to shut this down, no. I'm working to double check that though. If there are plans while this conversation continues, I will correct this.

It being a legacy API does mean we tend to have less attention on it, and sadly tends to take us more time to investigate issues like this. I would really encourage folks to update to newer APIs whenever possible.

Is there any SLA associated with code-push endpoints? Any service level difference between free and paid accounts?

We don't have a paid plan for CodePush. We do have SLOs (Objectives) which is a general aim for 99.9%.
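
For a sense of scale, an availability objective can be converted into a downtime allowance with simple arithmetic. The sketch below assumes the objective is measured over a fixed number of days and that availability is counted per request; neither assumption is confirmed in this thread, and the function names are hypothetical.

```typescript
/** Minutes of unavailability that an availability objective permits over a period. */
function allowedDowntimeMinutes(sloPercent: number, periodDays: number): number {
  const totalMinutes = periodDays * 24 * 60;
  return totalMinutes * (1 - sloPercent / 100);
}

/** Observed availability, in percent, computed from request counts. */
function observedAvailabilityPercent(totalRequests: number, failedRequests: number): number {
  if (totalRequests <= 0) {
    throw new RangeError("totalRequests must be positive");
  }
  return (1 - failedRequests / totalRequests) * 100;
}
```

Under these assumptions, 99.9% over 30 days allows roughly 43 minutes of downtime, while the "around 5% of total requests" error rate mentioned earlier in the thread corresponds to about 95% availability.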

tackanoway35 commented 4 years ago

Which API version can I use to solve this problem? I'm using react-native-code-push version 5.4.0.

snowpardx commented 4 years ago

@tackanoway35 you may try the latest plugin. Starting with v5.7.0, requests go to a different URL.

tackanoway35 commented 4 years ago

Thanks! I will try this

sanjaypojo commented 4 years ago

We're still experiencing about 3k+ errors per day :(