hmrc / vat-api

Apache License 2.0

500 response on submission API #887

Open charles-f opened 2 years ago

charles-f commented 2 years ago

API

Describe the bug: Hi, we're seeing a few 500 responses on the submission API, but the submission is actually going through: the obligation itself becomes fulfilled with a received date that matches the submission date. However, since the submission call returned 500, we don't have the receipt and correlation ID for that submission.

To Reproduce: This is intermittent, but it happens simply by submitting through the API.

Expected behavior: A 201 response along with the response headers and body.

Additional context: The response body is an HTML error page that includes an ID. Maybe this error ID means something to you?

I'm trying to find a pattern but it just seems random.

Live environment
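For anyone else hitting this, the check described above (a 500 on submission, followed by the obligation showing as fulfilled with a received date equal to the submission date) could be automated roughly as below. This is a minimal sketch, not anything from vat-api itself: the field names `periodKey`, `status`, `received` and the status value `"F"` are assumptions about the shape of the obligations response, and `find_fulfilled_obligation` is a hypothetical helper.

```python
from datetime import date


def find_fulfilled_obligation(obligations, period_key, submitted_on):
    """Return the obligation showing the return was received, or None.

    Assumed shape (not confirmed in this thread): each obligation is a dict
    with "periodKey", "status" ("F" for fulfilled) and "received" (ISO date).
    """
    for obligation in obligations:
        if (
            obligation.get("periodKey") == period_key
            and obligation.get("status") == "F"
            and obligation.get("received") == submitted_on.isoformat()
        ):
            return obligation
    return None


# Illustrative data only; the period keys are made up.
sample_obligations = [
    {"periodKey": "18A1", "status": "F", "received": "2022-01-31"},
    {"periodKey": "18A2", "status": "O"},
]

match = find_fulfilled_obligation(sample_obligations, "18A1", date(2022, 1, 31))
print("submission appears to have gone through:", match is not None)
```

A match only tells you the return was received. It does not recover the receipt or correlation ID, which is the gap this issue is about.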

DDCTLS-DEV-TEAM commented 2 years ago

Good Morning @charles-f Thank you for raising this one. Do you have the error ID? If so, we can take a look and see if we can debug this one for you. The intermittent part is the difficult part, but we will try to find a solution. Kind Regards

charles-f commented 2 years ago

Good morning @DDCTLS-DEV-TEAM

The response we received from the API was this, which includes an error ID: This exception has been logged with id 7mgib5j5i


<!DOCTYPE html> Error

Oops, an error occurred

This exception has been logged with id 7mgib5j5i.


There are more, but this is one of them. Can you please let me know how we can get the submission receipt and correlation IDs for that submission?

Can you expand the VAT returns API endpoint to include this information, then at least we have a way to retrieve that even if there is a failure in that original submission?

DDCTLS-DEV-TEAM commented 2 years ago

We have done some digging into the code as well as Kibana Logs. There are three stages that we have observed:

Stage 1: The actual submission at “January 31st 2022, 17:42:00.695” resulted in a timeout exception from the downstream service (DES). Unfortunately, we could not log any meaningful information at this stage because it was a timeout exception.

Stage 2: The subsequent tries made in the next couple of minutes resulted in the error: “The VAT return was already submitted for the given period”. This positively indicates that the submission made in Stage 1 was indeed successful.

Stage 3: The subsequent attempts to retrieveObligations are all returning the error: “The remote endpoint has indicated that no data can be found”. (This might also mean a positive response, as the data that would be returned is no longer present.)

We can return or log the ‘submission receipt’ and ‘correlation ID’ only when the downstream returns a success response. It's not possible to return them from VAT-API to the caller unless the downstream gives them back to VAT-API. To understand exactly what’s happening with the VRN in question (we have the VRN) and to get the submission receipt and correlation IDs for that submission, we are going to write to DES Support, but it will take a while to get a response from them. We will update you when we have a response from DES Support.

aeverest commented 2 years ago

Hi, any update please? Thanks.

DDCTLS-DEV-TEAM commented 2 years ago

Hi, we have provided all the information requested by the DES Live Support team. They came back and asked if the VAT Regs we provided are right. We have confirmed they were correct. We are still waiting for the response from them.

aeverest commented 2 years ago

Hi, any update please? Thanks.

DDCTLS-DEV-TEAM commented 2 years ago

Hello there, We haven’t heard anything from DES Support. We will ping them again and let them know we are waiting for their response.

Thanks & Regards Mohan Scala Developer DDCLS, Telford

On 24 Mar 2022, at 15:28, 'Adrian' via BTA Guardians @.***> wrote:

Hi, any update please? Thanks.


charles-f commented 2 years ago

Hi, Any further updates - it's now been a while and we're keen to get some constructive feedback. And we're approaching a new submission window where I think it will happen again.

We've seen this on pretty much every quarterly filing window - 500 http code responses on submission - though I don't know if they are all the same.

In 2022 alone - I've seen 14 submission failures with 500 http code responses. In December 2021 - another 11. List goes on.

Thanks, Charles

charles-f commented 2 years ago


With the new deadline now passed, we did see 12 more occurrences of 500 responses on submissions.

8 on 3rd May, 3 on 4th May and 1 on 6th May: that's 12 attempts across 5 different VRNs. Any update on this situation? It's only happening more and more.

charles-f commented 2 years ago

Hi any update - this still continues to happen.

DDCTLS-DEV-TEAM commented 2 years ago

Hello there, Please can you take some time to address this issue? We have been chased by Support teams to resolve this issue since Feb 2022. We surely need a fix for this.

Kind Regards Mohan Dolla | Scala Developer | DDCT Live Services | HM Revenue & Customs | Telford Plaza | Telford & Wrekin | TF3 4NT

On 9 May 2022, at 08:50, Scott Goodwin @.***> wrote:

Good morning Charles F,

Hope you are well.

We are still waiting for a response from DES support on these errors. Unfortunately, we don't have any visibility of DES or the code involved, so we are reliant on their input.

Here is the second email to DES:

Hello there, We sent you an email some time in the first week of Feb 2022 asking you to investigate this issue and let us know possible ways to prevent it from happening. There are more customers seeing this problem now. Please can you respond as soon as you can? Here it is again:

I am one of the Devs supporting VAT-API service. I am reaching out to see if you can help us with tracking the submissions for VRN: 327450901 on DES. Here I can outline the sequence of events for this VRN:

Stage 1: The actual Returns submission was attempted by VAT-API at “January 31st 2022, 17:42:00.695” and resulted in a timeout exception from the downstream service (DES). Unfortunately, we could not log any meaningful information at this stage because it was a timeout exception.

Stage 2: The subsequent tries made in the next couple of minutes resulted in the error: “The VAT return was already submitted for the given period”. This positively indicates that the submission made in Stage 1 was indeed successful.

Stage 3: The subsequent attempts to retrieveObligations for that VRN are all returning the error: “The remote endpoint has indicated that no data can be found”. (This might also mean a positive response, as the data that would be returned is no longer present.)

Can you please give us “Submission Receipt” and “correlation ID” for the submissions done for that VRN? We need to pass them to the QA and Customer support teams.

Also, we would like to know if it's possible:
Step (a): to have the VAT-API service RETRY the same submission after getting the error response at Stage 1? And if we get the error “The VAT return was already submitted for the given period”, can we assume it has gone through?
Step (b): for the DES service to return the “Submission Receipt” and “correlation ID” as part of the response to the retry at Step (a) above, instead of just returning the message?

If you would like to push this with them, here is their email: @. @.>

Hopefully, you can get us an answer quicker.

Kind Regards

Scott Goodwin Tech Lead

On Mon, 9 May 2022 at 08:43, 'Charles F (QA)' via BTA Guardians @. @.>> wrote:


With the new deadline now passed, we did see 12 more occurrences of 500 responses on submissions.

8 on 3rd May, 3 on 4th May and 1 on 6th May: that's 12 attempts across 5 different VRNs. Any update on this situation? It's only happening more and more.


-- Kind Regards Scott Goodwin | Technical Lead | DDCT Live Services | HM Revenue & Customs | Telford Plaza | Telford & Wrekin | TF3 4NT
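The retry idea in Step (a) of the email to DES can be illustrated from the caller's side. This is only a sketch under stated assumptions: whether a retry is safe is exactly the question put to DES and is not answered in this thread, the error code `DUPLICATE_SUBMISSION` is an assumed identifier for the “already submitted for the given period” error, and `submit_with_one_retry` and `fake_submit` are hypothetical names.

```python
def submit_with_one_retry(submit):
    """Call submit() and retry once if the first outcome is ambiguous.

    submit is any callable returning (http_status, body_dict).
    Returns (outcome_label, body) where outcome_label is one of:
    "accepted", "rejected", "accepted_without_receipt", "unknown".
    """
    status, body = submit()
    if status == 201:
        return "accepted", body
    if status < 500:
        # A 4xx is a definite answer from the API; do not retry.
        return "rejected", body

    # 5xx: the submission may or may not have gone through. Retry once.
    status, body = submit()
    if status == 201:
        return "accepted", body
    if (body or {}).get("code") == "DUPLICATE_SUBMISSION":  # assumed error code
        # The first attempt evidently succeeded, but its receipt is lost.
        return "accepted_without_receipt", body
    return "unknown", body


def fake_submit(responses):
    """Build a stand-in for the real HTTP call that replays canned responses."""
    remaining = iter(responses)
    return lambda: next(remaining)


outcome, _ = submit_with_one_retry(
    fake_submit([(500, {}), (403, {"code": "DUPLICATE_SUBMISSION"})])
)
print(outcome)
```

The `accepted_without_receipt` label is the case Step (b) is trying to eliminate: the retry proves the return went through, but nothing in the duplicate response carries the receipt or correlation ID.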

karolskrobot commented 2 years ago

Can we have an update please?

We have the same issue occurring for a few hundred submissions each month - 500 responses followed by inconsistent replies from vat return and obligations endpoints. We reported this via the SDS team with detailed data as requested, and we've been waiting for any response since 10 June.

As an additional note, returning the HTML response is a bug in itself. No content-type header is set, so it defaults to application/json. We would expect the body to be JSON like any other error response.
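Until the body is reliably JSON, a client can parse it defensively. This is a minimal sketch, assuming the HTML error page always contains the phrase "logged with id" followed by the ID, as in the examples posted in this issue. The `code` field on JSON errors is also an assumption, and `parse_error_body` is a hypothetical helper.

```python
import json
import re

# Matches the ID in "This exception has been logged with id 7mgib5j5i."
ERROR_ID_PATTERN = re.compile(r"logged with id (\w+)")


def parse_error_body(raw):
    """Classify an error body as JSON or non-JSON and pull out what we can."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        parsed = None

    if isinstance(parsed, dict):
        return {"format": "json", "code": parsed.get("code"), "error_id": None}

    # Not a JSON object: fall back to scanning the text for the logged error ID.
    found = ERROR_ID_PATTERN.search(raw)
    return {
        "format": "non_json",
        "code": None,
        "error_id": found.group(1) if found else None,
    }


html_body = (
    "<!DOCTYPE html> Error Oops, an error occurred "
    "This exception has been logged with id 7mgib5j5i."
)
print(parse_error_body(html_body))
```

Keeping the extracted ID lets you quote it when reporting the failure, instead of your parser simply failing on the unexpected HTML.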

charles-f commented 2 years ago

Hi this is still happening a lot.

In the July August filing period - we've had 103 of these 500 responses on the submissions.

Surely this is a serious enough bug for you to prioritise?

DDCTLS-DEV-TEAM commented 2 years ago

Hello @charles-f

For us as a team it is our priority and to be honest, we are feeling your frustration too. We have had to escalate this and we now have a meeting booked for tomorrow with the team that owns the data (Unfortunately that's not us).

We are hoping we will have something tangible tomorrow for you or at least an update on what the plan is. We apologize for the inconvenience this is causing. It's just a bit out of our hands until we finally got the meeting booked.

Thank you again for your patience and we'll update you tomorrow.

charles-f commented 2 years ago

thanks @DDCTLS-DEV-TEAM appreciate the update.

charles-f commented 2 years ago

hi @DDCTLS-DEV-TEAM can you please let us know the update from today's meeting please

thanks

DDCTLS-DEV-TEAM commented 2 years ago

@charles-f Good morning,

The outcome of the meeting was that the backend systems have acknowledged there is an issue. They think that the backend system that they hit, doesn't update in time. They pull the period key from the backend which doesn't give a response in time, and they send us the error. However, the backend then sends them the correct response it just doesn't come in time.

So if you have a recent example over the last 30 days and if you can send the data to the SDS team we can send this on. Sorry, they need a recent error.

Thank you for your help on this one; hopefully, we are getting somewhere with it. Unfortunately, it's something we can't sort ourselves on the vat-api team.

aeverest commented 2 years ago

Great to hear that there is recognition of a source of this, looking forward to a solution :-)

LH200 commented 2 years ago

Hello,

Is there an ETA on when a fix will be applied for this?

We have had reports of this issue on the latest batch of returns.

A quick fix to ensure that the submission is not posted to HMRC would be preferable in the short term.

At the moment our system will state the submission hasn't worked because we expect a JSON response and not html:


<!DOCTYPE html> Error

Oops, an error occurred

This exception has been logged with id EXAMPLE.

charles-f commented 2 years ago

@LH200 thanks we have the exact same issue, and if you even attempt to resubmit then you get the rejection for duplicate submission also. Agreed I would prefer if the submission in fact doesn't happen as a result of this 500. And therefore a resubmission would be allowed.

This has now affected 121 submissions in my product since June.

We're having to explain to users that it's an issue with HMRC's API and not our product.

Here is my latest example @DDCTLS-DEV-TEAM

<!DOCTYPE html> Error

Oops, an error occurred

This exception has been logged with id 7oi9c2i38.

DDCTLS-DEV-TEAM commented 2 years ago

@charles-f thank you for the update. Can you please send the VRN you used for that submission to the SDS team and link this GitHub issue? They will then forward it to us. Hopefully the other teams will then have something to follow up on.

@LH200 unfortunately there isn't an ETA on this yet, as the downstream services are still discussing the issue. The vat-api service and this team are basically a proxy to the backend services (different teams). The current theory from the investigations so far is that we (vat-api) are timing out waiting on a response from the backend we hit (one team), which in turn calls another backend (another team) that doesn't respond in time. We need to confirm the full issue before a fix can go in place.

We also have an issue where the backend service isn't available at all. We have a fix going through the pipeline to stop the HTML being sent, so that the error is sent in JSON instead. We are hoping that once that goes in you won't see any HTML anymore, and the JSON might give you more to go on in terms of letting your users know.

Thank you all for your understanding on this one, we are actively investigating this and trying to get the right people in the room.

charles-f commented 1 year ago

We're still getting 500 errors on the response API, I've provided my team member who's in contact with Alex Anikeev about this issue - with the VRN for the follow up.

DDCTLS-DEV-TEAM commented 1 year ago

Good Morning @charles-f Thank you for the update and thank you for pushing things on your side. We are doing the same. We have had an increase in this issue and we are sending example VRNs every week to try and push this through. As you can imagine the downstream teams are very busy but I'm hoping this can be solved sooner rather than later. We will keep you updated as soon as we get anything.

In the meantime, if you can keep emailing in your VRNs when you're getting the issues that will also help and we apologize for the inconvenience this is causing.

Kind Regards DDCT Team

DDCTLS-DEV-TEAM commented 1 year ago

Good afternoon @charles-f There's some confusion around the VRNs provided by your teammate. Could you provide the example VRNs to the SDS team please so we can proceed with the investigation with certainty? Submission timestamps would also be helpful if possible.

Kind regards DDCT Team

charles-f commented 1 year ago

@DDCTLS-DEV-TEAM what is your process for those affected by this who do not have a receipt ID as a result?

Where the obligation has been fulfilled and users want a receipt acknowledging submission has been made.

We have customers asking about this bug, and though it may be partially resolved in that you no longer return the HTML error (from what I see in my company's product), there are still some 500 responses on the submission (with a JSON response), and many more 503s and some 504s than before.

My assessment of this would be that the 'fix' has only moved the issue to another response code (503s) and where 500s still happen (they now return json).

Can I suggest that if you do not get a response in time, then you reject the submission - this would allow the user to resubmit at least and not keep causing these issues.

I look forward to hearing your answer about what is the process for what to do about no receipt id.

DDCTLS-DEV-TEAM commented 1 year ago

Hello @charles-f

Thank you for getting in touch. We have been investigating this since we last spoke and we now have some clarity on what is going on downstream.

So we (VAT-API) are hitting the downstream to submit. This has all gone through ok. Our downstream then go off further to update the user's data in the backends.

We (VAT-API) are throwing timeouts because our downstream doesn't respond in time. However, they have pushed that data onto the backends and it is in the process of updating/ submitting when we throw the timeout. Then after a certain amount of time (we are still in talks with the backends to understand this) our downstream then get a positive result but by then obviously, we have timed out.

Now we have upped our timeouts a little to try and reduce this problem (as you can imagine, this isn't a permanent fix and doesn't solve it for everyone), so we are trying to understand why these timeouts are happening and what we can do about them. The VAT-API doesn't create the receipt ID; it is given to us in a successful response. If the VAT-API is getting errors from the downstreams, there is no way for us to get you that receipt ID from inside the VAT-API service (at the moment). Unfortunately, at this time we don't know the ins and outs of how it is made and whether we could retrieve it at a later date, so that, say, if you submitted the same thing again, we could return something more useful, such as "you have done this already, here is the receipt ID". We are trying to get to the bottom of this because that would help out a lot of people like yourselves.

We will keep you updated as we go.

Kind Regards
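The failure mode described above (the proxy gives up waiting while the downstream carries on and eventually succeeds) can be reproduced in miniature. This is a toy simulation, not vat-api code: `simulate`, `backend_submit`, the in-memory `ledger` and the timings are all made up for illustration.

```python
import concurrent.futures
import time


def simulate(proxy_timeout_s, backend_delay_s):
    """Return (what_the_caller_sees, backend_ledger) for one submission."""
    ledger = {}  # stands in for the backend's record of accepted returns

    def backend_submit(period_key):
        time.sleep(backend_delay_s)  # slow downstream processing
        ledger[period_key] = "receipt-for-" + period_key
        return ledger[period_key]

    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(backend_submit, "18A1")
        try:
            caller_sees = future.result(timeout=proxy_timeout_s)
        except concurrent.futures.TimeoutError:
            # The proxy stops waiting, but the backend work is not cancelled.
            caller_sees = "500 (timed out)"
    # Leaving the with-block waits for the backend thread to finish.
    return caller_sees, ledger


seen, ledger = simulate(proxy_timeout_s=0.05, backend_delay_s=0.3)
print(seen, ledger)
```

In the slow case the caller gets an error while the ledger still ends up holding a receipt, which matches what is being reported here: a 500 on submission, a fulfilled obligation afterwards, and no receipt ID on the caller's side. Raising `proxy_timeout_s` narrows the window but does not close it.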