microsoft / service-fabric

Service Fabric is a distributed systems platform for packaging, deploying, and managing stateless and stateful distributed applications and containers at large scale.
https://docs.microsoft.com/en-us/azure/service-fabric/
MIT License
3.03k stars, 401 forks

[BUG] -Microservice fails to upgrade using ServiceFabric SDK 4.2.457.9590 on local5node cluster #1165

Open kernst123 opened 3 years ago

kernst123 commented 3 years ago

Describe the bug: I have an application that consistently fails to upgrade during deployment with the following error when running against a local 5 node cluster:

'System.Hosting' reported Error for property 'Download:1.0:1.1:'. There was an error during download. Failed to download 'Store\Bentley.InteroperabilityType\Bentley.Interoperability.WebPkg.Config.2.0.0.0' directory from ImageStore to 'C:\SfDevCluster\Data\_App\_Node_1\Bentley.InteroperabilityType_App3\Bentley.Interoperability.WebPkg.Config.2.0.0.0'. Error: E_FAIL

This failure does not happen with ServiceFabric SDK 4.1.510.9590. It only started once I upgraded my local environment's SDK to 4.2.457.9590.

I am surprised that the error above references C:\SfDevCluster\Data\_App\_Node_1, since my local cluster is running 5 nodes.

Area/Component:


Expected behavior: The upgrade should be successful.

Observed behavior: The application goes into an error state during the upgrade, and the upgrade never progresses.

Screenshots: (two images, not reproduced)

Service Fabric Runtime Version: ex: 7.1., 7.2.

Environment:

This is a regression: if I uninstall all of Service Fabric, install ServiceFabric SDK 4.1.510.9590, and create a local 5 node cluster with it, everything works fine.



Assignees: /cc @microsoft/service-fabric-triage

kernst123 commented 3 years ago

This also impacts restoring backups. If your cluster creates a backup and you then restore it, the cluster goes into the same error state with the same error.

kernst123 commented 3 years ago

This is what the Service Fabric cluster logs when the upgrade fails. The same error is logged when a backup is restored.

TryUnzipDirectory failed: src=C:\SfDevCluster\Data\_App\_Node_3\Bentley.InteroperabilityType_App3\Bentley.Interoperability.WebPkg.Config.2.0.0.0.zip dest=C:\SfDevCluster\Data\_App\_Node_3\Bentley.InteroperabilityType_App3\Bentley.Interoperability.WebPkg.Config.2.0.0.0
Note that long paths exceeding the Windows MAX_PATH limit are not supported for compressed packages.
System.IO.IOException: The file 'C:\SfDevCluster\Data\_App\_Node_3\Bentley.InteroperabilityType_App3\Bentley.Interoperability.WebPkg.Config.2.0.0.0\eventFlowConfig.json' already exists.
   at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
   at System.IO.FileStream.Init(String path, FileMode mode, FileAccess access, Int32 rights, Boolean useRights, FileShare share, Int32 bufferSize, FileOptions options, SECURITY_ATTRIBUTES secAttrs, String msgPath, Boolean bFromProxy, Boolean useLongPath, Boolean checkHost)
   at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options, String msgPath, Boolean bFromProxy)
   at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, Boolean useAsync)
   at ExtractToFile(ZipArchiveEntry source, String destinationFileName, Boolean overwrite)
   at ExtractRelativeToDirectory(ZipArchiveEntry source, String destinationDirectoryName, Boolean overwrite)
   at ExtractToDirectory(String sourceArchiveFileName, String destinationDirectoryName, Boolean overwriteFiles)
   at TryUnzipDirectoryInternal(Char* srcFileName, Char* destDirectoryPath, Char* errorMessageBuffer, Int32 errorMessageBufferSize, Boolean flushFile)

craftyhouse commented 3 years ago

What happens if you create a fresh 7.2 cluster? Does everything deploy fine?

The explanation below, from another thread, may be the reason. Are you able to update the zip name? “The .NET Framework unzip doesn’t support an incremental zip of the same name. If you added a file into an existing zip archive and the file has the same name (path) as an existing one, the extract hits this error. The first entry extracts to the path and the next entry with the same path can’t be extracted because the destination file already exists. To verify this is the case, I guess PowerShell Expand-Archive could be used on the local machine. I assume it will hit the same error.”
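One way to check the duplicate-entry theory quoted above without extracting anything is to list the archive's entries and look for repeated names. The sketch below uses Python's standard zipfile module rather than the .NET or PowerShell tooling discussed in this thread, so it only illustrates the check and does not reproduce the Service Fabric failure. The helper names and the in-memory demo archive are hypothetical; the file name is borrowed from the error message.

```python
import io
import warnings
import zipfile
from collections import Counter


def duplicate_entries(zip_source):
    """Return the entry names that appear more than once in a zip archive."""
    with zipfile.ZipFile(zip_source) as archive:
        counts = Counter(info.filename for info in archive.infolist())
    return sorted(name for name, count in counts.items() if count > 1)


def build_demo_archive():
    """Build an in-memory archive that contains the same entry name twice.

    Hypothetical stand-in for a package zip; not the real package layout.
    """
    buffer = io.BytesIO()
    with warnings.catch_warnings():
        # zipfile warns about duplicate names on write; silence it for the demo.
        warnings.simplefilter("ignore", UserWarning)
        with zipfile.ZipFile(buffer, "w") as archive:
            archive.writestr("eventFlowConfig.json", "{}")
            archive.writestr("Settings.xml", "<Settings/>")
            archive.writestr("eventFlowConfig.json", '{"v": 2}')
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # An empty list would mean no entry name is repeated in the archive.
    print(duplicate_entries(build_demo_archive()))
```

Pointing duplicate_entries at the .zip named in the failing download (passing its path instead of the demo buffer) would show whether that archive actually carries a repeated entry.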

kernst123 commented 3 years ago

Sorry, but I am confused by several comments in your last post. You asked what happens if I create a fresh cluster. I am starting out with a fresh cluster with nothing on it. I deploy version 1.0.0.0 of the application and it deploys and installs correctly. I then deploy a newer version of the same app, version 2.0.0.0, and it fails with the error above.

You also mentioned adding a file into an existing zip archive. I'm not running any zip commands. I'm simply packaging up the servicefabric application using the sdk powershell module commands. This all worked fine until version 4.2.457.9590 of the sdk came out.

I'm not sure if I'm fully understanding your post.

craftyhouse commented 3 years ago

"This is regression because if I uninstall all of servicefabric and then install ServiceFabric SDK 4.1.510.9590 and create a local 5 node cluster with that everything works fine."

Did you do the same from scratch with the following versions? It is not clear to me. Microsoft Azure Service Fabric SDK version: 4.2.457.9590. Local cluster version: 7.2.457.9590.

Could you also validate against CU7, which was recently released? Version: 7.2 CU7 | 7.2.477.9590

If you still hit this error we'll need an incident created to triage.

kernst123 commented 3 years ago

What does CU7 stand for? Do you have a link? Looking at the installer, the latest version I see is from March 9.

Is that the version you want me to try?


craftyhouse commented 3 years ago

You can grab the latest here https://docs.microsoft.com/en-us/azure/service-fabric/service-fabric-get-started#sdk-installation-only

kernst123 commented 3 years ago

Hi Mike,

ServiceFabric SDK Version 4.2.457.9590 was the version that I reported the problem against. So I know that version won’t work. -- Kevin


craftyhouse commented 3 years ago

"You also mentioned adding a file into an existing zip archive. I'm not running any zip commands. I'm simply packaging up the servicefabric application using the sdk powershell module commands. This all worked fine until version 4.2.457.9590 of the sdk came out.

I'm not sure if I'm fully understanding your post."

This seems to be the main reason: 'C:\SfDevCluster\Data\_App\_Node_3\Bentley.InteroperabilityType_App3\Bentley.Interoperability.WebPkg.Config.2.0.0.0\eventFlowConfig.json' already exists.

Are you able to update the config version in your packaging? e.g. WebPkg.Config.2.0.0.0 to WebPkg.Config.2.0.0.1?

Could you also validate against 7.2 CU7, which was recently released? I suspect this won't change the behavior because of the above, but it is worth verifying if it is not too much hassle. Version: 7.2 CU7 | 7.2.477.9590

I'd highly recommend opening a support case at this point to validate any regression/concerns.
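As a rough illustration of the config version bump suggested above, the sketch below rewrites the Version attribute of a ConfigPackage element in a service manifest. It assumes the standard Service Fabric manifest namespace; the sample manifest, the package name "Config", and the helper name are hypothetical and not taken from the reporter's application. A real upgrade would likely also need the matching service manifest and application manifest versions updated, which is not shown.

```python
import xml.etree.ElementTree as ET

# Namespace used by Service Fabric manifest files.
FABRIC_NS = "http://schemas.microsoft.com/2011/01/fabric"

# Hypothetical, trimmed-down service manifest used only for this sketch.
SAMPLE_MANIFEST = """<ServiceManifest Name="WebPkg" Version="2.0.0.0" xmlns="http://schemas.microsoft.com/2011/01/fabric">
  <ConfigPackage Name="Config" Version="2.0.0.0" />
</ServiceManifest>"""


def bump_config_version(manifest_xml, package_name, new_version):
    """Return the manifest text with the named ConfigPackage set to new_version."""
    # Keep the default namespace unprefixed when serializing.
    ET.register_namespace("", FABRIC_NS)
    root = ET.fromstring(manifest_xml)
    for package in root.findall(f"{{{FABRIC_NS}}}ConfigPackage"):
        if package.get("Name") == package_name:
            package.set("Version", new_version)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(bump_config_version(SAMPLE_MANIFEST, "Config", "2.0.0.1"))
```

With the version changed, the package would be downloaded to a directory ending in Config.2.0.0.1 rather than Config.2.0.0.0, which is the point of the suggestion; whether that avoids the underlying bug is not established in this thread.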

kernst123 commented 3 years ago

Just wanted to post the end result of this problem. I opened a support ticket with Microsoft, and they found the bug and are fixing it. Thank you to everyone for your help and time on this matter.