Open wminos opened 6 years ago
I have already seen issue https://github.com/Azure/service-fabric-issues/issues/199. I am working on Windows 10 and using the default antivirus (Windows Defender). The C:\SfDevCluster folder was excluded from the scan.
Same here. SDK 3.0.472, both VS 2017 and 2015. Only resetting the local cluster helps, but it's annoying and time-consuming.
+1. Does resetting even work for you folks? For me resetting also fails. I am 99% sure this is because fabricDNSService.exe doesn't shut down. I am unable to kill it manually either (access denied - pskill, procexp, nothing works). Only a PC restart solves it. Such a waste of time! MSFT, please fix!
Without knowing why the cluster gets stuck, here is what's going on; it may help you work around the problem until we know more.
Visual Studio (and the Local Cluster Manager) probes the cluster endpoint locally. If the endpoint is not responding, you will see this error. So there's a good chance the local cluster is unresponsive, which is why resetting the cluster helps.
Those of you who run into this issue: I would appreciate it if you could share trace files from the cluster (SfDevCluster\Log\Traces) - thanks.
Emptied C:\SfDevCluster\Log\Traces\, opened VS2015, pressed F5, got the same error "Unable to determine whether the application is installed on the cluster or not", closed VS, stopped the local cluster, zipped all newly created files, got this. Please let me know if these logs help or if I need to gather them again or collect something else. Happy to help!
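For anyone who wants to check the "endpoint not responding" theory independently of Visual Studio, here is a minimal sketch of a TCP probe. It assumes the local cluster's client connection endpoint is localhost:19000 (the usual default for a local dev cluster, but an assumption here, since the thread does not state it). The function name `endpoint_is_reachable` is made up for illustration. A successful TCP connect only shows that something is listening; it does not prove the gateway is healthy.

```python
import socket


def endpoint_is_reachable(host: str = "localhost", port: int = 19000,
                          timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    The default port 19000 is an assumption about the local dev cluster,
    not something taken from this thread.
    """
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Calling `endpoint_is_reachable()` right after the error appears in Visual Studio shows whether anything is listening on the endpoint at that moment.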
Here's another exception that seems related to this issue:
Started executing script 'Publish-NewServiceFabricApplication'.
powershell -NonInteractive -NoProfile -WindowStyle Hidden -ExecutionPolicy Bypass -Command "[void](Connect-ServiceFabricCluster); Import-Module 'C:\Program Files\Microsoft SDKs\Service Fabric\Tools\PSModule\ServiceFabricSDK\ServiceFabricSDK.psm1'; Publish-NewServiceFabricApplication -ApplicationPackagePath '...\PublishProfiles\..\ApplicationParameters\Local.1Node.xml' -ApplicationParameter @{_WFDebugParams_='[{ ... }]'} -Action Create -SkipPackageValidation:$true -ErrorAction Stop"
Creating application...
New-ServiceFabricApplication : Could not ping any of the provided Service Fabric gateway endpoints.
At C:\Program Files\Microsoft SDKs\Service Fabric\Tools\PSModule\ServiceFabricSDK\Publish-NewServiceFabricApplication.ps1:279 char:9
+ New-ServiceFabricApplication -ApplicationName $ApplicationNam ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : InvalidOperation: (Microsoft.Servi...usterConnection:ClusterConnection) [New-ServiceFabricApplication], FabricTransientException
+ FullyQualifiedErrorId : CreateApplicationInstanceErrorId,Microsoft.ServiceFabric.Powershell.NewApplication
Finished executing script 'Publish-NewServiceFabricApplication'.
Time elapsed: 00:02:06.7432516
I just updated to latest and now I get this every time I hit F5 in Visual Studio 2017. -- I have to restart the local cluster every time I want to run my project, it's super annoying.
@BrainSlugs83 , when this occurs, are you able to bring up the Service Fabric Explorer through the tray icon "Manage Local Cluster" and check the health state of the cluster?
Does this also reproduce for you if you use a different Application Debug Mode? I am assuming this property is currently set to the default "Refresh Application"? Does it repro if you are using "Remove Application"?
My local cluster becomes so unstable that SFE can't manage it. It shows an error suggesting a restart, but that doesn't help, and neither does restarting the service. Only rebooting works, which is very annoying.
Don't know whether this is related. FabricDCA.exe and FabricFAS.exe silently crash every minute and generate 170 MB of crash dumps at C:\SfDevCluster\Log\CrashDumps.
@abatishchev can you make those crash dumps available somewhere? @rishirsinha might want to take a look at these FabricDCA and FabricFAS crashes.
@anmolah can you add the right set of people to this thread?
Please grab them from \\alexbat-id1\CrashDumps
I'm troubleshooting an unrelated issue, turned on Fusion logging, and saw the following error:
=== Pre-bind state information ===
LOG: DisplayName = Microsoft.ServiceFabric.Data.Interfaces, Version=5.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35, processorArchitecture=AMD64
(Fully-specified)
LOG: Appbase = file:///C:/SfDevCluster/Data/_App/_Node_0/__FabricSystem_App4294967295/FAS.Code.Current/
LOG: Initial PrivatePath = NULL
LOG: Dynamic Base = NULL
LOG: Cache Base = NULL
LOG: AppName = FabricFAS.exe
Calling assembly : (Unknown).
===
LOG: This bind starts in default load context.
LOG: No application configuration file found.
LOG: Using host configuration file:
LOG: Using machine configuration file from C:\Windows\Microsoft.NET\Framework64\v4.0.30319\config\machine.config.
LOG: GAC Lookup was unsuccessful.
LOG: Attempting download of new URL file:///C:/SfDevCluster/Data/_App/_Node_0/__FabricSystem_App4294967295/FAS.Code.Current/Microsoft.ServiceFabric.Data.Interfaces.DLL.
LOG: Attempting download of new URL file:///C:/SfDevCluster/Data/_App/_Node_0/__FabricSystem_App4294967295/FAS.Code.Current/Microsoft.ServiceFabric.Data.Interfaces/Microsoft.ServiceFabric.Data.Interfaces.DLL.
LOG: Attempting download of new URL file:///C:/SfDevCluster/Data/_App/_Node_0/__FabricSystem_App4294967295/FAS.Code.Current/Microsoft.ServiceFabric.Data.Interfaces.EXE.
LOG: Attempting download of new URL file:///C:/SfDevCluster/Data/_App/_Node_0/__FabricSystem_App4294967295/FAS.Code.Current/Microsoft.ServiceFabric.Data.Interfaces/Microsoft.ServiceFabric.Data.Interfaces.EXE.
LOG: All probing URLs attempted and failed.
And indeed Microsoft.ServiceFabric.Data.Interfaces.dll is not present in C:\SfDevCluster\Data\_App\_Node_0\__FabricSystem_App4294967295\FAS.Code.Current\.
And another error:
=== Pre-bind state information ===
LOG: DisplayName = System.Fabric.Strings, Version=6.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35
(Fully-specified)
LOG: Appbase = file:///C:/SfDevCluster/Data/_Node_0/Fabric/DCA.Code/
LOG: Initial PrivatePath = NULL
LOG: Dynamic Base = NULL
LOG: Cache Base = NULL
LOG: AppName = FabricDCA.exe
Calling assembly : (Unknown).
===
LOG: This bind starts in default load context.
LOG: Using application configuration file: C:\SfDevCluster\Data\_Node_0\Fabric\DCA.Code\FabricDCA.exe.Config
LOG: Using host configuration file:
LOG: Using machine configuration file from C:\Windows\Microsoft.NET\Framework64\v4.0.30319\config\machine.config.
LOG: GAC Lookup was unsuccessful.
LOG: Attempting download of new URL file:///C:/SfDevCluster/Data/_Node_0/Fabric/DCA.Code/System.Fabric.Strings.DLL.
LOG: Attempting download of new URL file:///C:/SfDevCluster/Data/_Node_0/Fabric/DCA.Code/System.Fabric.Strings/System.Fabric.Strings.DLL.
LOG: Attempting download of new URL file:///C:/SfDevCluster/Data/_Node_0/Fabric/DCA.Code/System.Fabric.Strings.EXE.
LOG: Attempting download of new URL file:///C:/SfDevCluster/Data/_Node_0/Fabric/DCA.Code/System.Fabric.Strings/System.Fabric.Strings.EXE.
LOG: All probing URLs attempted and failed.
And again System.Fabric.Strings.dll is not present in C:\SfDevCluster\Data\_Node_0\Fabric\DCA.Code\.
Does the code rely on these assemblies being present in the GAC? For me they weren't there. I registered them and will see whether that fixes those crashes.
Update: crashes are gone now.
Update 2: the said error seems to be gone too.
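When the Fusion log is long, a short script can list which assemblies failed to bind and for which process. The following is a rough sketch that relies only on the line prefixes visible in the two excerpts above ("=== Pre-bind state information ===", "LOG: DisplayName = ", "LOG: AppName = ", "LOG: Attempting download of new URL ", "LOG: All probing URLs attempted and failed."). The names `BindAttempt` and `parse_fusion_log` are made up for this example.

```python
from dataclasses import dataclass, field
from typing import List

DISPLAY = "LOG: DisplayName = "
APPNAME = "LOG: AppName = "
PROBE = "LOG: Attempting download of new URL "
FAILED = "LOG: All probing URLs attempted and failed."


@dataclass
class BindAttempt:
    display_name: str = ""
    app_name: str = ""
    probed_urls: List[str] = field(default_factory=list)
    failed: bool = False

    @property
    def assembly(self) -> str:
        # "System.Fabric.Strings, Version=6.0.0.0, ..." -> "System.Fabric.Strings"
        return self.display_name.split(",")[0].strip()


def parse_fusion_log(text: str) -> List[BindAttempt]:
    """Collect one BindAttempt per 'Pre-bind state information' block."""
    attempts: List[BindAttempt] = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("=== Pre-bind state information"):
            current = BindAttempt()
            attempts.append(current)
        elif current is None:
            continue  # ignore anything before the first block
        elif line.startswith(DISPLAY):
            current.display_name = line[len(DISPLAY):]
        elif line.startswith(APPNAME):
            current.app_name = line[len(APPNAME):]
        elif line.startswith(PROBE):
            # Each probe line ends with a sentence period; drop it.
            current.probed_urls.append(line[len(PROBE):].rstrip("."))
        elif line == FAILED:
            current.failed = True
    return attempts
```

Filtering the result on `failed` and printing `app_name` with `assembly` gives a compact list of missing assemblies per process, which is easier to compare against the GAC than the raw log.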
+1 still having this problem. SDK 3.1.269, VS 2017 Enterprise.
I too am facing the same issue. Figuring out what the best option is at this point and am seriously considering using a VM on Azure, non-domain joined or anything alike, and force down the installation of an older SDK (if possible).
Rolling back to SDK 3.0.456 fixed my problem. SDK 3.1.269 installs Service Fabric runtime 6.2.269; could that be the cause?
Hey guys, any update on this one? Literally none of the developers in my team can make 6.2 work locally. Downgrading fixes the issue.
Can you try uninstalling the previous version of the runtime, then rebooting the machine and reinstalling the latest version?
@rishirsinha thanks for a quick response. I've tried uninstalling runtime and sdk and manually cleaning up any leftovers, including SfDevCluster folder and the issue still persists. I can see the following errors in the event log:
CertCreateSelfSignCertificate failed: E_ACCESSDENIED
ipcServer->SecuritySettings.CreateSelfGeneratedCertSslServer error=S_OK
Fabric Node open failed with error code = E_ACCESSDENIED
happening on every retry of cluster creation.
@aloneguid This seems like a different issue.
@RajeetN Rajeet, any ideas what this might be?
lease_traces_6.2.274.9494_131720609089720816_0.zip
I've tried to run from the terminal to get more details, attaching output and traces here as well:
\Program Files\Microsoft SDKs\Service Fabric\ClusterSetup> .\DevClusterSetup.ps1 -CreateOneNodeCluster
WARNING: A local Service Fabric Cluster already exists on this machine and will be removed.
Do you want to continue [Y/N]?: y
Removing cluster configuration...
Cleaning existing certificates...
Certificates removed.
Stopping all logman sessions...
Cleaning log and data folder...
Using Cluster Data Root: C:\SfDevCluster\Data
Using Cluster Log Root: C:\SfDevCluster\Log
The generated json path is C:\Users\ivang\AppData\Local\Temp\tmp978A.tmp.json
Processing and validating cluster config.
Create node configuration succeeded
Starting service FabricHostSvc. This may take a few minutes...
Waiting for Service Fabric Cluster to be ready. This may take a few minutes...
Local Cluster ready status: 4% completed.
Local Cluster ready status: 8% completed.
Local Cluster ready status: 12% completed.
Local Cluster ready status: 17% completed.
Local Cluster ready status: 21% completed.
Local Cluster ready status: 25% completed.
Local Cluster ready status: 29% completed.
Local Cluster ready status: 33% completed.
Local Cluster ready status: 38% completed.
Local Cluster ready status: 42% completed.
Local Cluster ready status: 46% completed.
Local Cluster ready status: 50% completed.
Local Cluster ready status: 54% completed.
Local Cluster ready status: 58% completed.
Local Cluster ready status: 62% completed.
Local Cluster ready status: 67% completed.
Local Cluster ready status: 71% completed.
Local Cluster ready status: 75% completed.
Local Cluster ready status: 79% completed.
Local Cluster ready status: 83% completed.
Local Cluster ready status: 88% completed.
Local Cluster ready status: 92% completed.
Local Cluster ready status: 96% completed.
Local Cluster ready status: 100% completed.
WARNING: Service Fabric Cluster is taking longer than expected to connect.
Waiting for Naming Service to be ready. This may take a few minutes...
No cluster endpoint is reachable, please check if there is connectivity/firewall/DNS issue.
Connect-ServiceFabricCluster : No cluster endpoint is reachable, please check if there is connectivity/firewall/DNS issue.
At C:\Program Files\Microsoft SDKs\Service Fabric\Tools\Scripts\ClusterSetupUtilities.psm1:620 char:12
+ [void](Connect-ServiceFabricCluster @connParams)
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : InvalidOperation: (:) [Connect-ServiceFabricCluster], FabricException
+ FullyQualifiedErrorId : TestClusterConnectionErrorId,Microsoft.ServiceFabric.Powershell.ConnectCluster
I tried the same as @aloneguid and had some success. The only difference, I think, was that I had stopped the "Internet Connection Sharing (ICS)" Windows service before running the command to create the local cluster.
C:\Program Files\Microsoft SDKs\Service Fabric\ClusterSetup> .\DevClusterSetup.ps1 -CreateOneNodeCluster
WARNING: A local Service Fabric Cluster already exists on this machine and will be removed.
Do you want to continue [Y/N]?: y
Removing cluster configuration...
Cleaning existing certificates...
Certificates removed.
Stopping all logman sessions...
Cleaning log and data folder...
Using Cluster Data Root: C:\SfDevCluster\Data
Using Cluster Log Root: C:\SfDevCluster\Log
The generated json path is C:\Users\zunem\AppData\Local\Temp\tmp96A5.tmp.json
Processing and validating cluster config.
Create node configuration succeeded
Starting service FabricHostSvc. This may take a few minutes...
Waiting for Service Fabric Cluster to be ready. This may take a few minutes...
Local Cluster ready status: 4% completed.
Local Cluster ready status: 100% completed.
Waiting for Naming Service to be ready. This may take a few minutes...
Naming Service is ready now...
Local Service Fabric Cluster created successfully.
=================================================
## To connect using Powershell, open an a new powershell window and connect using 'Connect-ServiceFabricCluster' command (without any arguments)."
## To connect using Service Fabric Explorer, run ServiceFabricExplorer and connect using 'Local/OneBox Cluster'."
## To manage using Service Fabric Local Cluster Manager (system tray app), run ServiceFabricLocalClusterManager.exe"
=================================================
C:\Program Files\Microsoft SDKs\Service Fabric\ClusterSetup>
By the way, ICS has now automatically restarted. So, in effect, both ICS and the SF cluster are running, and things look stable so far.
@manimaranm7 I've tried stopping ICS, didn't make a difference for me :(
I ran some commands to figure out what was using that port on my machine (can't remember now, but I just googled it), and it turned out it was the Internet Connection Sharing service (which is on by default in the latest Windows updates -- even if you turn it off, and fully disable it, sometimes it turns itself back on...) -- anyway, anytime I have this issue, I just pop open services.msc and disable it again, and the problem goes away for a while (until ICS turns itself back on).
Considering that this app is having port conflicts with a well known, widely-deployed service, can you guys just update the port your app is using for local development?
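A quick way to tell whether some other process is already holding a port is to try binding to it yourself. The sketch below does that with the standard library. The thread does not say which port was in conflict, so the port is left as a parameter; the function name `port_is_free` is made up for illustration. It cannot tell you which process owns the port, only that the bind failed.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1",
                 sock_type: int = socket.SOCK_STREAM) -> bool:
    """Return True if this process can bind a socket to host:port.

    Pass sock_type=socket.SOCK_DGRAM to test a UDP port instead of TCP.
    A False result usually means another process holds the port, but it
    can also mean the bind was denied for permission reasons.
    """
    with socket.socket(socket.AF_INET, sock_type) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
        return True
```

Run it with the cluster stopped: if the port the cluster needs is reported as not free, something else (such as the service described above) has it.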
@dbreshears -- sorry, I don't remember if I was able to do that or not -- I just know that unless I have the Internet Connection Sharing service disabled, I can't deploy new code. [There is a 100% (inverse) correlation between that service running and being able to deploy code to the local Service Fabric instance for me -- over the last 2 months, and on multiple machines, ICS has been the cause every single time.]
And no, I'm not using the Refresh Application setting*; instead, I'm using Remove Application.
(*The Refresh Application setting has always been really flaky for me, and I've got my own IActorStateManager implementation that's backed by Azure Cosmos Document DB, so it's no big deal if all the data in the cluster gets wiped every time I deploy or debug it; all I lose is reminders, and my actors just recreate those when the service comes up.)
@BrainSlugs83, I thought the ICS issue was resolved in the 6.2 release. Let us know if you are on that version and still seeing the issue.
One reason this error happens inside Visual Studio is the following.
These steps lead to the error
Unable to determine whether the application is installed on the cluster or not
To work around it, select "Local.1Node.xml" in the first step.
It would be great if the team could change this behavior. This error message can mean many things and in this specific case didn't help me at all.
@ddobric I tried publishing with Local.1Node.xml selected but I still can't get the app to run locally.
This is still happening, any updates on this?
@healthycola - What symptoms are you seeing? There are a few different scenarios in this thread.
Does anyone still face the 'Unable to determine whether the application is installed on the cluster or not' error with the latest runtime/SDKs? If yes, can you please provide repro steps?
I do still face the issue with the latest sdk. Resetting, restarting, nothing helps.
@avichalchum Can you please describe how you get into this state? What is the exact state of the cluster at that time? Is the FabricHost service in the running state? To fix the issue I need to repro it myself to make out what is wrong.
@anantshankar17 I don't know how to enter this state, as it is always in this state. As in, I restarted my computer, I force closed the Service Fabric Local Cluster Manager and opened it again, I reset the cluster, I even uninstalled the SDK and reinstalled it. However, when I try to deploy the app in Visual Studio, deployment always fails with the error "Unable to determine whether the application is installed on the cluster or not". The FabricHost service is in the running state at that time, the state of the cluster is healthy, and the cluster manager opens fine, showing everything is good. Let me know what other information I can give.
@avichalchum Visual Studio runs the "Connect-ServiceFabricCluster" and "Get-ServiceFabricApplication -ApplicationName {appname}" commands to check whether the application is deployed on the local cluster or not. It seems these commands are failing. When you hit the error in VS, please open a PowerShell window, run these commands, and see whether they work.
I have exactly the SAME issue as @avichalchum. There is no set of "steps to reproduce" this. It happens even when I start from a fresh reboot.
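If you want to repeat that check from a script instead of typing it each time, the two cmdlets can be wrapped in a single PowerShell invocation. This is only a sketch: the application name `fabric:/MyApplication` is a placeholder, the helper names are made up, and it assumes `powershell` is on the PATH with the Service Fabric module available. It does not claim to reproduce exactly what Visual Studio does internally.

```python
import subprocess
from typing import List


def build_check_command(application_name: str) -> List[str]:
    """Build the argv for a PowerShell call that connects, then queries the app.

    application_name is inserted inside single quotes, so it must not
    contain a single quote itself.
    """
    script = (
        "[void](Connect-ServiceFabricCluster); "
        f"Get-ServiceFabricApplication -ApplicationName '{application_name}'"
    )
    return ["powershell", "-NoProfile", "-NonInteractive", "-Command", script]


def run_check(application_name: str,
              timeout: float = 180.0) -> subprocess.CompletedProcess:
    """Run the check and return the result so stdout/stderr can be inspected."""
    return subprocess.run(
        build_check_command(application_name),
        capture_output=True,
        text=True,
        timeout=timeout,  # raises subprocess.TimeoutExpired if exceeded
    )
```

For example, `result = run_check("fabric:/MyApplication")` followed by printing `result.stderr` shows whatever error text the cmdlets produced when the check fails.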
I frequently have this issue. I believe it relates to networking.
On a fresh reboot, I can often get a local debug to work. After connecting to my org's VPN, the cluster's state cannot be determined. Disconnecting from the VPN, resetting the cluster, and trying again does not work.
Our team has played around with creating additional lists of fallback DNS servers in internet settings. For some of us and in some locations this works (e.g., home vs. office ISP); for others it doesn't. Our documentation's steps:
Step 1: Stop the cluster in Service Fabric via the icon in the system tray.
Step 2: Open your network adapter settings for the current connection, right click, and choose Properties.
Step 3: Double click IPv4 and click the Advanced button.
Step 4: Click the DNS tab and add, in this order, the DNS servers of your internet provider, the VPN DNS (172.0.0.1), and your local IP address to the list of DNS servers, making sure you put your local IP address last in the list (this is the important part).
Step 5: Close all the open dialog boxes.
Step 6: Right click on the SF icon in the system tray and start the cluster.
Once the cluster has started, you should see that your IP address is still listed third in the DNS server list. You should be able to debug normally now in Visual Studio.
@Ryanman With regards to the DNS settings above, are you setting the dns settings for the local network connection or the VPN adapter connection?
@Larlew, this was for the local network connection, not the VPN, I believe.
My suggestion comes with large caveats but if you're at your wit's end it's worth a shot. The VPN implementation for the environment I'm working with SF is by far the worst I've ever seen, with extremely unstable split tunneling etc. It's not a recipe for success with SF.
I faced the same issue. The actual problem was in the publish profile. The ConnectionEndpoint parameter was empty:
<ClusterConnectionParameters ConnectionEndpoint="" />
After removing the ConnectionEndpoint parameter, it works well. I think it isn't the only cause of this problem, but my solution may be useful for someone.
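To catch this in several publish profiles at once, the check can be scripted. The sketch below parses a publish profile as XML and reports whether any ClusterConnectionParameters element carries an empty ConnectionEndpoint attribute. It matches on the local element name so it works regardless of the XML namespace the profile declares; the function name `has_empty_connection_endpoint` is made up for this example.

```python
import xml.etree.ElementTree as ET


def _local_name(tag: str) -> str:
    # ElementTree writes namespaced tags as "{namespace}Name"; keep only "Name".
    return tag.rsplit("}", 1)[-1]


def has_empty_connection_endpoint(profile_xml: str) -> bool:
    """Return True if a ClusterConnectionParameters element has ConnectionEndpoint=""."""
    root = ET.fromstring(profile_xml)
    for element in root.iter():
        if _local_name(element.tag) != "ClusterConnectionParameters":
            continue
        endpoint = element.get("ConnectionEndpoint")
        # A missing attribute is fine; only an attribute that is present
        # but blank matches the situation described above.
        if endpoint is not None and endpoint.strip() == "":
            return True
    return False
```

Reading each profile file into a string and passing it to `has_empty_connection_endpoint` flags the profiles worth editing by hand.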
SDK Version: Microsoft Azure Service Fabric SDK - 3.0.472
It did not happen at first, but now it occurs more than once a day and I reset the cluster every time. I am mainly developing in the Visual Studio 2017 IDE, and I am not sure whether this is a bug, but it is inconvenient for development.