Open KyleTheAutomator opened 6 years ago
Can you share the generated json template? C:\Users\kthompson\AppData\Local\Temp\tmp3B1A.tmp.json
{
"name": "DevCluster",
"clusterConfigurationVersion": "1.0.0",
"apiVersion": "10-2017",
"nodes": [
{
"nodeName": "_Node_0",
"iPAddress": "ComputerFullName",
"nodeTypeRef": "NodeType0",
"faultDomain": "fd:/0",
"upgradeDomain": "0"
}
],
"properties": {
"diagnosticsStore": {
"metadata": "Please replace the diagnostics file share with an actual file share accessible from all cluster machines.",
"dataDeletionAgeInDays": "3",
"storeType": "FileShare",
"connectionstring": "%systemdrive%\\ProgramData\\SF\\DiagnosticsStore"
},
"nodeTypes": [
{
"name": "NodeType0",
"clientConnectionEndpointPort": "19000",
"clusterConnectionEndpointPort": "19002",
"leaseDriverEndpointPort": "19001",
"serviceConnectionEndpointPort": "19006",
"httpGatewayEndpointPort": "19080",
"reverseProxyEndpointPort": "19081",
"applicationPorts": {
"startPort": "30001",
"endPort": "31000"
},
"isPrimary": true
}
],
"fabricSettings": [
{
"name": "Setup",
"parameters": [
{
"name": "FabricDataRoot",
"value": "C:\\SfDevCluster\\Data"
},
{
"name": "FabricLogRoot",
"value": "C:\\SfDevCluster\\Log"
},
{
"value": "true",
"name": "IsDevCluster"
}
]
},
{
"name": "Diagnostics",
"parameters": [
{
"name": "ProducerInstances",
"value": "ServiceFabricEtlFile,ServiceFabricPerfCtrFolder"
},
{
"name": "MaxDiskQuotaInMB",
"value": "10240"
},
{
"name": "EnableCircularTraceSession",
"value": "true"
}
]
},
{
"name": "FabricClient",
"parameters": [
{
"name": "HealthReportSendInterval",
"value": "0"
}
]
},
{
"name": "Failover",
"parameters": [
{
"name": "SendToFMTimeout",
"value": "1"
},
{
"name": "NodeUpRetryInterval",
"value": "1"
}
]
},
{
"name": "Federation",
"parameters": [
{
"name": "NodeIdGeneratorVersion",
"value": "V4"
},
{
"name": "UnresponsiveDuration",
"value": "0"
},
{
"name": "ProcessAssertExitTimeout",
"value": "86400"
}
]
},
{
"name": "Hosting",
"parameters": [
{
"name": "EndpointProviderEnabled",
"value": "true"
},
{
"name": "RunAsPolicyEnabled",
"value": "true"
},
{
"name": "EnableProcessDebugging",
"value": "true"
},
{
"name": "DeactivationScanInterval",
"value": "600"
},
{
"name": "DeactivationGraceInterval",
"value": "2"
},
{
"name": "ServiceTypeRegistrationTimeout",
"value": "20"
},
{
"name": "CacheCleanupScanInterval",
"value": "300"
},
{
"name": "DeploymentRetryBackoffInterval",
"value": "1"
}
]
},
{
"name": "Management",
"parameters": [
{
"name": "ImageStoreConnectionString",
"value": "ImageStoreConnectionStringPlaceHolder"
},
{
"name": "ImageCachingEnabled",
"value": "false"
},
{
"name": "EnableDeploymentAtDataRoot",
"value": "true"
},
{
"name": "DisableChecksumValidation",
"value": "true"
}
]
},
{
"name": "PlacementAndLoadBalancing",
"parameters": [
{
"name": "MinLoadBalancingInterval",
"value": "300"
},
{
"name": "TraceCRMReasons",
"value": "false"
}
]
},
{
"name": "ReconfigurationAgent",
"parameters": [
{
"name": "IsDeactivationInfoEnabled",
"value": "true"
},
{
"name": "ServiceApiHealthDuration",
"value": "20"
},
{
"name": "ServiceReconfigurationApiHealthDuration",
"value": "20"
},
{
"name": "LocalHealthReportingTimerInterval",
"value": "5"
},
{
"name": "RAUpgradeProgressCheckInterval",
"value": "3"
},
{
"name": "RAPMessageRetryInterval",
"value": "0.5"
},
{
"name": "MinimumIntervalBetweenRAPMessageRetry",
"value": "0.5"
}
]
},
{
"name": "ServiceFabricEtlFile",
"parameters": [
{
"name": "DataDeletionAgeInDays",
"value": "3"
},
{
"name": "IsEnabled",
"value": "true"
},
{
"name": "ProducerType",
"value": "EtlFileProducer"
},
{
"name": "EtlReadIntervalInMinutes",
"value": "5"
}
]
},
{
"name": "ServiceFabricPerfCtrFolder",
"parameters": [
{
"name": "DataDeletionAgeInDays",
"value": "3"
},
{
"name": "IsEnabled",
"value": "true"
},
{
"name": "ProducerType",
"value": "FolderProducer"
},
{
"name": "FolderType",
"value": "ServiceFabricPerformanceCounters"
}
]
},
{
"name": "Trace/Etw",
"parameters": [
{
"name": "Level",
"value": "4"
}
]
},
{
"name": "TransactionalReplicator",
"parameters": [
{
"name": "CheckpointThresholdInMB",
"value": "64"
}
]
}
],
"addOnFeatures": [
"DnsService"
]
}
}
@maburlik - I don't see anything obvious from the manifest.
I wonder if you are seeing the same issue as reported in microsoft/service-fabric-issues#1056. Would you mind checking whether the Fabric.exe process is running or not?
Spot on. I see the following in my logs:
Fabric Node open failed with error code = E_ACCESSDENIED
Also seeing:
HostedService: _Node_0 on node id bf865279ba277deb864a976fbf4c200e terminated unexpectedly with code 7167 and process name Fabric.exe
HostedServiceInstance:HostedService/_Node_0_Fabric terminated with exitcode 7167
client-localhost:19000/127.0.0.1:19000: error = 2147943625, failureCount=93. Filter by (type~Transport.St && ~"(?i)localhost:19000") to get listener lifecycle. Connect failure is expected if listener was never started, or listener/its process was stopped before/during connecting.
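A side note for readers of the log above: the value in "error = 2147943625" looks like an unsigned HRESULT. The sketch below is only an illustration (the helper name decode_hresult is made up here, not part of any Service Fabric tooling). It splits the number into the standard HRESULT bit fields. Under that reading the value is 0x800704C9, i.e. the Win32 facility with code 1225, which Windows documents as ERROR_CONNECTION_REFUSED. That would be consistent with the log's own remark that connect failures are expected when the listener was never started.

```python
def decode_hresult(value: int) -> dict:
    """Split an unsigned 32-bit HRESULT into its standard bit fields."""
    value &= 0xFFFFFFFF
    return {
        "hex": f"0x{value:08X}",
        "failure": bool(value >> 31),       # top bit set means failure
        "facility": (value >> 16) & 0x7FF,  # 11-bit facility; 7 is FACILITY_WIN32
        "code": value & 0xFFFF,             # low 16 bits, the Win32 error code here
    }


# The number quoted in the transport log line above.
info = decode_hresult(2147943625)
print(info)
```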
One of our primary use cases in evaluating Service Fabric is to use it for containers. Is there documentation on how to configure a dev cluster for containers using self signed tls certs?
Thanks @knizkar - let's track this on microsoft/service-fabric-issues#1056.
@MisterPuffyPants - Regarding setting up a dev cluster with containers, a doc will be posted in the next few days, as this is only officially supported in 6.2. The main thing is to make sure that the Docker service is started when creating the cluster; that enables the container support in Service Fabric.
Exactly same issue here. Any updates?
Had the same issue, the only thing that helped - going back to 6.2.283/3.1.283
Any updates? Still see it in the newest version
@EvilAvenger: Catching up on this issue, have you gone through the solutions proposed in this issue? https://github.com/Azure/service-fabric-issues/issues/1056
@MikkelHegn
Yes, I did, and it does not work. Currently the issue is showing up on our deployment machine, so I can't properly test it (as it blocks my team).
The only thing that really helps is installing 6.2.283.9494. (Installing a prior version and then copying the files from 6.2.283 to "C:\Program Files\Microsoft SDKs\Service Fabric" helps as well.)
None of the other versions work, so the issue may have been introduced somewhere in *.301.
What I've tried:
Event log issues: I can't provide the full event log at the moment because I've reinstalled the service, but I saw several records in the event log:
1) FileChangeMonitor failed with E_ACCESSDENIED
2) FolderACLManager::Install failed with error E_INVALIDARG
3) GetFileAttributesEx failed with the following error 5
Thanks for your patience on this one @EvilAvenger. @maburlik for the diagnostics info above, do you have any ideas what might be causing this?
Also blocked by this now @MikkelHegn . Anyone any closer to figuring out what is going on? I have tried all the workarounds and it's no use.
Folks, if the workaround mentioned in microsoft/service-fabric-issues#1056 isn't working for you, can you please share the full setup logs from the environment? Maybe you are running into something else here.
(Assuming Windows) The reg key HKLM\SOFTWARE\Microsoft\ServiceFabric\FabricLogRoot should point to the location of the logs. Zip the directory and attach the file here; you can also zip and email it to us (raunakp, or mikhegn at microsoft dot com) if you want.
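For anyone who wants to script the log collection described above, here is a minimal sketch. It assumes that FabricLogRoot is a string value under the registry key path exactly as quoted in the comment; the function names read_fabric_log_root and zip_directory are invented for illustration. The registry lookup only runs on Windows, while the zipping helper is plain standard library.

```python
import shutil
import sys
from pathlib import Path


def read_fabric_log_root() -> str:
    """Read the log location from the registry (Windows only).

    Assumption: FabricLogRoot is a value under the key path quoted in the thread.
    """
    if sys.platform != "win32":
        raise OSError("the registry lookup only works on Windows")
    import winreg  # Windows-only standard library module

    key_path = r"SOFTWARE\Microsoft\ServiceFabric"
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
        value, _value_type = winreg.QueryValueEx(key, "FabricLogRoot")
    return value


def zip_directory(log_root: str, archive_base: str) -> str:
    """Zip log_root into <archive_base>.zip and return the archive path."""
    root = Path(log_root)
    if not root.is_dir():
        raise FileNotFoundError(f"log directory not found: {root}")
    return shutil.make_archive(archive_base, "zip", root_dir=str(root))
```

On Windows, something like zip_directory(read_fabric_log_root(), r"C:\Temp\sf-logs") would produce C:\Temp\sf-logs.zip (the output path is just an example). Write the archive outside the log directory so it does not end up inside itself.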
Logs attached.
Just to give my two cents on this issue: I was also having the same problem with Windows 10 and the latest SDK. I had checked the Windows firewall, removed Webroot AV, reinstalled the SDK multiple times, reverted to older SDKs, checked the folder permissions, changed to the Network Service account, and tried the other solutions proposed in this issue: https://github.com/Azure/service-fabric-issues/issues/1056
The fix for me was quite simple: @JayRidge95 noticed the hostname was being truncated in the event logs. My computer name was longer than the 15-character NetBIOS name limit, so we changed my computer name to be shorter than 15 characters, reinstalled the SDK, and it worked fine.
Bit of an odd one but it took me about 3 days to get to that point so this might save some people time.
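A quick way to check whether a machine is affected by the name-length problem described above is to compare the host name against the 15-character NetBIOS limit. This is only a sketch: exceeds_netbios_limit is a hypothetical helper, and it assumes that only the first label of the host name matters.

```python
import socket

NETBIOS_MAX_LENGTH = 15  # NetBIOS computer names are limited to 15 characters


def exceeds_netbios_limit(hostname: str) -> bool:
    """Return True if the host name is longer than a NetBIOS name allows."""
    # Assumption: only the first label counts, so drop any DNS suffix.
    short_name = hostname.split(".")[0]
    return len(short_name) > NETBIOS_MAX_LENGTH


name = socket.gethostname()
print(name, "exceeds the limit:", exceeds_netbios_limit(name))
```

If this reports True, renaming the computer and reinstalling the SDK, as described above, may be worth trying before the other workarounds.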
@tjackadams this works like a charm. I just shortened the computer name. I had been stuck on this issue for the last 4 days.
@tjackadams thanks, it worked. Dear SF team, can you fix this issue, or at least provide a better error message so the problem and its solution can be identified quickly?
This workaround did not work for me. :( It's still not working.
@raunakpandya is there any update on this?
@andrewcoll +1 Not working for me as well
@andrewcoll - Have you tried the workaround of setting FabricContainerAppsEnabled to false? If not, can you try editing the ClusterManifestTemplate.json files under %programfiles%\Microsoft SDKs\Service Fabric\ClusterSetup (there is one file per type of one box you are bringing up)?
Add the following parameter to the Hosting section:
{
"name": "FabricContainerAppsEnabled",
"value": "false"
}
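If you would rather script this edit than make it by hand, the sketch below shows one way to do it. It assumes the template has the same properties / fabricSettings / parameters layout as the generated JSON posted earlier in this thread; set_fabric_setting is a hypothetical helper, not part of the SDK.

```python
import json


def set_fabric_setting(manifest: dict, section: str, name: str, value: str) -> dict:
    """Set (or add) one parameter in properties.fabricSettings, in place."""
    settings = manifest.setdefault("properties", {}).setdefault("fabricSettings", [])
    target = next((s for s in settings if s.get("name") == section), None)
    if target is None:
        # The section does not exist yet, so create it.
        target = {"name": section, "parameters": []}
        settings.append(target)
    params = target.setdefault("parameters", [])
    for param in params:
        if param.get("name") == name:
            param["value"] = value  # overwrite an existing entry
            break
    else:
        params.append({"name": name, "value": value})
    return manifest


# A tiny stand-in for a real ClusterManifestTemplate.json.
manifest = json.loads(
    '{"properties": {"fabricSettings": [{"name": "Hosting", "parameters": []}]}}'
)
set_fabric_setting(manifest, "Hosting", "FabricContainerAppsEnabled", "false")
print(json.dumps(manifest, indent=2))
```

For a real template you would load the file with json.load, apply the change, and write it back with json.dump. Keep a backup of the original file first.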
@raunakpandya yes, I tried that, it didn't work either. I attached my logs in a previous comment.
Yes, I did look at the logs. Strange. Which JSON file did you modify, and can you attach it? Also, which one box mode are you trying to bring up (secure/unsecure, 1 box/5 box)?
@raunakpandya's answer worked for me. Thanks!!!
@tjackadams your solution worked for me. Shorten computer name (was longer than 15 characters). Thank you!
@raunakpandya could you please explain why disabling the FabricContainerAppsEnabled setting solves this issue?
@Kassoul - This has the details: https://github.com/Azure/service-fabric-issues/issues/1056#issuecomment-400413031
By disabling that, the self signed certificate is no longer created.
I have seen the same error when trying to start up my local cluster. In my case, I noticed from the error message 'HostService: on node id terminated unexpectedly with code 3221225781 and process name Fabric.exe' that a DLL was missing for Fabric.exe. For me, the issue was that some of the VC++ DLLs had gone missing, which can be fixed by reinstalling "C:\Program Files\Microsoft Service Fabric\bin\Fabric\Fabric.Code\vcredist_x64.exe".
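To see why that exit code points at a missing DLL, it helps to print it in hexadecimal. In the sketch below (the helper name and the one-entry lookup table are illustrative only), 3221225781 comes out as 0xC0000135, which Windows documents as STATUS_DLL_NOT_FOUND.

```python
def exit_code_to_hex(exit_code: int) -> str:
    """Format a process exit code as an unsigned 32-bit hex string."""
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


# Deliberately tiny table: only the status value discussed in this comment.
KNOWN_STATUS = {"0xC0000135": "STATUS_DLL_NOT_FOUND"}

code = exit_code_to_hex(3221225781)
print(code, KNOWN_STATUS.get(code, "unknown"))
```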
This fixes the issue for me!
In my case Service Fabric was not able to bind to the address 192.168.0.108:19080, which was causing this issue.
If none of the above-mentioned solutions worked for you, try the following.
1) Run netstat -an | Select-String :19080. If you don't see anything like TCP [::]:19080 [::]:0 LISTENING, it most likely means there was an error when SF was trying to bind an IP to listen on port 19080.
2) Look for an error such as: Unable to bind to the underlying transport for 192.168.0.108:19080. 19080 is the HTTP gateway port used by Service Fabric. Note the IP listed here.
3) Run netsh http show iplisten. If you see an entry with the same IP as above (192.168.0.108 in this case), delete that entry by running netsh http delete iplisten ipaddress=192.168.0.108.
4) Run netstat -an | Select-String :19080 again, and this time you should see TCP [::]:19080 [::]:0 LISTENING.
This fixed the issue for me.
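The netstat check above can also be approximated from a script by simply trying to connect to the port. This is a rough sketch, not an official diagnostic: is_listening is a hypothetical helper, and a successful TCP connect only shows that something is accepting connections on that port, not that it is the Service Fabric HTTP gateway.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unreachable: nothing usable is listening.
        return False


# 19080 is the HTTP gateway port from the cluster manifest in this thread.
print("19080 listening:", is_listening("localhost", 19080))
```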
I've downloaded the Service Fabric SDK for VS 2017 from here: http://www.microsoft.com/web/handlers/webpi.ashx?command=getinstallerredirect&appid=MicrosoftAzure-ServiceFabric-CoreSDK
The initial install on my Windows 10 v1709 workstation (fully patched) completes successfully. The problem manifests when I try to set up a cluster:
I've been pulling my hair out over this for the last couple of days. Here are the things I've tried: