microsoft / service-fabric

Service Fabric is a distributed systems platform for packaging, deploying, and managing stateless and stateful distributed applications and containers at large scale.
https://docs.microsoft.com/en-us/azure/service-fabric/
MIT License

Unable to resolve service DNS name #705

Open MarkDixonTech opened 6 years ago

MarkDixonTech commented 6 years ago

I'm having trouble communicating between two services in my local cluster using a DNS name, as described in this article:

https://docs.microsoft.com/en-us/azure/service-fabric/service-fabric-dnsservice

The DNS Service is running in Service Fabric Explorer and I can perform an nslookup against it as follows:

nslookup application.test 127.0.0.1

Server: localhost
Address: 127.0.0.1

Name: application.test

However, when I try to ping application.test, I get the following:

Ping request could not find host application.test. Please check the name and try again.

When I run nslookup against an invalid domain, I get the expected Non-existent domain error:

*** localhost can't find application.test1: Non-existent domain
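The two commands above take different paths: `nslookup application.test 127.0.0.1` sends its query straight to the server named on the command line, whereas ping asks the operating system's resolver. A minimal sketch of the second path, useful for checking what an ordinary application would see, might look like this. The function name and the injectable `lookup` parameter are illustrative assumptions, not part of any Service Fabric tooling.

```python
import socket


def resolve_with_system(name, lookup=socket.getaddrinfo):
    """Resolve `name` through the OS resolver, as most applications do.

    Returns a sorted list of unique IP address strings, or an empty
    list when the resolver cannot find the name.
    """
    try:
        results = lookup(name, None)
    except socket.gaierror:
        # The OS resolver reported a lookup failure (e.g. host not found).
        return []
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] holds the IP address as a string.
    return sorted({result[4][0] for result in results})
```

Calling `resolve_with_system("application.test")` on the affected machine would return an empty list if the system resolver is not forwarding to the Service Fabric DNS service, even though a direct nslookup against that server responds.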

I have confirmed that my service is exposing an endpoint as follows:

Resolve-ServiceFabricService -PartitionKindSingleton -ServiceName fabric:/MyApplication/MyService

ServiceName : fabric:/MyApplication/MyService
Endpoints : Address: {"Endpoints":{"EndpointName":"http:\/\/machinename:8187\/"}}
Role: Stateless
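The Address value in that output is a JSON document with escaped slashes. As a rough sketch, it can be decoded to recover the plain endpoint URL; the helper name `parse_endpoints` is hypothetical and the input string is copied from the output above.

```python
import json


def parse_endpoints(address):
    """Decode the JSON Address field and return {endpoint_name: url}."""
    return dict(json.loads(address)["Endpoints"])


# Raw string so the backslashes reach the JSON parser unchanged.
address = r'{"Endpoints":{"EndpointName":"http:\/\/machinename:8187\/"}}'
print(parse_endpoints(address))  # {'EndpointName': 'http://machinename:8187/'}
```

The decoded URL uses the machine name rather than localhost, which is consistent with the cluster having been set up with -UseMachineName.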

I have run the .\DevClusterSetup.ps1 -UseMachineName command and can confirm this is set correctly in the clustermanifest.xml.

I have completely run out of ideas, and without this working I have no way of communicating between services. It seems like the service is in DNS but is not resolving to an IP.

vipul-modi commented 6 years ago

If you have used -UseMachineName then the DNS server should not be running on localhost, but on your machine's IP. I've added a few folks to follow up.
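One way to follow up on this is to query a specific DNS server address directly and look at the response header. The sketch below builds a plain A-record query by hand and reports the response code and the number of answer records, which separates "Non-existent domain" (rcode 3) from a successful reply that carries no addresses. The function names are illustrative, and the server IP to pass in is an assumption the reader has to supply (for example, the host's current IP).

```python
import socket
import struct


def build_a_query(name, txid=0x1234):
    """Build a DNS query packet asking for the A record of `name`."""
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # Question name: length-prefixed labels terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE 1 = A, QCLASS 1 = IN.
    return header + qname + struct.pack("!HH", 1, 1)


def summarize_response(packet):
    """Return (rcode, answer_count) read from the 12-byte DNS header."""
    _txid, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", packet[:12])
    return flags & 0x000F, ancount


def query_dns(server_ip, name, port=53, timeout=3.0):
    """Send one A query over UDP to `server_ip` and summarize the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_a_query(name), (server_ip, port))
        packet, _ = sock.recvfrom(512)
    return summarize_response(packet)
```

For example, `query_dns("127.0.0.1", "application.test")` and the same call with the machine's IP could be compared. A result of `(0, 0)` would mean the server knows the name but returned no address, while `(3, 0)` would mean the name does not exist on that server.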

mikkelhegn commented 6 years ago

A few issues already discussing DNS limitations: https://github.com/MicrosoftDocs/azure-docs/issues/14130 https://github.com/Azure-Samples/service-fabric-mesh/issues/19

Please let me know if any of those are helpful.

MarkDixonTech commented 6 years ago

I have actually now managed to get the reverse proxy working. I was only trying to use the DNS service because I was unable to use the reverse proxy within a Docker container; see the Stack Overflow post below.

https://stackoverflow.com/questions/52007147/access-reverse-proxy-from-docker-container-on-service-fabric-on-dev-machine

Out of frustration, I gave the reverse proxy another try and it worked. What I have discovered is that if I change my local network settings (e.g. swap to a different Wi-Fi connection), I am no longer able to connect to the fabric host from inside the container (at least on ports 1433 and 19081) until I restart my machine (restarting the local Service Fabric cluster would probably do the trick).

I'm still none the wiser as to why the DNS service isn't working, but I haven't got time to investigate further since my preferred method is now working.

mikkelhegn commented 6 years ago

There are situations where we cannot detect the NIC change, and your containers end up running with a gateway or DNS server IP that is the old IP of the host machine.
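A quick way to tell whether a container has ended up in this state is to probe the host ports mentioned earlier in the thread (1433 and 19081) from inside the container. This is only a sketch: the helper name `can_connect` is hypothetical, and the host address to test is whatever name or IP the container is configured to use.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Running `can_connect("machinename", 19081)` before and after a network change would show whether the container can still reach the fabric host; `machinename` here stands in for the real host name.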

MarkDixonTech commented 6 years ago

That makes sense, thanks. Whilst you're there, any idea how to update the Virtual Machine Scale Set image to 1709 in Service Fabric? When I try to update the JSON, it gives me an error saying that changing the offer isn't allowed. If I don't change the offer type, then it can't find the 1709 SKU.

danielwgrech commented 5 years ago

Any update on this? We have a similar issue where service A (a Docker container) cannot communicate with service B (also a Docker container) in Service Fabric. The address being attempted is in the following format: fabric:/{machineName}:19081/{SfApplicationName}/{ServiceName}/{path}

The error we get is: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.

It works perfectly if I manually paste the address in a Windows 10 browser (Chrome) on my machine, but it seems like the address cannot be reached from inside the container of service A hosted on Service Fabric. (please note this only happens on our devs' machines; it's working fine on Azure). I restarted my machine to no avail.

Our Windows image of both Docker containers is also version 1809.
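For reference, the address format above can be assembled as follows. The Service Fabric reverse proxy is addressed over HTTP(S), and the address is reported to work in a browser, so this sketch assumes `http` as the scheme even though the format is written with a `fabric:/` prefix. The function name and parameter names are hypothetical, and 19081 is the port quoted in the thread.

```python
from urllib.parse import quote


def reverse_proxy_url(host, app, service, path="", port=19081, scheme="http"):
    """Build {scheme}://{host}:{port}/{app}/{service}/{path}."""
    # Drop empty segments so leading or trailing slashes in `path` are harmless.
    segments = [app, service] + [part for part in path.split("/") if part]
    return f"{scheme}://{host}:{port}/" + "/".join(quote(seg) for seg in segments)


print(reverse_proxy_url("machinename", "MyApplication", "MyService", "api/values"))
# http://machinename:19081/MyApplication/MyService/api/values
```

Inside a container, the `host` argument is the part that matters: it has to be a name or IP that the container can actually reach, which may differ from what works in a browser on the host.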

dario-ms commented 5 years ago

+Kayla

Thanks,

Dario


From: danielwgrech notifications@github.com
Sent: Sunday, May 12, 2019 11:28 AM
To: Azure/service-fabric-issues
Cc: Dario Bazan Bejarano; Assign
Subject: Re: [Azure/service-fabric-issues] Unable to resolve service DNS name (#1262)


MarkDixonTech commented 5 years ago

I have found that Docker behaves differently on my machine running Windows 10 natively than on my Windows 10 instance running under Boot Camp. On the Mac, I cannot resolve the host from inside the container. It also depends on which Wi-Fi connection I join and whether this happens before or after Docker starts.

aldesou commented 3 years ago

@MarkDixonTech Are you still interested in getting SF DNS working?

@danielwgrech Please open a new issue if you have not already and the issue still occurs.