FailOver App seems mostly undocumented.

Leopere commented 1 year ago

I'd really like to see if I can untangle the failover app. I'm not able to find any documentation on how precisely to use it.

My target setup is a number of reverse proxies (3 currently, potentially more later). I would like Failover to detect if any of these proxies are offline and, if they are, take the affected IPs out of service.

ShreyasZare commented 1 year ago

Thanks for the post. There is unfortunately no documentation available for a lot of features in the project. I can however guide you here.

To use the failover feature, you will need to have a primary zone configured on the DNS server. You can then add an APP record for the Failover app with Failover.Address as the class path. In there you will find a JSON template which you will need to edit to configure your 3 reverse proxy servers as shown below:

{
  "primary": [
    "1.1.1.1",
    "2.2.2.2",
    "3.3.3.3"
  ],
  "secondary": [
  ],
  "serverDown": [
  ],
  "healthCheck": "https",
  "healthCheckUrl": "https://www.example.com/",
  "allowTxtStatus": true
}

In the above JSON config for the APP record, the primary array lists your 3 reverse proxy servers. The app will return only the healthy IP addresses. The healthCheck parameter is https, which is the name of the health check from the app's main config. The healthCheckUrl is the URL of the website, which should return a positive response. When allowTxtStatus is set to true, you can check the health status by querying the domain name for the TXT type. The response will contain TXT records that give the status of each server you have configured.

You can configure health check details in the app's main config, which can be changed from the Apps section. There you will find the Config button for the app, which shows you the JSON config. It already has a few default health checks configured, which work for most cases. You can add or change them for your use case. There are also email alerts that you can configure so that the app sends an email through your configured email server.

Test it out on a local instance on your laptop first, then use the DNS Client tab on the web panel to query for the TXT record and see the health status.

Leopere commented 1 year ago

Thanks so much for the wonderful pseudo documentation via git issue <3!

It appears to be working wonderfully.

The only other question I can imagine needing answered about the Failover app itself: if I'm using it in a 3-server Technitium arrangement with 1 primary and 2 secondary name servers, will only the primary do the health checks and pass the results down the line to NS2 and NS3? Or will each of them serve what it sees as its own healthy records?

Leopere commented 1 year ago

Oh, I'm also curious about the DNS health check feature, because I'm not certain it will work as I need. With HTTP/HTTPS health checks I would need to query each IP/server individually, and feeding it a single URL isn't useful for detecting downtime on all three IPs. I was thinking ping would technically be sufficient; however, it seems that it's not actually sending a ping packet and is in fact sending a DNS query or something on port 53 to the target instead.

When I'm using the ping health check, it seems as though it's failing to ping DNS port 53 rather than using ICMP ping.

; OPT=15: 00 17 31 35 2e 32 33 35 2e 31 33 2e 35 35 3a 35 33 20 72 63 6f 64 65 3d 53 45 52 56 46 41 49 4c 20 66 6f 72 20 77 68 6f 61 6d 69 2e 6e 74 72 63 2e 69 6f 20 41 ("..1.REDACTEDIP.1:53 rcode=SERVFAIL for whoami.REDACTED.TLD A")

The cleanest log I can offer after attempting to query my DNS server cluster.

[2022-11-23 21:22:59 UTC] Logging started.
[2022-11-23 21:22:59 UTC] [REDACTEDIP:51823] [admin] All log files were deleted.
[2022-11-23 21:23:01 UTC] DNS Server successfully notified name server 'cloudns-denning.ntrc.io' for zone: ntrc.io
[2022-11-23 21:23:01 UTC] DNS Server successfully notified name server 'cloudns-dowdy.ntrc.io' for zone: ntrc.io
[2022-11-23 21:23:01 UTC] [108.162.240.40:63021] [UDP] System.FormatException: An invalid IP address was specified.
 ---> System.Net.Sockets.SocketException (22): Invalid argument
   --- End of inner exception stack trace ---
   at System.Net.IPAddressParser.Parse(ReadOnlySpan`1 ipSpan, Boolean tryParse)
   at System.Net.IPAddress.Parse(String ipString)
   at Failover.Address.GetAnswers(Object jsonAddresses, DnsQuestionRecord question, UInt32 appRecordTtl, String healthCheck, Uri healthCheckUrl, List`1 answers) in Z:\Technitium\Projects\DnsServer\Apps\FailoverApp\Address.cs:line 69
   at System.Dynamic.UpdateDelegates.UpdateAndExecuteVoid7[T0,T1,T2,T3,T4,T5,T6](CallSite site, T0 arg0, T1 arg1, T2 arg2, T3 arg3, T4 arg4, T5 arg5, T6 arg6)
   at Failover.Address.ProcessRequestAsync(DnsDatagram request, IPEndPoint remoteEP, DnsTransportProtocol protocol, Boolean isRecursionAllowed, String zoneName, String appRecordName, UInt32 appRecordTtl, String appRecordData) in Z:\Technitium\Projects\DnsServer\Apps\FailoverApp\Address.cs:line 187
   at DnsServerCore.Dns.DnsServer.ProcessAPPAsync(DnsDatagram request, IPEndPoint remoteEP, DnsDatagram response, Boolean isRecursionAllowed, DnsTransportProtocol protocol) in Z:\Technitium\Projects\DnsServer\DnsServerCore\Dns\DnsServer.cs:line 2045
   at DnsServerCore.Dns.DnsServer.ProcessAuthoritativeQueryAsync(DnsDatagram request, IPEndPoint remoteEP, DnsTransportProtocol protocol, Boolean isRecursionAllowed, Boolean skipDnsAppAuthoritativeRequestHandlers) in Z:\Technitium\Projects\DnsServer\DnsServerCore\Dns\DnsServer.cs:line 1985
   at DnsServerCore.Dns.DnsServer.ProcessQueryAsync(DnsDatagram request, IPEndPoint remoteEP, DnsTransportProtocol protocol, Boolean isRecursionAllowed, Boolean skipDnsAppAuthoritativeRequestHandlers, String tsigAuthenticatedKeyName) in Z:\Technitium\Projects\DnsServer\DnsServerCore\Dns\DnsServer.cs:line 1185

ShreyasZare commented 1 year ago

Thanks so much for the wonderful pseudo documentation via git issue <3!

It appears to be working wonderfully.

You're welcome. Good to know it's working well.

The only other question I can imagine needing answered about the Failover app itself: if I'm using it in a 3-server Technitium arrangement with 1 primary and 2 secondary name servers, will only the primary do the health checks and pass the results down the line to NS2 and NS3? Or will each of them serve what it sees as its own healthy records?

All the servers will do health checks by themselves. Make sure you have the Failover app installed on the secondary servers too, or else they will return a ServerFailure response for queries.

Oh, I'm also curious about the DNS health check feature, because I'm not certain it will work as I need. With HTTP/HTTPS health checks I would need to query each IP/server individually, and feeding it a single URL isn't useful for detecting downtime on all three IPs. I was thinking ping would technically be sufficient; however, it seems that it's not actually sending a ping packet and is in fact sending a DNS query or something on port 53 to the target instead.

When I'm using the ping health check, it seems as though it's failing to ping DNS port 53 rather than using ICMP ping.

The ping type for health check will use ICMP ping.

For the HTTP/HTTPS type health check, the single URL is sufficient since you are hosting the same content on all 3 servers. The health check probe uses the URL and will send an HTTP request to all 3 IP addresses for the same domain and URL.

You can also use the TCP type health check, which will just check if the port is open. Check the app's main config to find the tcp health checks that are defined. You can change ports or add a new JSON block for a separate health check definition.
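To illustrate the idea of one URL being probed against several addresses, here is a minimal Python sketch. It is not the Failover app's code; the function name `build_probes` and the returned dictionary layout are made up for this example. It only shows how a single health check URL can be turned into one probe per IP address, where the connection goes to the IP but the request still names the shared domain.

```python
from urllib.parse import urlsplit


def build_probes(health_check_url, addresses):
    """Describe one probe per IP address for a single shared URL.

    Hypothetical helper for illustration only; it performs no network I/O.
    """
    parts = urlsplit(health_check_url)
    default_port = 443 if parts.scheme == "https" else 80
    port = parts.port or default_port
    path = parts.path or "/"
    return [
        {
            "connect_to": (ip, port),        # the connection targets this IP
            "host_header": parts.hostname,   # the request still names the domain
            "path": path,
        }
        for ip in addresses
    ]


probes = build_probes("https://www.example.com/", ["1.1.1.1", "2.2.2.2", "3.3.3.3"])
for probe in probes:
    print(probe)
```

Each address gets its own result this way, which is why a single URL is enough to detect one server out of three going down.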

The cleanest log I can offer after attempting to query my DNS server cluster.

From the error log it seems that you have either entered an IP address incorrectly in the JSON config or used a domain name in place of an IP address. This may be the reason why your ping probe is failing, as it requires an IP address to work.

If you wish to use a domain name, then you will need to use the Failover.CNAME class path for the APP record.

Also, I would recommend that you use the built-in DNS Client tab on the DNS web panel to run test queries, since the response it gives will include Extended DNS Errors that give clues as to why something failed.

Leopere commented 1 year ago

For the HTTP/HTTPS type health check, the single URL is sufficient since you are hosting the same content on all 3 servers. The health check probe uses the URL and will send an HTTP request to all 3 IP addresses for the same domain and URL.

You can also use the TCP type health check, which will just check if the port is open. Check the app's main config to find the tcp health checks that are defined. You can change ports or add a new JSON block for a separate health check definition.

If you could show a couple of examples of this, we could potentially work on wrapping it all up into some documentation. I can help if you like.

Leopere commented 1 year ago

This is what comes back from my failover APP record under whoami.ntrc.io

{
  "Metadata": {
    "NameServer": "cloudns-dowdy.ntrc.io (127.0.0.1)",
    "Protocol": "Udp",
    "DatagramSize": "43 bytes",
    "RoundTripTime": "2.31 ms"
  },
  "EDNS": {
    "UdpPayloadSize": 1232,
    "ExtendedRCODE": "ServerFailure",
    "Version": 0,
    "Flags": "None",
    "Options": []
  },
  "DnsClientExtendedErrors": [
    {
      "InfoCode": "NetworkError",
      "ExtraText": "cloudns-dowdy.ntrc.io (127.0.0.1) returned RCODE=ServerFailure for whoami.ntrc.io. A IN"
    }
  ],
  "Identifier": 5128,
  "IsResponse": true,
  "OPCODE": "StandardQuery",
  "AuthoritativeAnswer": false,
  "Truncation": false,
  "RecursionDesired": true,
  "RecursionAvailable": true,
  "Z": 0,
  "AuthenticData": false,
  "CheckingDisabled": false,
  "RCODE": "ServerFailure",
  "QDCOUNT": 1,
  "ANCOUNT": 0,
  "NSCOUNT": 0,
  "ARCOUNT": 1,
  "Question": [
    {
      "Name": "whoami.ntrc.io",
      "Type": "A",
      "Class": "IN"
    }
  ],
  "Answer": [],
  "Authority": [],
  "Additional": [
    {
      "Name": "",
      "Type": "OPT",
      "Class": 1232,
      "TTL": "0 (0 sec)",
      "RDLENGTH": "0 bytes",
      "RDATA": {
        "Options": null
      },
      "DnssecStatus": "Disabled"
    }
  ]
}

The record's config looks like this:

{
  "primary": [
    "51.22.41.148",
    "51.22.41.149",
    "51.22.41.150"
  ],
  "secondary": [
    ""
  ],
  "serverDown": [
    ""
  ],
  "healthCheck": "ping",
  "allowTxtStatus": true
}

Finally, the server at 51.22.41.150 has ICMP ping deliberately blocked to simulate downtime, and yet that IP still seems to be offered when I query the record with dig. My other concern is that tools like https://dnschecker.org/all-dns-records-of-domain.php?query=whoami.ntrc.io&rtype=A&dns=opendns and https://toolbox.googleapps.com/apps/dig/#A/ are still returning zero records.

{
  "Metadata": {
    "NameServer": "one.one.one.one (1.1.1.1)",
    "Protocol": "Udp",
    "DatagramSize": "128 bytes",
    "RoundTripTime": "52.34 ms"
  },
  "EDNS": {
    "UdpPayloadSize": 1232,
    "ExtendedRCODE": "ServerFailure",
    "Version": 0,
    "Flags": "None",
    "Options": [
      {
        "Code": "EXTENDED_DNS_ERROR",
        "Length": "24 bytes",
        "Data": {
          "InfoCode": "NoReachableAuthority",
          "ExtraText": "at delegation ntrc.io."
        }
      },
      {
        "Code": "EXTENDED_DNS_ERROR",
        "Length": "53 bytes",
        "Data": {
          "InfoCode": "NetworkError",
          "ExtraText": "15.235.13.55:53 rcode=SERVFAIL for whoami.ntrc.io A"
        }
      }
    ]
  },
  "DnsClientExtendedErrors": [
    {
      "InfoCode": "NetworkError",
      "ExtraText": "one.one.one.one (1.1.1.1) returned RCODE=ServerFailure for whoami.ntrc.io. A IN"
    }
  ],
  "Identifier": 48827,
  "IsResponse": true,
  "OPCODE": "StandardQuery",
  "AuthoritativeAnswer": false,
  "Truncation": false,
  "RecursionDesired": true,
  "RecursionAvailable": true,
  "Z": 0,
  "AuthenticData": false,
  "CheckingDisabled": false,
  "RCODE": "ServerFailure",
  "QDCOUNT": 1,
  "ANCOUNT": 0,
  "NSCOUNT": 0,
  "ARCOUNT": 1,
  "Question": [
    {
      "Name": "whoami.ntrc.io",
      "Type": "A",
      "Class": "IN"
    }
  ],
  "Answer": [],
  "Authority": [],
  "Additional": [
    {
      "Name": "",
      "Type": "OPT",
      "Class": 1232,
      "TTL": "0 (0 sec)",
      "RDLENGTH": "85 bytes",
      "RDATA": {
        "Options": [
          {
            "Code": "EXTENDED_DNS_ERROR",
            "Length": "24 bytes",
            "Data": {
              "InfoCode": "NoReachableAuthority",
              "ExtraText": "at delegation ntrc.io."
            }
          },
          {
            "Code": "EXTENDED_DNS_ERROR",
            "Length": "53 bytes",
            "Data": {
              "InfoCode": "NetworkError",
              "ExtraText": "15.235.13.55:53 rcode=SERVFAIL for whoami.ntrc.io A"
            }
          }
        ]
      },
      "DnssecStatus": "Disabled"
    }
  ]
}

ShreyasZare commented 1 year ago

This is what comes back from my failover APP record under whoami.ntrc.io

Since the response is ServerFailure, there will be an error logged that tells you what went wrong. The issue is most likely due to the APP record's JSON, which contains empty strings in place of IP addresses in the secondary and serverDown arrays. Try the config given below and it should work:

{
  "primary": [
    "51.22.41.148",
    "51.22.41.149",
    "51.22.41.150"
  ],
  "secondary": [
  ],
  "serverDown": [
  ],
  "healthCheck": "ping",
  "allowTxtStatus": true
}
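A quick pre-check along these lines can catch empty strings or host names before the record is saved. This is only a sketch: the function name `invalid_addresses` is hypothetical, and it uses Python's `ipaddress` module, whose parsing rules are not guaranteed to match the .NET `IPAddress.Parse` call seen in the stack trace earlier in this thread.

```python
import ipaddress
import json


def invalid_addresses(app_record_json):
    """Return (array name, entry) pairs that are not valid IP addresses.

    Hypothetical helper; checks the three address arrays used by the
    Failover.Address APP record shown in this thread.
    """
    record = json.loads(app_record_json)
    problems = []
    for key in ("primary", "secondary", "serverDown"):
        for entry in record.get(key, []):
            try:
                if not isinstance(entry, str):
                    raise ValueError("address entries must be strings")
                ipaddress.ip_address(entry)  # raises ValueError for "" or a host name
            except ValueError:
                problems.append((key, entry))
    return problems


bad_record = """
{
  "primary": ["51.22.41.148", "51.22.41.149", "51.22.41.150"],
  "secondary": [""],
  "serverDown": [""],
  "healthCheck": "ping",
  "allowTxtStatus": true
}
"""
print(invalid_addresses(bad_record))
```

An empty list from this check means every entry at least parses as an IP address; it says nothing about whether the address is the one you intended.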

Leopere commented 1 year ago

Clearing out the empty values seems to have produced a better reply now. However, if you navigate to https://whoami.ntrc.io it just hangs.

{
  "Metadata": {
    "NameServer": "cloudns-daniel.ntrc.io (127.0.0.1)",
    "Protocol": "Udp",
    "DatagramSize": "91 bytes",
    "RoundTripTime": "0.5 ms"
  },
  "EDNS": {
    "UdpPayloadSize": 1232,
    "ExtendedRCODE": "NoError",
    "Version": 0,
    "Flags": "None",
    "Options": []
  },
  "DnsClientExtendedErrors": [],
  "Identifier": 12552,
  "IsResponse": true,
  "OPCODE": "StandardQuery",
  "AuthoritativeAnswer": true,
  "Truncation": false,
  "RecursionDesired": true,
  "RecursionAvailable": true,
  "Z": 0,
  "AuthenticData": false,
  "CheckingDisabled": false,
  "RCODE": "NoError",
  "QDCOUNT": 1,
  "ANCOUNT": 3,
  "NSCOUNT": 0,
  "ARCOUNT": 1,
  "Question": [
    {
      "Name": "whoami.ntrc.io",
      "Type": "A",
      "Class": "IN"
    }
  ],
  "Answer": [
    {
      "Name": "whoami.ntrc.io",
      "Type": "A",
      "Class": "IN",
      "TTL": "30 (30 sec)",
      "RDLENGTH": "4 bytes",
      "RDATA": {
        "IPAddress": "51.22.41.149"
      },
      "DnssecStatus": "Disabled"
    },
    {
      "Name": "whoami.ntrc.io",
      "Type": "A",
      "Class": "IN",
      "TTL": "30 (30 sec)",
      "RDLENGTH": "4 bytes",
      "RDATA": {
        "IPAddress": "51.22.41.150"
      },
      "DnssecStatus": "Disabled"
    },
    {
      "Name": "whoami.ntrc.io",
      "Type": "A",
      "Class": "IN",
      "TTL": "30 (30 sec)",
      "RDLENGTH": "4 bytes",
      "RDATA": {
        "IPAddress": "51.22.41.148"
      },
      "DnssecStatus": "Disabled"
    }
  ],
  "Authority": [],
  "Additional": [
    {
      "Name": "",
      "Type": "OPT",
      "Class": 1232,
      "TTL": "0 (0 sec)",
      "RDLENGTH": "0 bytes",
      "RDATA": {
        "Options": null
      },
      "DnssecStatus": "Disabled"
    }
  ]
}

Leopere commented 1 year ago

Eh, yeah, no. That was the reply for about 30 seconds; now the answer section is just blank.

ShreyasZare commented 1 year ago

Eh, yeah, no. That was the reply for about 30 seconds; now the answer section is just blank.

Ya, the first time there is no health data available, so the app will return all primary addresses and start the health monitor. The response TTL is set to 30 sec so that the client-side cache expires quickly and the client tries to resolve the domain again.

Once the health data is available, the response will contain only healthy addresses. The health status via TXT is now showing the current health state as expected.
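The behaviour described here can be sketched roughly as follows. This is an illustration only, not the app's implementation: the function name `select_answers` and the `health` dictionary are hypothetical, and the fallback order from primary to secondary to serverDown is an assumption based on the array names, since this thread only describes what happens with the primary addresses.

```python
def select_answers(primary, secondary, server_down, health):
    """Pick the addresses to return in a response.

    `health` maps an address to True (healthy) or False (failed).
    An address missing from `health` has not been checked yet.
    """
    # First query: no health data exists yet, so return every primary address.
    if not any(address in health for address in primary):
        return list(primary)

    healthy_primary = [a for a in primary if health.get(a)]
    if healthy_primary:
        return healthy_primary

    # ASSUMPTION: fall back to healthy secondary addresses, then to serverDown.
    healthy_secondary = [a for a in secondary if health.get(a)]
    if healthy_secondary:
        return healthy_secondary
    return list(server_down)


addresses = ["1.1.1.1", "2.2.2.2", "3.3.3.3"]
print(select_answers(addresses, [], [], {}))
print(select_answers(addresses, [], [], {a: False for a in addresses}))
```

With all three primary checks failing and empty secondary and serverDown arrays, the sketch returns nothing, which matches the blank answer section reported above.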

Leopere commented 1 year ago

That’s strange, because the firewall is showing that ping is fully open, so at this point I’m just not certain why it can’t get a healthy reply.

Do you think that maybe, because this is in a container, ping replies aren’t getting back to the container for some reason?

I was thinking of just using a simple HTTP 200 OK check bound to the host, since it seems unreliable to get the https check to work.

    {
      "name": "http8888",
      "type": "http",
      "interval": 60,
      "retries": 3,
      "timeout": 10,
      "port": 8888,
      "url": null,
      "emailAlert": "default",
      "webHook": "default"
    },

ShreyasZare commented 1 year ago

That’s strange, because the firewall is showing that ping is fully open, so at this point I’m just not certain why it can’t get a healthy reply.

Try to test by using ping from the same server that is running the DNS server. I would however recommend using either TCP or HTTP(S) for the health check, since it tests whether your web server is actually running. If the web server crashes for some reason, a ping test won't be able to detect the failure.

Leopere commented 1 year ago

I updated my last reply. I’d thought of trying the custom check for a simple web service on the host's port 8888.

Leopere commented 1 year ago

Along with this

{
  "primary": [
    "51.22.41.148",
    "51.22.41.149",
    "51.22.41.150"
  ],
  "secondary": [
  ],
  "serverDown": [
  ],
  "healthCheck": "http8888",
  "allowTxtStatus": true
}

ShreyasZare commented 1 year ago

Specify the URL explicitly using the healthCheckUrl property in the APP record JSON; otherwise the app will use the queried domain name to create a URL, which may not work with your web server if there is a domain mismatch.

ShreyasZare commented 1 year ago

    {
      "name": "http8888",
      "type": "http",
      "interval": 60,
      "retries": 3,
      "timeout": 10,
      "port": 8888,
      "url": null,
      "emailAlert": "default",
      "webHook": "default"
    },

The port property in the above health check has no effect for http or https type checks. The port is only used for TCP, and url is used for HTTP(S).

In the above config, since the url is null, the app will check for a URL from the APP record. If that too is null, then it will make up a default URL using the queried domain name and default port 80.
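The fallback order described here can be written out as a small Python sketch. The function name `resolve_health_check_url` is hypothetical and this is not the app's code. The thread only covers the case where the health check definition's url is null; what happens when both the definition and the APP record supply a URL is not stated, so this sketch arbitrarily prefers the definition.

```python
def resolve_health_check_url(definition_url, record_url, queried_name):
    """Choose the URL an http type health check would probe.

    definition_url: "url" from the health check definition in the app config
    record_url:     "healthCheckUrl" from the APP record
    queried_name:   the domain name that was queried
    """
    if definition_url:
        # ASSUMPTION: precedence when both URLs are set is not described in the thread.
        return definition_url
    if record_url:
        return record_url
    # No URL anywhere: build a default from the queried name (plain http, port 80).
    return f"http://{queried_name}/"


print(resolve_health_check_url(None, None, "whoami.ntrc.io"))
print(resolve_health_check_url(None, "http://www.example.com/", "whoami.ntrc.io"))
```

This also shows why a "port": 8888 entry in an http definition never reaches the request: nothing in the URL that gets built carries it.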

Leopere commented 1 year ago

OK, so if I tailor the TCP health check to hit an HTTP server at http://51.222.41.148:8888/, since ping seems to be failing, do we think that might work?

ShreyasZare commented 1 year ago

OK, so if I tailor the TCP health check to hit an HTTP server at http://51.222.41.148:8888/, since ping seems to be failing, do we think that might work?

You cannot use a URL for a TCP health check. For a TCP health check the JSON will be:

{
  "name": "tcp8888",
  "type": "tcp",
  "interval": 60,
  "retries": 3,
  "timeout": 10,
  "port": 8888,
  "emailAlert": "default",
  "webHook": "default"
}
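For comparison, a "port is open" test of this kind can be reproduced by hand with a few lines of Python. This is a minimal sketch of the general technique, not the Failover app's code; the function name `tcp_port_open` is made up for the example, and the demo uses a throwaway listener on the loopback interface so that it runs without touching any real server.

```python
import socket


def tcp_port_open(address, port, timeout=10.0):
    """Return True if a TCP connection to (address, port) succeeds in time."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Demo against a temporary local listener instead of a real server.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    demo_port = listener.getsockname()[1]
    print("while listening:", tcp_port_open("127.0.0.1", demo_port, timeout=2))
    listener.close()
    print("after closing:", tcp_port_open("127.0.0.1", demo_port, timeout=2))
```

Running the same function from the DNS server host against each address and port in the APP record is a quick way to see whether a failed status comes from the network or from the config.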

Leopere commented 1 year ago

https://dnsclient.net/#Recursive%20Query%20%7Brecursive-resolver%7D/whoami.ntrc.io/TXT/UDP/false

{
  "Metadata": {
    "NameServer": "cloudns-dowdy.ntrc.io (15.235.13.55)",
    "Protocol": "Udp",
    "DatagramSize": "487 bytes",
    "RoundTripTime": "264.58 ms"
  },
  "EDNS": {
    "UdpPayloadSize": 1232,
    "ExtendedRCODE": "NoError",
    "Version": 0,
    "Flags": "None",
    "Options": []
  },
  "DnsClientExtendedErrors": [],
  "Identifier": 0,
  "IsResponse": true,
  "OPCODE": "StandardQuery",
  "AuthoritativeAnswer": true,
  "Truncation": false,
  "RecursionDesired": false,
  "RecursionAvailable": false,
  "Z": 0,
  "AuthenticData": false,
  "CheckingDisabled": false,
  "RCODE": "NoError",
  "QDCOUNT": 1,
  "ANCOUNT": 3,
  "NSCOUNT": 0,
  "ARCOUNT": 1,
  "Question": [
    {
      "Name": "whoami.ntrc.io",
      "Type": "TXT",
      "Class": "IN"
    }
  ],
  "Answer": [
    {
      "Name": "whoami.ntrc.io",
      "Type": "TXT",
      "Class": "IN",
      "TTL": "30 (30 sec)",
      "RDLENGTH": "136 bytes",
      "RDATA": {
        "Text": "app=failover; addressType=Primary; address=51.22.41.148; healthCheck=tcp8888; healthStatus=Failed; failureReason=Connection timed out.;"
      },
      "DnssecStatus": "Disabled"
    },
    {
      "Name": "whoami.ntrc.io",
      "Type": "TXT",
      "Class": "IN",
      "TTL": "30 (30 sec)",
      "RDLENGTH": "136 bytes",
      "RDATA": {
        "Text": "app=failover; addressType=Primary; address=51.22.41.149; healthCheck=tcp8888; healthStatus=Failed; failureReason=Connection timed out.;"
      },
      "DnssecStatus": "Disabled"
    },
    {
      "Name": "whoami.ntrc.io",
      "Type": "TXT",
      "Class": "IN",
      "TTL": "30 (30 sec)",
      "RDLENGTH": "136 bytes",
      "RDATA": {
        "Text": "app=failover; addressType=Primary; address=51.22.41.150; healthCheck=tcp8888; healthStatus=Failed; failureReason=Connection timed out.;"
      },
      "DnssecStatus": "Disabled"
    }
  ],
  "Authority": [],
  "Additional": [
    {
      "Name": "",
      "Type": "OPT",
      "Class": 1232,
      "TTL": "0 (0 sec)",
      "RDLENGTH": "0 bytes",
      "RDATA": {
        "Options": []
      },
      "DnssecStatus": "Disabled"
    }
  ]
}
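The TXT status strings in a response like the one above are simple `key=value;` lists, so they are easy to pick apart in a script. The following Python sketch assumes the format is exactly as shown in this response; the function name `parse_failover_status` is hypothetical, and a failure reason that itself contains a semicolon would not survive this naive split.

```python
def parse_failover_status(text):
    """Split a Failover TXT status string into a dict of its fields."""
    fields = {}
    for part in text.split(";"):
        part = part.strip()
        if not part:
            continue  # skip the empty piece after the trailing ';'
        key, _, value = part.partition("=")
        fields[key.strip()] = value.strip()
    return fields


sample = (
    "app=failover; addressType=Primary; address=51.22.41.148; "
    "healthCheck=tcp8888; healthStatus=Failed; "
    "failureReason=Connection timed out.;"
)
status = parse_failover_status(sample)
print(status["address"], status["healthStatus"], status["failureReason"])
```

Printing the address next to its status makes it easier to compare what the app is actually probing against the servers you meant to configure.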

This hits a page that purely replies with a JSON object with the string OK in it.
http://51.222.41.149:8888/ is online, http://51.222.41.148:8888/ is online, and http://51.222.41.150:8888/ is offline, to simulate a test case where the IP ending in 150 should not be returned, and I'm still getting a failure.

This is with the Failover config (dnsApp.config) and the APP record as follows.

{
  "healthChecks": [
    {
      "name": "ping",
      "type": "ping",
      "interval": 60,
      "retries": 3,
      "timeout": 10,
      "emailAlert": "default",
      "webHook": "default"
    },
    {
      "name": "tcp8888",
      "type": "tcp",
      "interval": 60,
      "retries": 3,
      "timeout": 10,
      "port": 8888,
      "emailAlert": "default",
      "webHook": "default"
    },
    {
      "name": "tcp80",
      "type": "tcp",
      "interval": 60,
      "retries": 3,
      "timeout": 10,
      "port": 80,
      "emailAlert": "default",
      "webHook": "default"
    },
    {
      "name": "tcp443",
      "type": "tcp",
      "interval": 60,
      "retries": 3,
      "timeout": 10,
      "port": 443,
      "emailAlert": "default",
      "webHook": "default"
    },
    {
      "name": "http",
      "type": "http",
      "interval": 60,
      "retries": 3,
      "timeout": 10,
      "url": null,
      "emailAlert": "default",
      "webHook": "default"
    },
    {
      "name": "http8888",
      "type": "http",
      "interval": 60,
      "retries": 3,
      "timeout": 10,
      "url": null,
      "emailAlert": "default",
      "webHook": "default"
    },
    {
      "name": "https",
      "type": "https",
      "interval": 60,
      "retries": 3,
      "timeout": 10,
      "url": null,
      "emailAlert": "default",
      "webHook": "default"
    },
    {
      "name": "www.example.com",
      "type": "https",
      "interval": 60,
      "retries": 3,
      "timeout": 10,
      "url": "https://www.example.com",
      "emailAlert": "default",
      "webHook": "default"
    }
  ],
  "emailAlerts": [
    {
      "name": "default",
      "enabled": false,
      "alertTo": [
        "admin@example.com"
      ],
      "smtpServer": "smtp.example.com",
      "smtpPort": 465,
      "startTls": false,
      "smtpOverTls": true,
      "username": "alerts@example.com",
      "password": "password",
      "mailFrom": "alerts@example.com",
      "mailFromName": "DNS Server Alert"
    }
  ],
  "webHooks": [
    {
      "name": "default",
      "enabled": false,
      "urls": [
        "https://webhooks.example.com/default"
      ]
    }
  ],
  "underMaintenance": [
    {
      "network": "192.168.10.2/32",
      "enable": false
    },
    {
      "network": "10.1.1.0/24",
      "enable": false
    }
  ]
}

APP Record

{
  "primary": [
    "51.22.41.148",
    "51.22.41.149",
    "51.22.41.150"
  ],
  "secondary": [
  ],
  "serverDown": [
  ],
  "healthCheck": "tcp8888",
  "allowTxtStatus": true
}

ShreyasZare commented 1 year ago

The IP addresses in your APP record JSON config are not the same as the ones you are expecting. It seems there is a typo in the second octet, which is 22 where you are expecting 222. This could be the reason why the ping check is also failing.
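A mismatch like this is easy to miss by eye, so a mechanical comparison can help. The Python sketch below is only an illustration with a hypothetical function name, `hosts_missing_from_record`; it takes the addresses from the APP record and the URLs that were tested by hand in this thread, and lists any tested host that does not appear in the record.

```python
from urllib.parse import urlsplit


def hosts_missing_from_record(record_addresses, tested_urls):
    """Return hosts from tested_urls that are absent from the APP record."""
    configured = set(record_addresses)
    missing = []
    for url in tested_urls:
        host = urlsplit(url).hostname
        if host not in configured:
            missing.append(host)
    return missing


record_addresses = ["51.22.41.148", "51.22.41.149", "51.22.41.150"]
tested_urls = [
    "http://51.222.41.148:8888/",
    "http://51.222.41.149:8888/",
    "http://51.222.41.150:8888/",
]
print(hosts_missing_from_record(record_addresses, tested_urls))
```

Any host printed here is one that was tested manually but is not what the app is actually probing.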

Leopere commented 1 year ago

I... well that seems to be working now.

ShreyasZare commented 1 year ago

Yup, it's working now.

Leopere commented 1 year ago

I see it's culled a couple of the nodes pinging that IP.

I got so caught up in the flurry of trying things until they worked that I completely missed the fact that my domain record decayed somehow and lost a digit in that second octet. What a stupid problem. Anyway, I guess the ultimate preference is to have an HTTP-aware option, but I don't really want the Technitium instances having to ping the whole website every few minutes. Is it just checking for a header response or something?

ShreyasZare commented 1 year ago

I see it's culled a couple of the nodes pinging that IP.

I got so caught up in the flurry of trying things until they worked that I completely missed the fact that my domain record decayed somehow and lost a digit in that second octet. What a stupid problem. Anyway, I guess the ultimate preference is to have an HTTP-aware option, but I don't really want the Technitium instances having to ping the whole website every few minutes. Is it just checking for a header response or something?

Ya, that happens sometimes lol.

Using a ping check every minute won't be an issue. Any health check type you choose has to check the servers frequently so as to know as soon as possible when they go down or come back up, and to be able to respond with a different set of records.

Right now the Failover app does not support checking for a custom HTTP header. It only checks if the HTTP response is 200 OK. This should be fine for most cases.

Leopere commented 1 year ago

Right now the Failover app does not support checking for a custom HTTP header. It only checks if the HTTP response is 200 OK. This should be fine for most cases.

Oh great that's wonderful.

Using a ping check every minute won't be an issue. Any health check type you choose has to check the servers frequently so as to know as soon as possible when they go down or come back up, and to be able to respond with a different set of records.

This is honestly going to be a wonderful day, having figured this out. I'm going to plunk my way through testing the http/s options just in case, now that I can confirm that it's all working and viable. I just needed to see this work once to know that I'm not just going down a rabbit hole. I was confident your code worked, but it was a bit tricky to see through the examples to viable implementation cases.

ShreyasZare commented 1 year ago

This is honestly going to be a wonderful day, having figured this out. I'm going to plunk my way through testing the http/s options just in case, now that I can confirm that it's all working and viable. I just needed to see this work once to know that I'm not just going down a rabbit hole. I was confident your code worked, but it was a bit tricky to see through the examples to viable implementation cases.

I would recommend that you use the HTTPS check, since if your web server's TLS cert expires, for example, that will be detected as a failure. Otherwise, the app will never know that the cert has expired and will keep returning the IP address, causing clients to fail to load the website.

The Failover app is a bit tricky to configure at first since there is no GUI and no documentation. Having a test DNS server setup where you can try it out before deploying to prod is a good way to avoid config issues.

Leopere commented 1 year ago

Hmm okay I'm just not really certain what to do here to make this go at this point Well

The Failover app is a bit tricky to configure at first since there is no GUI and no documentation. Having a test DNS server setup where you can try it out before deploying to prod is a good way to avoid config issues.

Well, hopefully this issue will provide both entertainment value and some degree of documentation going forward. I tried including as much data in my replies as possible to ensure success for future users.

TechnitiumSoftware / DnsServer

FailOver App seems mostly undocumented. #498