ShubhamTatvamasi / magma-galaxy

https://galaxy.ansible.com/shubhamtatvamasi/magma
BSD 3-Clause "New" or "Revised" License

failure with multiple NICs in server #7

Closed jblakley closed 1 year ago

jblakley commented 1 year ago

If there is more than one NIC in the system, the playbook always picks the first one listed by hostname -I. That can create a problem if the public IP is not assigned to the first NIC. Offending line in deploy-orc8r.sh is ORC8R_IP=$(hostname -I | awk '{print $1}'). This problem manifested with the k8s load balancer being assigned to the wrong IP.

ShubhamTatvamasi commented 1 year ago

Thanks for pointing out this issue @jblakley. I can think of a solution:

  1. The user has one NIC.
    • Select the default IP and move forward.
  2. The user has multiple NICs.
    • List all the IPs and ask the user to select one from the list.

Do you think this will solve the issue?
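A minimal sketch of the two cases above, assuming a POSIX shell. The function name choose_ip is hypothetical and not part of deploy-orc8r.sh; it takes the whitespace-separated list printed by hostname -I, returns the only address when there is one, and otherwise prints a numbered menu on stderr and reads the choice from stdin.

```shell
# Hypothetical helper, not taken from the repository.
# $1: whitespace-separated list of IPs, e.g. the output of `hostname -I`.
# Note: i, ip and choice are plain (global) shell variables.
choose_ip() {
  # Split the list on whitespace into the function's positional parameters.
  set -- $1
  if [ "$#" -eq 0 ]; then
    echo "no IP addresses found" >&2
    return 1
  fi
  # Case 1: a single address, use it without asking.
  if [ "$#" -eq 1 ]; then
    echo "$1"
    return 0
  fi
  # Case 2: several addresses, show a numbered menu on stderr.
  i=1
  for ip in "$@"; do
    echo "$i) $ip" >&2
    i=$((i + 1))
  done
  printf 'Select the IP for orc8r [1-%s]: ' "$#" >&2
  read -r choice
  # Reject empty or non-numeric input, then out-of-range numbers.
  case "$choice" in
    ''|*[!0-9]*) echo "invalid selection: $choice" >&2; return 1 ;;
  esac
  if [ "$choice" -lt 1 ] || [ "$choice" -gt "$#" ]; then
    echo "invalid selection: $choice" >&2
    return 1
  fi
  # Drop the entries before the chosen one and print it.
  shift $((choice - 1))
  echo "$1"
}
```

In a deployment script this could be called as ORC8R_IP=$(choose_ip "$(hostname -I)"). Writing the menu to stderr keeps the command substitution output limited to the selected address.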

jblakley commented 1 year ago

That is certainly an option. Jan Harkes suggested a couple of ways to detect a NIC that has a public IP:

ip -o route get 8.8.8.8 | awk '{print $7}' or ip -j route get 8.8.8.8 | yq e '.[0]["prefsrc"]' -

These might create issues if there was no public IP assigned or there was more than one NIC with public IP.
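One more caveat with the awk variant: '{print $7}' relies on the source address being the seventh field, which holds when the route goes via a gateway but not when the destination is reached directly (the via pair is then absent and the fields shift). A hedged sketch that looks for the src keyword instead; route_src is a hypothetical helper name, and the sample lines in the comments are illustrative rather than captured from a real system.

```shell
# Hypothetical helper, not taken from the repository.
# Reads one line of `ip -o route get <dst>` output on stdin and prints the
# token that follows the "src" keyword, wherever it appears in the line.
# Prints nothing if the line has no "src" field.
#
# Illustrative inputs:
#   8.8.8.8 via 192.168.1.1 dev eth0 src 192.168.1.10 uid 1000
#   8.8.8.8 dev wg0 src 10.8.0.2 uid 1000
route_src() {
  awk '{
    for (i = 1; i < NF; i++) {
      if ($i == "src") { print $(i + 1); exit }
    }
  }'
}
```

Usage would be ORC8R_IP=$(ip -o route get 8.8.8.8 | route_src), followed by a check that the result is non-empty before continuing. This still returns only the source address of the default route, so it does not address the multiple-public-NIC case.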

ShubhamTatvamasi commented 1 year ago

Hi @jblakley, if we use ip -o route get 8.8.8.8 | awk '{print $7}' instead of hostname -I | awk '{print $1}', will that fix the issue for us? We don't depend on a public IP; it also works with a local IP. The only thing we need is internet access. Multiple NICs are also not a problem, since we can select the IP of the default-route interface. What do you suggest?

jblakley commented 1 year ago

@ShubhamTatvamasi It works in our case but because only one interface has internet access and that's the one we want.

In general, in the case with multiple interfaces, I think you're trying to ID the one where the orc8r services (e.g., nms) will ultimately be listening. In our case, that's the one with the public IP. However, in cases with multiple NICs that all have private IPs, where those services will be listening on one of the private IPs, I'm not sure how you'd do it without requiring the person deploying to pick one.

All I can think is that the interface should be reachable 1) by all the AGWs to access the controller, bootstrapper, etc. and 2) by all clients who want to access nms. I can't think of a good way to automate that.

FYI, in our environment, hostname -I returns 4 IP addresses. Only the second is public and reachable by all clients and AGWs. The default route is to the IP of a gateway, not the orc8r.

If DNS is already configured, you could use nslookup to resolve the FQDN to its IP. I got this to work with: nslookup <FQDN>|grep Address|grep -v 127.0.0.53|awk '{print $2}'

But that's pretty kludgy.
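Part of the kludge is that the pipeline hard-codes the local resolver address (127.0.0.53) to filter out the Server section. A sketch that instead takes the first Address line after the Name line, so it does not depend on which resolver answered. The helper name answer_addr is hypothetical, and the layout it assumes (a Name: line followed by an Address: line in the answer section) can differ between nslookup versions.

```shell
# Hypothetical helper, not taken from the repository.
# Reads `nslookup <FQDN>` output on stdin and prints the first address
# listed after a "Name:" line, skipping the resolver's own "Address:" line
# that appears in the Server section at the top.
answer_addr() {
  awk '
    /^Name:/ { in_answer = 1; next }
    in_answer && /^Address/ { print $2; exit }
  '
}
```

Usage would be ORC8R_IP=$(nslookup <FQDN> | answer_addr), with <FQDN> replaced by the orc8r hostname. If the name does not resolve, the function prints nothing, so the caller should check for an empty result.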

ShubhamTatvamasi commented 1 year ago

We can always deploy it manually by specifying the IP in our hosts.yml file.

jblakley commented 1 year ago

That's probably the most straightforward approach. Your original proposal would also work.

ShubhamTatvamasi commented 1 year ago

Are you talking about https://github.com/ShubhamTatvamasi/magma-galaxy/issues/7#issuecomment-1256852367, where the user can select the IP from the list?

jblakley commented 1 year ago

Yes.

ShubhamTatvamasi commented 1 year ago

Hi @jblakley, I have updated the code to use the IP of the default gateway interface, and the user can also edit it before the deployment starts. Please see if that resolves this issue. Thanks
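The "detect a default, let the user edit it" behaviour can be sketched as follows. This is an illustration only, not the repository's actual code; confirm_ip is a hypothetical name. It shows the detected address as a default and accepts either an empty answer (keep the default) or a replacement typed by the user.

```shell
# Hypothetical helper, not taken from the repository.
# $1: the detected default IP. Prompts on stderr, reads one line from stdin,
# and prints the user's answer, or the default if the answer is empty.
confirm_ip() {
  default_ip=$1
  printf 'orc8r IP [%s]: ' "$default_ip" >&2
  # On end-of-input, fall back to an empty answer so the default is used.
  read -r answer || answer=""
  echo "${answer:-$default_ip}"
}
```

Usage would be ORC8R_IP=$(confirm_ip "$detected_ip"), where detected_ip holds whatever the automatic detection returned. In the environment described earlier in the thread, the user would type the public address at the prompt instead of accepting the default.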

jblakley commented 1 year ago

Shubham, sorry I never got around to testing this. With our system in production, it's a little difficult to redeploy for testing.


--

Jim Blakley

Living Edge Lab Associate Director

Carnegie Mellon University