sonic-net / sonic-buildimage

Scripts which perform an installable binary image build for SONiC
Other
723 stars 1.38k forks source link

[202205][TACACS]: connection failed: transport endpoint is not connected #16362

Open nazariig opened 1 year ago

nazariig commented 1 year ago

Description

TACACS failed connect server during command authorization:

Aug 22 05:35:57.368281 r-leopard-58 INFO sudo: pam_unix(sudo:session): session opened for user root(uid=0) by admin(uid=1000)
Aug 22 05:35:58.063877 r-leopard-58 INFO sudo: pam_unix(sudo:session): session closed for user root
Aug 22 05:35:59.048913 r-leopard-58 NOTICE sudo:    admin : TTY=pts/0 ; PWD=/home/admin ; USER=root ; COMMAND=/bin/sh -c echo BECOME-SUCCESS-lagmgdlmubcisorovntucbywabatjrdi ; /usr/bin/python /tmp/ansible-admin/ansible-tmp-1692671758.65-30114-43450976949403/AnsiballZ_command.py
Aug 22 05:35:59.049661 r-leopard-58 INFO sudo: pam_unix(sudo:session): session opened for user root(uid=0) by admin(uid=1000)
Aug 22 05:35:59.231737 r-leopard-58 NOTICE sudo:     root : TTY=pts/0 ; PWD=/home/admin ; USER=root ; COMMAND=/usr/local/bin/config tacacs delete 10.7.34.20
Aug 22 05:35:59.231860 r-leopard-58 INFO sudo: pam_unix(sudo:session): session opened for user root(uid=0) by admin(uid=0)
Aug 22 05:35:59.738192 r-leopard-58 INFO sudo: pam_unix(sudo:session): session closed for user root
Aug 22 05:35:59.756673 r-leopard-58 INFO sudo: pam_unix(sudo:session): session closed for user root
Aug 22 05:36:00.511018 r-leopard-58 NOTICE sudo:    admin : TTY=pts/0 ; PWD=/home/admin ; USER=root ; COMMAND=/bin/sh -c echo BECOME-SUCCESS-nddvugjoavcjzaaxwdeffkuvzvbznofa ; /usr/bin/python /tmp/ansible-admin/ansible-tmp-1692671760.07-30131-254543740818475/AnsiballZ_command.py
Aug 22 05:36:00.511341 r-leopard-58 INFO sudo: pam_unix(sudo:session): session opened for user root(uid=0) by admin(uid=1000)
Aug 22 05:36:00.684464 r-leopard-58 NOTICE sudo:     root : TTY=pts/0 ; PWD=/home/admin ; USER=root ; COMMAND=/usr/local/bin/config tacacs add 10.213.103.5
Aug 22 05:36:00.684560 r-leopard-58 INFO sudo: pam_unix(sudo:session): session opened for user root(uid=0) by admin(uid=0)
Aug 22 05:36:01.199309 r-leopard-58 INFO sudo: pam_unix(sudo:session): session closed for user root
Aug 22 05:36:01.214812 r-leopard-58 INFO sudo: pam_unix(sudo:session): session closed for user root
Aug 22 05:36:02.468683 r-leopard-58 NOTICE sudo:    admin : TTY=pts/0 ; PWD=/home/admin ; USER=root ; COMMAND=/bin/sh -c echo BECOME-SUCCESS-tcxmowabqqythhugzigfkcpunywksxnz ; /usr/bin/python /tmp/ansible-admin/ansible-tmp-1692671761.99-30151-126027372903386/AnsiballZ_command.py
Aug 22 05:36:02.469391 r-leopard-58 INFO sudo: pam_unix(sudo:session): session opened for user root(uid=0) by admin(uid=1000)
Aug 22 05:36:02.651640 r-leopard-58 NOTICE sudo:     root : TTY=pts/0 ; PWD=/home/admin ; USER=root ; COMMAND=/usr/local/bin/config tacacs authtype login
Aug 22 05:36:02.651726 r-leopard-58 INFO sudo: pam_unix(sudo:session): session opened for user root(uid=0) by admin(uid=0)
Aug 22 05:36:03.137210 r-leopard-58 INFO sudo: pam_unix(sudo:session): session closed for user root
Aug 22 05:36:03.154082 r-leopard-58 INFO sudo: pam_unix(sudo:session): session closed for user root
Aug 22 05:36:03.851335 r-leopard-58 NOTICE sudo:    admin : TTY=pts/0 ; PWD=/home/admin ; USER=root ; COMMAND=/bin/sh -c echo BECOME-SUCCESS-toxrffztydrjhpfahsuizqjybvpghcwj ; /usr/bin/python /tmp/ansible-admin/ansible-tmp-1692671763.47-30168-248099078470162/AnsiballZ_command.py
Aug 22 05:36:03.851775 r-leopard-58 INFO sudo: pam_unix(sudo:session): session opened for user root(uid=0) by admin(uid=1000)
Aug 22 05:36:04.031498 r-leopard-58 NOTICE sudo:     root : TTY=pts/0 ; PWD=/home/admin ; USER=root ; COMMAND=/usr/local/bin/config aaa authentication login tacacs+
Aug 22 05:36:04.031595 r-leopard-58 INFO sudo: pam_unix(sudo:session): session opened for user root(uid=0) by admin(uid=0)
Aug 22 05:36:04.508660 r-leopard-58 INFO sudo: pam_unix(sudo:session): session closed for user root
Aug 22 05:36:04.527655 r-leopard-58 INFO sudo: pam_unix(sudo:session): session closed for user root
Aug 22 05:36:05.224504 r-leopard-58 NOTICE sudo:    admin : TTY=pts/0 ; PWD=/home/admin ; USER=root ; COMMAND=/bin/sh -c echo BECOME-SUCCESS-iqzqaxlckimwihiykzeehdvbmj
zoaatf ; /usr/bin/python /tmp/ansible-admin/ansible-tmp-1692671764.83-30185-6073818652019/AnsiballZ_command.py
Aug 22 05:36:05.225060 r-leopard-58 INFO sudo: pam_unix(sudo:session): session opened for user root(uid=0) by admin(uid=1000)
Aug 22 05:36:05.411326 r-leopard-58 NOTICE sudo:     root : TTY=pts/0 ; PWD=/home/admin ; USER=root ; COMMAND=/usr/local/bin/config aaa authorization local
Aug 22 05:36:05.411412 r-leopard-58 INFO sudo: pam_unix(sudo:session): session opened for user root(uid=0) by admin(uid=0)
Aug 22 05:36:05.919559 r-leopard-58 INFO sudo: pam_unix(sudo:session): session closed for user root
Aug 22 05:36:05.936338 r-leopard-58 INFO sudo: pam_unix(sudo:session): session closed for user root
Aug 22 05:36:06.665678 r-leopard-58 NOTICE sudo:    admin : TTY=pts/0 ; PWD=/home/admin ; USER=root ; COMMAND=/bin/sh -c echo BECOME-SUCCESS-ihqscywgiqfaewwrtieuubbobptuhinw ; /usr/bin/python /tmp/ansible-admin/ansible-tmp-1692671766.26-30202-56572839365419/AnsiballZ_command.py
Aug 22 05:36:06.666103 r-leopard-58 INFO sudo: pam_unix(sudo:session): session opened for user root(uid=0) by admin(uid=1000)
Aug 22 05:36:06.825991 r-leopard-58 NOTICE sudo:     root : TTY=pts/0 ; PWD=/home/admin ; USER=root ; COMMAND=/usr/local/bin/config aaa accounting disable
Aug 22 05:36:06.826086 r-leopard-58 INFO sudo: pam_unix(sudo:session): session opened for user root(uid=0) by admin(uid=0)
Aug 22 05:36:07.312690 r-leopard-58 INFO sudo: pam_unix(sudo:session): session closed for user root
Aug 22 05:36:07.327614 r-leopard-58 INFO sudo: pam_unix(sudo:session): session closed for user root
Aug 22 05:36:08.092491 r-leopard-58 NOTICE sudo:    admin : TTY=pts/0 ; PWD=/home/admin ; USER=root ; COMMAND=/bin/sh -c echo BECOME-SUCCESS-svrktowtqacnugdclmwfrtaqfsqyoikj ; /usr/bin/python /tmp/ansible-admin/ansible-tmp-1692671767.65-30219-172989430823752/AnsiballZ_command.py
Aug 22 05:36:08.092882 r-leopard-58 INFO sudo: pam_unix(sudo:session): session opened for user root(uid=0) by admin(uid=1000)
Aug 22 05:36:08.253019 r-leopard-58 NOTICE sudo:     root : TTY=pts/0 ; PWD=/home/admin ; USER=root ; COMMAND=/usr/sbin/deluser test_louser
Aug 22 05:36:08.253099 r-leopard-58 INFO sudo: pam_unix(sudo:session): session opened for user root(uid=0) by admin(uid=0)
Aug 22 05:36:08.354399 r-leopard-58 INFO sudo: pam_unix(sudo:session): session closed for user root
Aug 22 05:36:08.369374 r-leopard-58 INFO sudo: pam_unix(sudo:session): session closed for user root
Aug 22 05:36:09.142369 r-leopard-58 NOTICE sudo:    admin : TTY=pts/0 ; PWD=/home/admin ; USER=root ; COMMAND=/bin/sh -c echo BECOME-SUCCESS-xiomtmofgvjvlgryfffyjqrldttilhzy ; /usr/bin/python /tmp/ansible-admin/ansible-tmp-1692671768.69-30239-19604451434569/AnsiballZ_command.py
Aug 22 05:36:09.142777 r-leopard-58 INFO sudo: pam_unix(sudo:session): session opened for user root(uid=0) by admin(uid=1000)
Aug 22 05:36:09.319520 r-leopard-58 NOTICE sudo:     root : TTY=pts/0 ; PWD=/home/admin ; USER=root ; COMMAND=/usr/sbin/useradd test_louser
Aug 22 05:36:09.319624 r-leopard-58 INFO sudo: pam_unix(sudo:session): session opened for user root(uid=0) by admin(uid=0)
Aug 22 05:36:09.344545 r-leopard-58 ERR useradd[551849]: tac_connect_single: connection failed with 10.213.103.5:49: Transport endpoint is not connected
Aug 22 05:36:09.349236 r-leopard-58 INFO useradd[551849]: new group: name=test_louser, GID=1002
Aug 22 05:36:09.349376 r-leopard-58 INFO useradd[551849]: new user: name=test_louser, UID=1001, GID=1002, home=/home/test_louser, shell=/bin/sh, from=/dev/pts/0
Aug 22 05:36:09.362847 r-leopard-58 ERR useradd[551849]: tac_connect_single: connection failed with 10.213.103.5:49: Transport endpoint is not connected
Aug 22 05:36:09.373534 r-leopard-58 ERR useradd[551849]: tac_connect_single: connection failed with 10.213.103.5:49: Transport endpoint is not connected
Aug 22 05:36:09.398529 r-leopard-58 INFO sudo: pam_unix(sudo:session): session closed for user root
Aug 22 05:36:09.413390 r-leopard-58 INFO sudo: pam_unix(sudo:session): session closed for user root
Aug 22 05:36:10.160077 r-leopard-58 NOTICE sudo:    admin : TTY=pts/0 ; PWD=/home/admin ; USER=root ; COMMAND=/bin/sh -c echo BECOME-SUCCESS-zekysdetkmlbdicdxqsyhmeqwqlscpbp ; /usr/bin/python /tmp/ansible-admin/ansible-tmp-1692671769.73-30256-246132968064108/AnsiballZ_command.py
Aug 22 05:36:10.160822 r-leopard-58 INFO sudo: pam_unix(sudo:session): session opened for user root(uid=0) by admin(uid=1000)
Aug 22 05:36:10.341041 r-leopard-58 NOTICE sudo:     root : TTY=pts/0 ; PWD=/home/admin ; USER=root ; COMMAND=/usr/bin/echo test_louser:123456
Aug 22 05:36:10.341140 r-leopard-58 INFO sudo: pam_unix(sudo:session): session opened for user root(uid=0) by admin(uid=0)
Aug 22 05:36:10.342730 r-leopard-58 INFO sudo: pam_unix(sudo:session): session closed for user root
Aug 22 05:36:10.372533 r-leopard-58 NOTICE chpasswd[551879]: pam_unix(chpasswd:chauthtok): password changed for test_louser
Aug 22 05:36:10.390027 r-leopard-58 INFO sudo: pam_unix(sudo:session): session closed for user root
Aug 22 05:36:13.846955 r-leopard-58 NOTICE sudo:    admin : TTY=pts/0 ; PWD=/home/admin ; USER=root ; COMMAND=/bin/sh -c echo BECOME-SUCCESS-ncvhzhvgadthiiiepqoopxrwmvwjitxq ; /usr/bin/python /tmp/ansible-admin/ansible-tmp-1692671773.4-30306-214186855977485/AnsiballZ_command.py
Aug 22 05:36:13.847757 r-leopard-58 INFO sudo: pam_unix(sudo:session): session opened for user root(uid=0) by admin(uid=1000)
Aug 22 05:36:14.031228 r-leopard-58 INFO sudo: pam_unix(sudo:session): session closed for user root
Aug 22 05:36:16.072837 r-leopard-58 NOTICE sudo:    admin : TTY=pts/0 ; PWD=/home/admin ; USER=root ; COMMAND=/bin/sh -c echo BECOME-SUCCESS-dsitgtnxuwwcuqdzllorshsoqhdoekbh ; /usr/bin/python /tmp/ansible-admin/ansible-tmp-1692671775.59-30343-110710999146594/AnsiballZ_command.py
Aug 22 05:36:16.073359 r-leopard-58 INFO sudo: pam_unix(sudo:session): session opened for user root(uid=0) by admin(uid=1000)
Aug 22 05:36:16.354517 r-leopard-58 INFO sudo: pam_unix(sudo:session): session closed for user root

SONiC TACACS plugin sonic-buildimage/src/tacacs/bash_tacplus/bash_tacplus.c code:

/*
 * Send tacacs authorization request.
 * This method based on send_tacacs_auth in https://github.com/daveolson53/tacplus-auth/blob/master/tacplus-auth.c
 */
int tacacs_authorization(
    const char *user,
    const char *tty,
    const char *host,
    const char *cmd,
    char **args,
    int argc)
{
    int result = 1, server_idx, server_fd, connected_servers=0;
    uint16_t task_id = (uint16_t)getpid();

    for(server_idx = 0; server_idx < tac_srv_no; server_idx++) {
        server_fd = tac_connect_single(tac_srv[server_idx].addr, tac_srv[server_idx].key, tac_source_addr, tac_timeout, __vrfname);
        if(server_fd < 0) {
            // connect to tacacs server failed
            output_error("Failed to connecting to %s to request authorization for %s: %s\n", tac_ntop(tac_srv[server_idx].addr->ai_addr), cmd, strerror(errno));
            continue;
        }

    ...

    return result;
}

TACACS API https://github.com/kravietz/pam_tacplus code:

/* return value:
 *   >= 0 : valid fd
 *   <  0 : error status code, see LIBTAC_STATUS_...
 */
int tac_connect_single(const struct addrinfo *server, const char *key, struct addrinfo *srcaddr, int timeout)
{
    int retval = LIBTAC_STATUS_CONN_ERR; /* default retval */
    int fd = -1;
    int flags, rc;
    struct pollfd pfd;
    socklen_t len;
    struct sockaddr_storage addr;
    char *ip;

    ...

    /* some other error or interrupt before timeout */
    if (rc < 0)
    {
        TACSYSLOG(LOG_ERR,
                  "%s: connection failed with %s: %m", __FUNCTION__, ip);
        goto bomb;
    }

    /* unexpected revent */
    if ((pfd.revents & (POLLIN | POLLOUT)) == 0)
    {
        TACSYSLOG(LOG_ERR,
                  "%s: poll failed with zero revents %s: %m", __FUNCTION__, ip);
        goto bomb;
    }

    /* check with getpeername if we have a valid connection */
    len = sizeof addr;
    if (getpeername(fd, (struct sockaddr *)&addr, &len) == -1)
    {
        TACSYSLOG(LOG_ERR,
                  "%s: connection failed with %s: %m", __FUNCTION__, ip);
        goto bomb;
    }

    ...

bomb:
    if (retval < 0 && fd != -1)
        close(fd);

    TACDEBUG(LOG_DEBUG, "%s: exit status=%d (fd=%d)",
             __FUNCTION__, retval < 0 ? retval : 0, fd);
    return retval;
} /* tac_connect_single */

Steps to reproduce the issue:

  1. Run test:
    py.test tacacs/test_ro_user.py --inventory="../ansible/inventory,../ansible/veos" --host-pattern r-leopard-58 --module-path ../ansible/library/ --testbed r-leopard-58-t1-lag --testbed_file ../ansible/testbed.csv --allow_recover  --assert plain --log-cli-level debug --show-capture=stdout --disable_loganalyzer -ra --showlocals -v --clean-alluredir --alluredir=/tmp/allure-results --allure_server_addr="10.215.11.120" --topology t1,any,util --allure_server_project_id=r-leopard-58-debug --skip_sanity --dynamic_update_skip_reason

Describe the results you received:

Aug 22 05:36:09.344545 r-leopard-58 ERR useradd[551849]: tac_connect_single: connection failed with 10.213.103.5:49: Transport endpoint is not connected

Describe the results you expected:

No errors expected

Output of show version:

Output of show techsupport:

Additional information you deem important (e.g. issue happens only occasionally):

prgeor commented 1 year ago

@adyeung as discussed can you reassign to broadcom.

adyeung commented 12 months ago

@jeff-yin can someone from DELL help take a look pls?

jeff-yin commented 12 months ago

@adyeung has this been root-caused to be a TACACS authentication issue? It appears this has more to do with authorization and AFAIK, Dell didn't implement TACACS per-command authorization.

I think this needs to be assigned to @liuh-80 from MSFT

jeff-yin commented 12 months ago

Or, could just be that these errors are due to failed attempts to connect to a server in a configured list of servers. Looks like tacacs_authorization will try a list of servers via multiple calls to tac_connect_single. If an unreachable server is tried before a reachable one, the error will be printed.

In that case, might this just be a testbed issue where one of the servers is not reachable, and therefore the error message is legit? @nazariig can you confirm?

BTW, I do not see the configured TACACS+ server in the techdump, so I can't tell if a list of servers was configured. The only TACACS+ server that was configured does not match the server from the error log in this Github ticket.

  "TACPLUS_SERVER|10.7.34.20": {
    "expireat": 1692672170.9226546,
    "ttl": -0.001,
    "type": "hash",
    "value": {
      "priority": "1",
      "tcp_port": "49"
    }
  },

Aug 22 05:36:09.344545 r-leopard-58 ERR useradd[551849]: tac_connect_single: connection failed with 10.213.103.5:49: Transport endpoint is not connected

liuh-80 commented 12 months ago

@nazariig , this seems a network issue, please check if the address 10.213.103.5 is reachable from your testbed first.

And if that IP address is reachable, please check if the tacplus server config and start correctly on that device. also the 49 port does not block by firewall.

Also seems you are running the TACACS test on your own testbed, so please check the sonic-mgmt config files, to make sure the testbed configuration are correct.

volodymyrsamotiy commented 8 months ago

This issue was discussed on triage meeting some time back but note was not added to this ticket so adding it here: Nazarii answered that he checked and it is not network issue, configuration files also look OK.