CESNET / netopeer2

NETCONF toolset
BSD 3-Clause "New" or "Revised" License

NETCONF server sometimes does not call home when "sr_set_item" is used to configure the Call Home info #1510

Open fei15115 opened 7 months ago

fei15115 commented 7 months ago

When I apply all the items I set, the call just returns OK. When I then read the Call Home configuration back through netopeer2-cli, it is all correct, as shown below: [image] However, nothing about Call Home is printed on the netopeer2-server side. I then added some debug logging to the libnetconf code and to the Netopeer code, and found that the data change callback is not called when the problem occurs. Please check the image below: [image]

Should I add more logging in sysrepo to figure out what is going on with the NETCONF datastore? Why is the callback not triggered when the data changes?

michalvasko commented 7 months ago

Please post the previous configuration and the edit that you tried to apply when the callback was not triggered.

fei15115 commented 7 months ago

Hi Michalvasko, I will share the details later; the device is not with me at the moment. But I can share the output of the prints I added to the source code, which may help. In the success case: you can see that the line "slave_au_call_home_client_ip = 192.168.50.21" is printed when I use "sr_set_item" to set the nodes, and the callback is triggered as normal. [image]

In the failure case: the callback is not triggered, and we always see "SSH key exchange timeout.", which relates to another Call Home node (please check screenshot 2). [image] Screenshot 2: [image] This is where I added the print: [image]

michalvasko commented 7 months ago

Okay, based on this output the actual problem is the SSH key exchange timeout, right? And not a non-triggered netopeer2 callback. The timeout used to appear with older libssh versions and also your NETCONF client may be at fault.

fei15115 commented 7 months ago

Hi Michalvasko, I am just wondering whether there is any relationship between the SSH key exchange timeout and the callback not being triggered. (When the SSH key exchange timed out, the network card had not yet got an IP address to connect to the client, so that timeout is expected.) We also found that if we do not configure "ietf-netconf-server" in the startup datastore, the callback is always triggered. (In startup we have configured the Call Home info of another client, not the same one: 192.168.50.1 is in startup, and 192.168.50.21 is the one I want to configure.) Please let us know if you need more info to check this issue.

michalvasko commented 7 months ago

I am afraid I will really need the previous configuration and the edit you applied, when the new Call Home client has not been initiated, to reproduce it.

fei15115 commented 6 months ago

Hi Michalvasko, please check the datastore and the code we are using:

    > get-config --source startup

DATA

<data xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
  <keystore xmlns="urn:ietf:params:xml:ns:yang:ietf-keystore">
    <asymmetric-keys>
      <asymmetric-key>
        <name>genkey</name>
        <algorithm>rsa2048</algorithm>
        <public-key>MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAsiMKWxZYUrV+cjatGsLZ2iE/ZTWBIYJL4BqnwwR/y1CayDzcCJi/OXy2QQFfjuTHFLg82IG6xTObU4zVyofUWgkUBDNUHZOcj7DQtKcXvjNu5SLIeHYzN5t8EPRLzOTv9YAt82Bt8Z3H1OcHZDN7CFumfb1jNY9AvrH/2t7e/wC7BTIHV9An6GXVMUov1ckl4PJwaOq6clR2WzQg+rLPaI6yUaxgtX28i+uyE/rhRKY39lgev9GA8VBONkgokgSfsV4vlZ59W8Hc/LdSrkpusAlNVzm4nxyR+KO0jS2fygGv+NkNXrCFjXAQSRYGv9YnH3JGvQ8pap4mueB9o9ZENwIDAQAB</public-key>
        <private-key>MIIEvAIBADANBgkqhkiG9w0BAQEFAASCBKYwggSiAgEAAoIBAQCyIwpbFlhStX5yNq0awtnaIT9lNYEhgkvgGqfDBH/LUJrIPNwImL85fLZBAV+O5McUuDzYgbrFM5tTjNXKh9RaCRQEM1Qdk5yPsNC0pxe+M27lIsh4djM3m3wQ9EvM5O/1gC3zYG3xncfU5wdkM3sIW6Z9vWM1j0C+sf/a3t7/ALsFMgdX0CfoZdUxSi/VySXg8nBo6rpyVHZbNCD6ss9ojrJRrGC1fbyL67IT+uFEpjf2WB6/0YDxUE42SCiSBJ+xXi+Vnn1bwdz8t1KuSm6wCU1XObifHJH4o7SNLZ/KAa/42Q1esIWNcBBJFga/1icfcka9Dylqnia54H2j1kQ3AgMBAAECggEALNlNnik+C3TOZQsHAhnUp2p/f4e8/ybC26VaX2sekZ21mnxYGYH7gwm3CR7DZEKNLNZF22yuDUA09xAaM2eHOzPV6kjSALBNVo/5u8Hl5GkmnmHnfwyjUHjA/7PB8dAh6flfmErK424KBbw0zDF55FzOyhGIjM+ISXqfJAfAtQqaJYuLe/D9Wi5V6/kdx7MHUUNrG+qiUzCCUDUjPCqpqmQgUZjgIrEeVwNMGFIHWwFv7HsASRkCTZuLLkW8sAoagoyH658Vad2iiAuy2rDDVK0NowFo8uBox8h2/xzqz8dOzXvD26ZCVKF5vc3GZVdrx0x0R21FQJIVMrCuKn4bIQKBgQDn9FrHj1A8Pw8LLGGOOVZ1OKRCq4VQgP0RI1DBNN7O67Z2tdUXoYsPU7eJuMHQ5jBnGYsWhS8fI0or1U5RQkIKx5XcsKjv8onZP8tnxRCgV0nhxsfvJMChisPItIOyoXpE/9wxXeOFYIQWDjvUPwIekCusAZ1fvdpmfug06GXlZQKBgQDEmncKC/PEg5xrZypjaesKdGFZgLxjVgWTgSDSoHAuFfuA13ZxrBniB/UOS/wwAPwRz2mZYyPJp1TggRvtsBjOburabq/ngh6wDeBEbWu7xKpIn4o9VzY2Lq18JNA//5kgjLbVjtMdLMz48BOpjJdx5XlwSNtHwOHybOZF88knawKBgB7fjFGxhpluPz9aeeWnRhW2I07oa4cqlAR68d21fs6F2zRzwVgy3UJ9/xjqqYl3igu+/59QvNPlK5MoAhOYwReUNyM3tFSzsJtk/VrjhPICjEfr4GK5PpaB1MtbE4hsK80RTSqY95aiIRKadGYsuMh+ogFz+ZFrwK0RyTB5mk5tAoGAVoUHh/NUlpG4v4dKHy/YkORAhyvhO/H6SDyWXjrew1lHMh8f78xmI5OO43jLBbEZPRlDBo6bjD3IW3hV+xb5A7fKQNBfNwmLSb0BifuBYsOckJMtOetsXxHRpQVqZA+uqqViPL865ub1WUQF0yKc7zGmbKSTY5Ndm9sSx4wOZPsCgYAMwWrObXN1O5CzWis3Hxh2txH1psdKnphC7ioRgs4ITgT85GHmCmnyt7JzLxOpLseut0IvUZcmmQWzJZ2ykgyaOJh8jf5g1z0EOMPMWLIEU9XzYRd3/x1+G2O1GzYz6HwKvbvctgxU6gtHoMCdUXK5X6H99Gg5lGjQmbMIWLYAoA==</private-key>
      </asymmetric-key>
    </asymmetric-keys>
  </keystore>
  <netconf-server xmlns="urn:ietf:params:xml:ns:yang:ietf-netconf-server">
    <listen>
      <endpoint>
        <name>default-ssh</name>
        <ssh>
          <tcp-server-parameters>
            <local-address>0.0.0.0</local-address>
            <keepalives>
              <idle-time>1</idle-time>
              <max-probes>10</max-probes>
              <probe-interval>5</probe-interval>
            </keepalives>
          </tcp-server-parameters>
          <ssh-server-parameters>
            <server-identity>
              <host-key>
                <name>default-key</name>
                <public-key>
                  <keystore-reference>genkey</keystore-reference>
                </public-key>
              </host-key>
            </server-identity>
            <client-authentication>
              <supported-authentication-methods>
                <publickey/>
                <passsword/>
                <other>interactive</other>
              </supported-authentication-methods>
            </client-authentication>
          </ssh-server-parameters>
        </ssh>
      </endpoint>
      <endpoint>
        <name>default-ssh2</name>
        <ssh>
          <tcp-server-parameters>
            <local-address>::</local-address>
            <keepalives>
              <idle-time>1</idle-time>
              <max-probes>10</max-probes>
              <probe-interval>5</probe-interval>
            </keepalives>
          </tcp-server-parameters>
          <ssh-server-parameters>
            <server-identity>
              <host-key>
                <name>default-key</name>
                <public-key>
                  <keystore-reference>genkey</keystore-reference>
                </public-key>
              </host-key>
            </server-identity>
            <client-authentication>
              <supported-authentication-methods>
                <publickey/>
                <passsword/>
              </supported-authentication-methods>
            </client-authentication>
          </ssh-server-parameters>
        </ssh>
      </endpoint>
    </listen>
    <call-home>
      <netconf-client>
        <name>default-client</name>
        <endpoints>
          <endpoint>
            <name>default-ssh</name>
            <ssh>
              <tcp-client-parameters>
                <remote-address>192.168.50.1</remote-address>
                <remote-port>4334</remote-port>
                <keepalives>
                  <idle-time>1</idle-time>
                  <max-probes>10</max-probes>
                  <probe-interval>5</probe-interval>
                </keepalives>
              </tcp-client-parameters>
              <ssh-server-parameters>
                <server-identity>
                  <host-key>
                    <name>default-key</name>
                    <public-key>
                      <keystore-reference>genkey</keystore-reference>
                    </public-key>
                  </host-key>
                </server-identity>
                <client-authentication>
                  <supported-authentication-methods>
                    <passsword/>
                  </supported-authentication-methods>
                </client-authentication>
              </ssh-server-parameters>
            </ssh>
          </endpoint>
        </endpoints>
        <connection-type>
          <persistent/>
        </connection-type>
      </netconf-client>
    </call-home>
  </netconf-server>
</data>

The code is as below:

int Oran_Set_Item(sr_session_ctx_t *session,char* xpath,const sr_val_t *value,const sr_edit_options_t opts)
{
    int rc = SR_ERR_OK;
     /* set the value */
     printf("set %s\r\n",xpath);
    rc = sr_set_item(session, xpath, value, opts);
    if (rc != SR_ERR_OK) {
        fprintf(stderr, "\r\n\r\n[%s][path:%s]set error with[%d]\r\n\r\n",__func__,xpath,rc);
    }
    return rc;
}

uint8_t load_slave_call_home_ip_port(sr_session_ctx_t *session,char* slave_call_home_ip, uint16_t slave_call_home_port, char* slave_user_name, char* slave_au_password)
{
     sr_val_t *value = NULL;
     uint8_t count = 0;
     int rc = SR_ERR_OK;
     char xpath[300];
     sr_new_values(5,&value);

    value[count].type=SR_STRING_T;
    value[count].data.instanceid_val="genkey";
    memset(xpath,0, sizeof(xpath));
    sprintf(xpath, "/ietf-netconf-server:netconf-server/call-home/netconf-client[name='%s']/endpoints/endpoint[name='client']/ssh/ssh-server-parameters/server-identity/host-key[name='default-key']/public-key/keystore-reference\n", slave_user_name);
    rc = Oran_Set_Item(session, xpath, &value[count], SR_EDIT_DEFAULT);
    if (rc != SR_ERR_OK)
    {
        return rc;
    }
    count++;

    value[count].type=SR_STRING_T;
    value[count].data.instanceid_val="ssh";
    memset(xpath,0, sizeof(xpath));
    sprintf(xpath, "/ietf-netconf-server:netconf-server/call-home/netconf-client[name='%s']/connection-type/persistent\n", slave_user_name);
    rc = Oran_Set_Item(session, xpath, &value[count], SR_EDIT_DEFAULT);
    if (rc != SR_ERR_OK)
    {
        return rc;
    }
    count++;    

    value[count].type=SR_STRING_T;                
    value[count].data.string_val = slave_call_home_ip;
    memset(xpath,0, sizeof(xpath));
    sprintf(xpath, "/ietf-netconf-server:netconf-server/call-home/netconf-client[name='%s']/endpoints/endpoint[name='client']/ssh/tcp-client-parameters/remote-address\n", slave_user_name);
    rc = Oran_Set_Item(session, xpath, &value[count], SR_EDIT_DEFAULT);
    if (rc != SR_ERR_OK)
    {
        return rc;
    }
    count++;

    value[count].type=SR_UINT16_T;
    memset(xpath,0, sizeof(xpath));
    sprintf(xpath, "/ietf-netconf-server:netconf-server/call-home/netconf-client[name='%s']/endpoints/endpoint[name='client']/ssh/tcp-client-parameters/remote-port\n", slave_user_name);
    value[count].data.uint16_val = slave_call_home_port;
    rc = Oran_Set_Item(session, xpath, &value[count], SR_EDIT_DEFAULT);
    if (rc != SR_ERR_OK)
    {
        return rc;
    }
    count++;

    value[count].type=SR_LEAF_EMPTY_T;
    value[count].data.instanceid_val = slave_au_password;
    memset(xpath,0, sizeof(xpath));
    sprintf(xpath, "/ietf-netconf-server:netconf-server/call-home/netconf-client[name='%s']/endpoints/endpoint[name='client']/ssh/ssh-server-parameters/client-authentication/supported-authentication-methods/passsword\n", slave_user_name);
    rc = Oran_Set_Item(session, xpath, &value[count], SR_EDIT_DEFAULT);
    if (rc != SR_ERR_OK)
    {
        return rc;
    }

    rc = sr_apply_changes(session, 0);
    if (rc != SR_ERR_OK) {
        fprintf(stderr, "\r\n\r\napply name error with[%d]\r\n\r\n",rc);
    }
    return rc;
}

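One detail in the code above that may be worth ruling out: every `sprintf` format string ends with `\n`, so the newline becomes part of the XPath passed to `sr_set_item`, and the 300-byte buffer is filled without a length check. Whether this is related to the missed callback is not established in this thread. The following is only a minimal sketch, assuming the same path layout as the posted code, of building the path without the trailing newline and with truncation detection. `build_ch_xpath` and its parameters are hypothetical names, not part of sysrepo or of the posted code.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not a sysrepo API): formats the XPath of a node under
 * one endpoint of a Call Home client into `buf`.
 * Returns the length of the path on success, or -1 if formatting failed or
 * the path did not fit into `len` bytes (including the terminating NUL). */
int build_ch_xpath(char *buf, size_t len, const char *client_name,
                   const char *endpoint_name, const char *node_path)
{
    int n = snprintf(buf, len,
                     "/ietf-netconf-server:netconf-server/call-home"
                     "/netconf-client[name='%s']"
                     "/endpoints/endpoint[name='%s']/%s",
                     client_name, endpoint_name, node_path);

    /* snprintf returns the length it needed; >= len means truncation. */
    if (n < 0 || (size_t)n >= len) {
        return -1;
    }
    return n;
}
```

A caller would check the return value before handing `buf` to `sr_set_item`, for example `build_ch_xpath(xpath, sizeof(xpath), slave_user_name, "client", "ssh/tcp-client-parameters/remote-port")`.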
michalvasko commented 6 months ago

One last question I forgot to ask, what netopeer2 version are you using?

fei15115 commented 6 months ago

Hi Michalvasko, the version is as below: [image]

michalvasko commented 6 months ago

I have tested it in the current master, had the configuration you posted here and then ran your code to add another CH client, the server then started connecting to the other client as well.

fei15115 commented 6 months ago

As I said before, this problem does not occur every time. I found that it is related to whether the startup datastore was configured with sysrepocfg shortly beforehand (within 10 s): when I use the sysrepocfg command, it is very easy to reproduce, but when I do not use it, it is much harder to reproduce. Where can I add some prints to help us find the root of the problem?

michalvasko commented 6 months ago

Tell me how to reproduce it with sysrepocfg, I just need to reproduce it, I do not care how.

fei15115 commented 6 months ago

Did you configure the startup datastore (via sysrepocfg) every time before the app (the second Call Home app) comes up? For example, with the command below:

    sysrepocfg --edit=/drgfly/etc/ssh_callhome_change.xml --format=xml --datastore=startup -m ietf-netconf-server -l -v3

Steps to reproduce:

1. Keep the startup datastore as posted above.
2. Use sysrepocfg to configure the startup datastore and the running datastore.
3. Use the API to configure the second Call Home IP.

If you have better suggestions for debug prints, I will add them locally and collect the output in an environment where the problem can be reproduced, to speed up solving it.

michalvasko commented 6 months ago

I just need to reproduce it and I do not think any additional messages would help. So it should be enough if you write the exact commands with all data that you executed.

fei15115 commented 6 months ago

Command:

    sysrepocfg --edit=/drgfly/etc/ssh_callhome_change.xml --format=xml --datastore=startup -m ietf-netconf-server -l -v3

fei15115 commented 5 months ago

Is there any update on this issue? Or could we add some debug logging to the code to check this?

michalvasko commented 5 months ago

I am not able to reproduce this even though, based on what you wrote, it should not be difficult, so I am not able to help or suggest anything.