CESNET / netopeer2

NETCONF toolset
BSD 3-Clause "New" or "Revised" License

Observed Crash while connecting Netopeer2Server with Third party Cli #1562

Closed AravindaSwamy closed 3 months ago

AravindaSwamy commented 5 months ago

Hi Team,

We have successfully compiled the latest version of Netopeer. While we are able to connect to the Netopeer2Server using Netopeer2Cli with this compiled code, attempting to connect with third-party applications like VS Code or Atom Editor results in a segmentation fault (seg fault) within Netopeer2Server.

Netopeer2Server logs :-

root@CGOLT# sudo LD_LIBRARY_PATH=lib bin/sysrepotool.sh sbin/netopeer2-server -d -v3 -t 10000
[INF]: SR: Datastore copied from <startup> to <running>.
[INF]: SR: Connection 1 created.
[INF]: SR: Session 1 (user "root", CID 1) created.
[INF]: SR: Triggering "ietf-netconf-server" "done" event on enabled data.
[INF]: LN: Listening on 0.0.0.0:10830 for SSH connections.
[INF]: SR: Triggering "ietf-keystore" "done" event on enabled data.
[INF]: SR: Triggering "ietf-truststore" "done" event on enabled data.
[INF]: SR: Triggering "ietf-netconf-acm" "done" event on enabled data.
[INF]: SR: Triggering "ietf-netconf-acm" "done" event on enabled data.
[INF]: SR: Triggering "ietf-netconf-acm" "done" event on enabled data.
[INF]: SR: Triggering "ietf-netconf-acm" "done" event on enabled data.

[INF]: LN: Accepted a connection on 0.0.0.0:10830 from 24.1.1.2:45220.
[INF]: LN: Received an SSH message "request-service" of subtype "ssh-userauth".
[INF]: LN: Received an SSH message "request-auth" of subtype "none".
[INF]: LN: Received an SSH message "request-auth" of subtype "password".
bin/sysrepotool.sh: line 32: 17661 Segmentation fault      (core dumped) $tool_name $*

Your insights on this issue would be appreciated.

Thanks, Aravind.

michalvasko commented 5 months ago

This should be fixed in the current libnetconf2 devel, try it please.

AravindaSwamy commented 5 months ago

Hi Michal,

We compiled the Netopeer2 development branch successfully. After starting the Netopeer2 server, we connected to the Netopeer2 CLI without any issues. However, when attempting to connect using VS Code installed on the same PC, we encountered connection problems. Please advise if there are any steps or configurations we may have overlooked.

Below is the Server Log

[INF]: SR: Connection 20 created.
[INF]: SR: Session 23 (user "root", CID 20) created.
[INF]: SR: Triggering "ietf-netconf-server" "done" event on enabled data.
[INF]: LN: Listening on 0.0.0.0:830 for SSH connections.
[INF]: SR: Triggering "ietf-keystore" "done" event on enabled data.
[INF]: SR: Triggering "ietf-truststore" "done" event on enabled data.
[INF]: SR: Triggering "ietf-netconf-acm" "done" event on enabled data.
[INF]: SR: Triggering "ietf-netconf-acm" "done" event on enabled data.
[INF]: SR: Triggering "ietf-netconf-acm" "done" event on enabled data.
[INF]: SR: Triggering "ietf-netconf-acm" "done" event on enabled data.

[INF]: LN: Accepted a connection on 0.0.0.0:830 from 127.0.0.1:34506.
[INF]: LN: Received an SSH message "request-service" of subtype "ssh-userauth".
[INF]: LN: Received an SSH message "request-auth" of subtype "none".
[INF]: LN: Received an SSH message "request-auth" of subtype "interactive".
[INF]: LN: User "root" authenticated.
[INF]: LN: Received an SSH message "request-channel-open" of subtype "session".
[INF]: LN: Received an SSH message "request-channel" of subtype "subsystem".
[INF]: SR: Session 31 (user "root", CID 20) created.
[INF]: SR: There are no subscribers for "ietf-netconf-notifications" notifications.
[INF]: NP: Generated new event (netconf-session-start).
[INF]: SR: EV ORIGIN: "/ietf-netconf-monitoring:get-schema" "rpc" ID 1 priority 0 for 1 subscribers published.
[INF]: SR: EV LISTEN: "/ietf-netconf-monitoring:get-schema" "rpc" ID 1 priority 0 processing (remaining 1 subscribers).
[INF]: NP: Module "ietf-datastores@<any>" was requested.
[INF]: SR: EV LISTEN: "/ietf-netconf-monitoring:get-schema" "rpc" ID 1 priority 0 success (remaining 0 subscribers).
[INF]: SR: EV ORIGIN: "/ietf-netconf-monitoring:get-schema" "rpc" ID 1 priority 0 succeeded.
[INF]: NP: Session 1: thread 1 event new RPC.
[INF]: SR: EV ORIGIN: "/ietf-netconf:get" "rpc" ID 1 priority 0 for 1 subscribers published.
[INF]: SR: EV LISTEN: "/ietf-netconf:get" "rpc" ID 1 priority 0 processing (remaining 1 subscribers).
[INF]: SR: EV LISTEN: "/ietf-netconf:get" "rpc" ID 1 priority 0 success (remaining 0 subscribers).
[INF]: SR: EV ORIGIN: "/ietf-netconf:get" "rpc" ID 1 priority 0 succeeded.
[INF]: NP: Session 1: thread 1 event new RPC.
[INF]: SR: EV ORIGIN: "/ietf-netconf:get" "rpc" ID 2 priority 0 for 1 subscribers published.
[INF]: SR: EV LISTEN: "/ietf-netconf:get" "rpc" ID 2 priority 0 processing (remaining 1 subscribers).
[INF]: SR: EV LISTEN: "/ietf-netconf:get" "rpc" ID 2 priority 0 success (remaining 0 subscribers).
[INF]: SR: EV ORIGIN: "/ietf-netconf:get" "rpc" ID 2 priority 0 succeeded.
[INF]: NP: Session 1: thread 1 event new RPC.

Below are the VS Code NETCONF server list details:

"netconf.serverList": [
    {
        "id": "zy1",  //device name
        "host": "0.0.0.0",  //olt ip
        "port": 830,
        "username": "root",
        "password": "admin123"
    }
]

Please let us know your inputs on this.

Thank you.

michalvasko commented 5 months ago

What are "connection issues"? The server has not printed anything unexpected, the connected client should work correctly.

AravindaSwamy commented 5 months ago

Hi Michal,

Sorry for the wrong log. Please find the below updated log.

Netopeer Server Log :-

[INF]: LN: Accepted a connection on 0.0.0.0:830 from 127.0.0.1:42856.
[INF]: LN: Received an SSH message "request-service" of subtype "ssh-userauth".
[INF]: LN: Received an SSH message "request-auth" of subtype "none".
[INF]: LN: Received an SSH message "request-auth" of subtype "password".
[INF]: LN: User "root" does not have password method configured, but a request was received.
[INF]: LN: Failed user "root" authentication attempt (#1).
[2024/04/18 22:25:37.840937, 1] ssh_packet_disconnect_callback:  Received SSH_MSG_DISCONNECT: 11:
[ERR]: LN: Communication SSH socket unexpectedly closed.

VS Code Server List :-

"netconf.serverList": [
    {
        "id": "zy1",  //device name
        "host": "0.0.0.0",  //olt ip
        "port": 830,
        "username": "root",
        "password": "admin123"
    }
]

Please let us know your inputs.

Thank You.

michalvasko commented 5 months ago

I see you are using a fairly old netopeer2 version that is loading users from the local system. In your case root is not allowed to log in using a password so the same behavior is observed when trying to connect to netopeer2. If you use the current version, all the authorized users are configured in the server YANG data so you can easily add root there with whatever password.

AravindaSwamy commented 5 months ago

Hi Michal,

We compiled the latest Netopeer2 devel branch and launched the Netopeer2 server binary generated in the build directory. With Netopeer2Cli we are able to connect with root credentials. But when we try to connect from VS Code, the server throws an error.

Thanks.

michalvasko commented 5 months ago

Sorry, yes, the output is from the current version. Note that the CLI is probably not using password authentication but actually keyboard-interactive, which usually only asks for the user password as well. Use sysrepocfg -X -m ietf-netconf-server to see the current server configuration. In the examples there is an XML file that you can use to check and change the server configuration to set the authentication you want.
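
The difference between the two clients is visible in the server logs already posted in this thread: the "request-auth" lines show which SSH authentication method each client asked for. The following is a minimal sketch of pulling that sequence out of a log. The function name and the two sample constants are hypothetical; the log lines themselves are copied from the excerpts above, and the pattern only covers the message wording seen there.

```python
import re

# Pattern for the libnetconf2 ("LN") log lines quoted in this thread, e.g.
#   [INF]: LN: Received an SSH message "request-auth" of subtype "password".
AUTH_REQUEST = re.compile(
    r'Received an SSH message "request-auth" of subtype "([^"]+)"'
)


def auth_methods_tried(log_text):
    """Return the SSH auth subtypes in the order the client requested them."""
    return AUTH_REQUEST.findall(log_text)


# Excerpt from the log in which the user was authenticated.
SUCCESSFUL_LOG = '''
[INF]: LN: Received an SSH message "request-service" of subtype "ssh-userauth".
[INF]: LN: Received an SSH message "request-auth" of subtype "none".
[INF]: LN: Received an SSH message "request-auth" of subtype "interactive".
[INF]: LN: User "root" authenticated.
'''

# Excerpt from the log in which authentication failed.
FAILED_LOG = '''
[INF]: LN: Received an SSH message "request-auth" of subtype "none".
[INF]: LN: Received an SSH message "request-auth" of subtype "password".
[INF]: LN: User "root" does not have password method configured, but a request was received.
'''

print(auth_methods_tried(SUCCESSFUL_LOG))  # ['none', 'interactive']
print(auth_methods_tried(FAILED_LOG))      # ['none', 'password']
```

If the failing client asks for "password" while the configured user only has keyboard-interactive, the configuration has to be extended rather than the client debugged.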

AravindaSwamy commented 5 months ago

I have updated the "ssh_listen.xml" file as shown below, started the netopeer2 server, and tried to connect, but it is still not connecting.

<users>
  <user>
    <name>user</name> <!-- User name that can use this authorized key(s) to authenticate itself -->
    <password>admin123</password>
  </user>
</users>

And below is the Netopeer server configuration:

<netconf-server xmlns="urn:ietf:params:xml:ns:yang:ietf-netconf-server">
  <listen>
    <endpoints>
      <endpoint>
        <name>default-ssh</name>
        <ssh>
          <tcp-server-parameters>
            <local-address>0.0.0.0</local-address>
          </tcp-server-parameters>
          <ssh-server-parameters>
            <server-identity>
              <host-key>
                <name>default-key</name>
                <public-key>
                  <central-keystore-reference>genkey</central-keystore-reference>
                </public-key>
              </host-key>
            </server-identity>
            <client-authentication>
              <users>
                <user>
                  <name>root</name>
                  <keyboard-interactive xmlns="urn:cesnet:libnetconf2-netconf-server">
                    <use-system-auth/>
                  </keyboard-interactive>
                </user>
              </users>
            </client-authentication>
          </ssh-server-parameters>
        </ssh>
      </endpoint>
    </endpoints>
  </listen>
</netconf-server>

I have observed that the authentication method is not updated in the configuration. Can you share your inputs on this?

Thanks.

michalvasko commented 5 months ago

Well, yes, you need to actually write the configuration into sysrepo. Run sysrepocfg -Evim -m ietf-netconf-server and then edit the configuration the way you want.

AravindaSwamy commented 5 months ago

Hi Michal,

We have successfully connected to the Netopeer2 Server using VS Code. Additionally, we tested SSH Call Home with the same devel branch and confirmed its proper functionality. However, when attempting to test TLS Call Home and pushing the ssh_keystore.xml script, Netopeer2Server crashes. Below are the logs for reference.

[INF]: NP: Session 1: thread 1 event new RPC.
[INF]: SR: EV ORIGIN: "/ietf-netconf:edit-config" "rpc" ID 1 priority 0 for 1 subscribers published.
[INF]: SR: EV LISTEN: "/ietf-netconf:edit-config" "rpc" ID 1 priority 0 processing (remaining 1 subscribers).
[INF]: NP: edit-config error-option "stop-on-error" not supported, rollback-on-error will be performed.
[WRN]: SR: Recovered a read-lock of CID 2 (sr_shmmod_lock).
[WRN]: SR: Recovered a read-upgr-lock of CID 2 (sr_shmmod_lock).
[WRN]: SR: Recovered a read-lock of CID 2 (sr_shmmod_lock).
[WRN]: SR: Recovered a read-upgr-lock of CID 2 (sr_shmmod_lock).
Segmentation fault (core dumped)

Please share your inputs on this.

Thanks.

michalvasko commented 5 months ago

Are you saying you are using the current devel branch of netopeer2? What about the other projects? But I cannot really help you, I doubt I could reproduce the crash so it will be a problem on your end.

AravindaSwamy commented 5 months ago

Yes, we are using the current devel branch of libnetconf2, libyang, netopeer2 and sysrepo.

michalvasko commented 5 months ago

I see, can you then please explain the messages

[WRN]: SR: Recovered a read-lock of CID 2 (sr_shmmod_lock).
[WRN]: SR: Recovered a read-upgr-lock of CID 2 (sr_shmmod_lock).
[WRN]: SR: Recovered a read-lock of CID 2 (sr_shmmod_lock).
[WRN]: SR: Recovered a read-upgr-lock of CID 2 (sr_shmmod_lock).

They should definitely not cause a crash but it means the server was not properly terminated.

AravindaSwamy commented 5 months ago

After clearing the cache and restarting the server, we pushed the tls_keystore.xml script, but the crash persisted, as indicated by the following traces. However, the warning traces mentioned above no longer appeared.

[INF]: SR: EV ORIGIN: "/ietf-netconf:edit-config" "rpc" ID 1 priority 0 for 1 subscribers published.
[INF]: SR: EV LISTEN: "/ietf-netconf:edit-config" "rpc" ID 1 priority 0 processing (remaining 1 subscribers).
[INF]: NP: edit-config error-option "stop-on-error" not supported, rollback-on-error will be performed.
Segmentation fault (core dumped)

michalvasko commented 5 months ago

Okay, sorry, you are right, there was an unexpected bug from a PR. Should be fixed, use the latest libyang devel.

AravindaSwamy commented 5 months ago

Hi Michal,

Thank you for your assistance. The issue has been resolved in the latest devel branch. However, I encountered an error while attempting to change the authentication method within the OLT. Here is the corresponding error log:

root@CGOLT# sysrepocfg -Evi -m ietf-netconf-server
[ERR] Data model "bbf-alarm-types@2020-10-13" not found in local searchdirs.
[ERR] Loading "bbf-alarm-types" module failed.
sysrepocfg error: Failed to connect (libyang error)
For more details you may try to increase the verbosity up to "-v3".

Could you please provide guidance on resolving this issue?

Thank You, Aravind

michalvasko commented 5 months ago

The error seems rather odd. My first thought is that you have mixed sysrepo installations and the corresponding repository paths, which is why an installed module cannot be found.

AravindaSwamy commented 4 months ago

Hi Michal,

Can you please share your input on the above issue.

Thank You, Aravind.

michalvasko commented 4 months ago

I did, I have no other ideas. Try to fully reinstall sysrepo and try again.

AravindaSwamy commented 4 months ago

Is there a method to compile the netopeer2 Server code with Password Authentication enabled by default? If yes, in which file can we modify the default authentication mode?

Thank you.

michalvasko commented 4 months ago

You cannot compile netopeer2 with any "default" configuration, all of the relevant behavior is controlled by YANG data read from sysrepo. Depending on how exactly you set it, you can modify it. By default it is applied by a script as part of make install and then you can modify it using sysrepocfg, for instance.

AravindaSwamy commented 4 months ago

So, it's not possible to alter the authentication during compilation. However, is it feasible to switch the authentication method to password at startup, in the same way that we configure the local host IP and similar settings?

michalvasko commented 4 months ago

Yes, that is exactly how the authentication is configured.

AravindaSwamy commented 4 months ago

Is there a manual method available to change the authentication settings instead of using the sysrepocfg command? We're attempting to set up the server inside the OLT, where sysrepocfg isn't functional.

michalvasko commented 4 months ago

Yes, I have a similar use-case. You can put the desired configuration into the source as a string, parse it with lyd_parse_data_mem(), then pass the obtained libyang data to sr_edit_batch() and finally apply them with sr_apply_changes(). This will set the configuration of the server.

What exact configuration you will apply is up to you, some examples are in netopeer2/example_configuration/ssh_listen.xml.

AravindaSwamy commented 4 months ago

Hi Michal,

After copying the executables to the OLT, could you please let us know which libraries need to be copied for the sysrepocfg command to work? We will then copy those libraries from the locations where Sysrepo, Netopeer2, Libyang, and Libnetconf2 are installed.

Thank You.

michalvasko commented 4 months ago

Naturally, sysrepo and its dependencies. That should be libyang and that depends on PCRE2.

AravindaSwamy commented 4 months ago

Currently, sysrepocfg resolves to the libraries below.

root@CGOLT# ldd bin/sysrepocfg
        linux-vdso.so.1 (0x00007ffe49bf2000)
        libsysrepo.so.7 => /home/fs/lib/libsysrepo.so.7 (0x00007f60d438a000)
        libatomic.so.1 => /home/fs/lib/libatomic.so.1 (0x00007f60d4183000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f60d3f7f000)
        libyang.so.2 => /home/fs/lib/libyang.so.2 (0x00007f60d3c4b000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f60d3a2e000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f60d3683000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f60d4627000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f60d3382000)
        libpcre2-8.so.0 => /home/fs/lib/libpcre2-8.so.0 (0x00007f60d3111000)

So we suspect that some libraries were not copied to the OLT, which would explain the error below.

root@CGOLT# LD_LIBRARY_PATH=lib bin/sysrepocfg -Evi -m ietf-netconf-server -v3
[ERR] Data model "bbf-alarm-types@2020-10-13" not found in local searchdirs.
[ERR] Loading "bbf-alarm-types" module failed.
sysrepocfg error: Failed to connect (libyang error)

Please share your inputs on this.

Thank You.

michalvasko commented 4 months ago

You would get an error saying a library is missing if that was the case. What are you doing exactly? It seems like a cross-compilation based on your questions but at the same time you have executed sysrepocfg on the host system and are then copying "everything" to the target device? If so, that is a wrong way of cross-compiling. You need to run all sysrepoctl and sysrepocfg commands on the target device.

AravindaSwamy commented 4 months ago

Yes, we compiled in a different environment and copied the libs from that environment to the target device.

root@CGOLT# LD_LIBRARY_PATH=lib bin/sysrepotool.sh bin/sysrepoctl --install sysrepo/yang/bbf-alarm-types\@2020-10-13.yang  -v3
[INF] Connection 5 created.
[WRN] Module "bbf-alarm-types" is already in sysrepo.
root@CGOLT# LD_LIBRARY_PATH=lib bin/sysrepocfg -Evi -m ietf-netconf-server
[ERR] Data model "bbf-alarm-types@2020-10-13" not found in local searchdirs.
[ERR] Loading "bbf-alarm-types" module failed.
sysrepocfg error: Failed to connect (libyang error)
For more details you may try to increase the verbosity up to "-v3".

I have installed all YANG modules and libs from the build environment on the target device, but the error still persists. With the old sysrepo version "2.2.12" we did not face this kind of issue; we only see it on the devel branch.

michalvasko commented 4 months ago

I suppose there had to be some relevant changes, no matter. It points to the fact that you were not doing it correctly before either, it just happened to work and was bound to break sooner or later. Cross-compilation is not officially supported but I also have such a use-case so some support was added. How to do it depends mainly on whether your target system is a Linux or an embedded device. In any case, do not run any sysrepoctl or sysrepocfg on the host device, they must be run on the target device. That includes compiling netopeer2 with SYSREPO_SETUP=OFF and installing all the YANG modules on the target device.

jktjkt commented 4 months ago

Basically, any time you're "copying libraries to an OLT", or running with $LD_LIBRARY_PATH or any similar overrides, you're introducing error-prone, manual hacks to something which should have been done by a very smooth process. How was your system deployed originally? Why don't you simply re-spin your image build with a newer/fixed libyang/sysrepo/whatever and use that?

> Cross-compilation is not officially supported

Since this might confuse a potential reader, and because cross-compiling is what "everybody" is doing (except those who either have no production deployments, a.k.a. "a developer's laptop", or those who are building x86_64 packages), I would like to rephrase this a little bit. Cross-compiling works just fine of course, but since it's a process that's outside of scope of libyang/sysrepo/libnetconf2/netopeer2, it is up to the party that's doing the cross-compiling to use proper tooling. Examples of such tooling are Buildroot and Yocto. Over and over again we've seen people who are just starting with sysrepo and also starting with cross-compiling. It might be a good idea to first learn how to produce repeatable, reproducible and working cross-compilation, and then apply that knowledge to the NETCONF stack.

I think that what @michalvasko was saying is that asking for troubleshooting of a broken cross-compiling setup here is outside of scope of this project.

AravindaSwamy commented 4 months ago

When will this devel branch be released?

michalvasko commented 4 months ago

Current netopeer2 devel? Hard to say, from a few weeks to a few months.

AravindaSwamy commented 4 months ago

I mean the Fix for

  1. Netopeer2Server Crash while trying to connect from thirdparty cli.
  2. Netopeer2Server crash observed while pushing tls_keystore.xml script.

Have these been released, or are they yet to be released?

michalvasko commented 4 months ago

I think both are in devel only for now, which has not been released yet.

AravindaSwamy commented 4 months ago

Ok Thanks Michal. We have found the root cause for the below issue.

root@CGOLT# LD_LIBRARY_PATH=lib bin/sysrepotool.sh bin/sysrepoctl --install sysrepo/yang/bbf-alarm-types\@2020-10-13.yang  -v3
[INF] Connection 5 created.
[WRN] Module "bbf-alarm-types" is already in sysrepo.
root@CGOLT# LD_LIBRARY_PATH=lib bin/sysrepocfg -Evi -m ietf-netconf-server
[ERR] Data model "bbf-alarm-types@2020-10-13" not found in local searchdirs.
[ERR] Loading "bbf-alarm-types" module failed.
sysrepocfg error: Failed to connect (libyang error)
For more details you may try to increase the verbosity up to "-v3".

While executing the sysrepocfg command, libyang tries to search for the YANG modules in the path where the code was compiled. However, that path does not exist on the OLT. To resolve this, we created a similar directory structure to the compilation path and copied the YANG and .perm files into that path. As a result, we are now able to change the authentication method on the OLT.
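
The diagnosis above comes down to a lookup that searches a directory which exists only on the build host. The following is a minimal sketch of checking a list of candidate directories for a module file before running sysrepocfg. It assumes module files are named `<name>@<revision>.yang`, as in the `sysrepoctl --install` command above; the function name and the directory names are hypothetical, and the real search order used by libyang and sysrepo is not modeled.

```python
import os
import tempfile


def find_module_file(search_dirs, name, revision):
    """Return the first existing '<name>@<revision>.yang' path, or None."""
    filename = f"{name}@{revision}.yang"
    for directory in search_dirs:
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate
    return None


# Demo with throwaway directories; both directory names are hypothetical.
with tempfile.TemporaryDirectory() as root:
    build_host_dir = os.path.join(root, "build-host-path")  # never created
    target_dir = os.path.join(root, "target-yang")
    os.mkdir(target_dir)
    module_path = os.path.join(target_dir, "bbf-alarm-types@2020-10-13.yang")
    with open(module_path, "w") as handle:
        handle.write("module bbf-alarm-types { }\n")

    # Only the build-host path is searched: the module cannot be found.
    missing = find_module_file([build_host_dir], "bbf-alarm-types", "2020-10-13")
    # Adding a directory that exists on the target makes the lookup succeed.
    found = find_module_file(
        [build_host_dir, target_dir], "bbf-alarm-types", "2020-10-13"
    )
    found_name = os.path.basename(found)

print(missing)     # None
print(found_name)  # bbf-alarm-types@2020-10-13.yang
```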

Need a few inputs on the following points:

  1. We can change the Searchdir path for libyang? If yes, how we can set the searchdir for libyang?
  2. With the same image we have tested SSH Call Home, but the SSH session cannot be established. Below is the Netopeer2Server error.
[INF]: LN: Call Home client "default-client" timeout of 5 seconds expired, reconnecting.
[INF]: LN: Trying to connect via IPv4 to 127.0.0.1:4334.
[INF]: LN: Successfully connected to localhost:4334 over IPv4.
[ERR]: LN: SSH key exchange timeout.
  3. We have also tested TLS Call Home, but we get the error below while establishing the TLS session.
Netopeer2Server Logs :
[INF]: LN: Call Home client "default-client" endpoint "default-tls" connecting...
[INF]: LN: Trying to connect via IPv4 to 127.0.0.1:4335.
[INF]: LN: Successfully connected to localhost:4335 over IPv4.
[ERR]: LN: TLS accept failed (tlsv1 alert unknown ca).

Netopeer2Cli Logs
cmd_listen: Waiting 60s for a TLS Call Home connection on port 4335...
nc ERROR: Server certificate error (unable to get certificate CRL).
nc ERROR: TLS connection to "(null)" failed (certificate verify failed).
cmd_listen: Receiving TLS Call Home on port 4335 failed.

Please Share your inputs.

Thanks, Aravind.

michalvasko commented 3 months ago

> We can change the Searchdir path for libyang? If yes, how we can set the searchdir for libyang?

I think we are actually talking about sysrepo repository. Please adjust REPO_PATH during compilation to point to wherever you want YANG modules and data of sysrepo to be stored.

> [ERR]: LN: SSH key exchange timeout.

Usually caused by an old libssh version so I can only suggest you update it.

> [ERR]: LN: TLS accept failed (tlsv1 alert unknown ca).

You have not correctly imported all the required certificates. Look into netopeer2 README for an example working TLS configuration.

AravindaSwamy commented 3 months ago

> I think we are actually talking about sysrepo repository. Please adjust REPO_PATH during compilation to point to wherever you want YANG modules and data of sysrepo to be stored.

But when I bring up Netopeer2Server, sysrepo points to the correct directories where the YANG modules are placed. Only when I execute the sysrepocfg command does libyang point to the directory used at compile time.

michalvasko commented 3 months ago

Then you obviously have some library mix-up, I cannot help you with that.
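
One way to start narrowing down a suspected library mix-up is to look at where the dynamic loader actually resolves each dependency. The following is a minimal sketch that reads `ldd` output and reports libraries loaded from outside the system directories. The function name and the list of "system" prefixes are assumptions; the sample is the `ldd bin/sysrepocfg` output posted earlier in this thread. It shows only which files are loaded, not which repository path was compiled into them.

```python
import re

# One resolved dependency per line, as printed by ldd:
#   libyang.so.2 => /home/fs/lib/libyang.so.2 (0x00007f60d3c4b000)
LDD_LINE = re.compile(r'^\s*(\S+)\s+=>\s+(/\S+)\s+\(0x[0-9a-fA-F]+\)\s*$')

# Assumption: these prefixes are the "system" library directories.
SYSTEM_PREFIXES = ('/lib/', '/lib64/', '/usr/lib/', '/usr/lib64/')


def non_system_libs(ldd_output, system_prefixes=SYSTEM_PREFIXES):
    """Map library name -> path for libraries resolved outside system dirs."""
    found = {}
    for line in ldd_output.splitlines():
        match = LDD_LINE.match(line)
        if match and not match.group(2).startswith(system_prefixes):
            found[match.group(1)] = match.group(2)
    return found


LDD_SAMPLE = '''
        linux-vdso.so.1 (0x00007ffe49bf2000)
        libsysrepo.so.7 => /home/fs/lib/libsysrepo.so.7 (0x00007f60d438a000)
        libatomic.so.1 => /home/fs/lib/libatomic.so.1 (0x00007f60d4183000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f60d3f7f000)
        libyang.so.2 => /home/fs/lib/libyang.so.2 (0x00007f60d3c4b000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f60d3a2e000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f60d3683000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f60d4627000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f60d3382000)
        libpcre2-8.so.0 => /home/fs/lib/libpcre2-8.so.0 (0x00007f60d3111000)
'''

for name, path in non_system_libs(LDD_SAMPLE).items():
    print(name, '->', path)
```

For the sample above this lists libsysrepo.so.7, libatomic.so.1, libyang.so.2 and libpcre2-8.so.0, all under /home/fs/lib, which is the set of manually copied libraries worth comparing against the build that produced sysrepocfg.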

AravindaSwamy commented 2 months ago

Hi Michal,

Is there any update on the Devel Branch Release?

Thank You, Aravind.

michalvasko commented 2 months ago

There is an explicit check for mixed repository paths in SHM and the set one, but that will not fix anything, just print a specific error (if that even is your problem).

AravindaSwamy commented 2 months ago

Hi Michal,

I don't understand what you're saying. When will the fix for the Netopeer2Server crash, which is currently in the devel branch, be released?

Thank You. Aravind.

michalvasko commented 2 months ago

Oh, sorry, there was a release made 3 weeks ago, I thought you knew about that.