WebThingsIO / gateway

WebThings Gateway - a self-hosted web application for monitoring and controlling a building over the web
http://webthings.io/gateway
Mozilla Public License 2.0
2.62k stars 339 forks source link

Implement Network Settings for Ubuntu Core - closes #3155 #3168

Closed benfrancis closed 1 week ago

benfrancis commented 2 months ago

Closes #3155

This PR implements a back end for network settings which works inside a snap running on Ubuntu or Ubuntu Core. It configures network settings using NetworkManager via its DBus API.

Previously all the Platform methods were synchronous but this introduces optional asynchronous versions of those methods.

The snap will require the network-manager and system-observe snap interfaces, which must be connected in order for it to work.

benfrancis commented 3 weeks ago

I think I've been working on this long enough that I should probably get someone to start reviewing it! Sorry this is such a huge patch but it made sense to do it all in one go.

@tim-hellhake I would value your input on this if you have time to review, if not let me know and I will try to find someone.

Note that the build is currently failing in CI because the dbus npm package needs libdbus to build (libdbus-1-dev apt package on Debian-based distros). I'm not exactly sure where to add that in the GitHub Action. Maybe https://github.com/WebThingsIO/gateway/blob/master/.github/workflows/build.yml#L33 ? But I feel like apt packages may be installed somewhere else as well.

If you want to give this a test drive then it should work on Ubuntu Deskop if you build from source locally as normal (need to apt install libdbus-1-dev first).


Generating a snap package from this branch to test on Ubuntu Desktop is a bit more involved but if you want to try it:

The snap currently runs on port 8080 by default.

The same snap should run on Ubuntu Core on the same architecture.

benfrancis commented 3 weeks ago

@tim-hellhake Thanks for the review!

I addressed your comment and managed to get it to build in CI by adding the DBus dependency to the GitHub Action via apt.

However, for some reason the integration tests appear to all pass but then the task times out after 6 hours! Any ideas what might be going on there?

benfrancis commented 3 weeks ago

It does appear to be something in this PR which is causing the tests to time out, but I'm a bit stuck figuring out what it is. The actual tests seem to pass but then jest times out and eventually gets killed.

The only idea I've had so far is that Dbus.getBus() is keeping something open that is preventing Jest from finishing. Maybe rather than opening that when the class is instantiated there should be some kind of init() method on Platform which is called when needed... but I'm really just guessing.

benfrancis commented 2 weeks ago

@tim-hellhake I'm struggling to diagnose what change is causing the test failures because in my local builds on Ubuntu all the integration tests are failing on master (even though they're passing in CI). If you are able to run a test run locally on either branch I'd be interested in the results.

Edit: I got the integration tests passing locally on master on Ubuntu Desktop after running npm start before npm run test. Still not sure why that's necessary but has caught me out lots of times so I should document it.

benfrancis commented 2 weeks ago

Ugh, getting the node-dbus npm package (added as a dependency in this PR) to build on MacOS may be challenging.

At the moment it won't build because node-dbus needs libdbus in order to compile the package using node-gyp.

I tried installing dbus with brew install dbus as suggested in the README of node-dbus, which worked but then I get a new error.

gyp ERR! build error 
gyp ERR! stack Error: `make` failed with exit code: 2
gyp ERR! stack     at ChildProcess.onExit (/Users/tola/.nvm/versions/node/v10.24.1/lib/node_modules/npm/node_modules/node-gyp/lib/build.js:191:23)
gyp ERR! stack     at ChildProcess.emit (events.js:198:13)
gyp ERR! stack     at Process.ChildProcess._handle.onexit (internal/child_process.js:248:12)
gyp ERR! System Darwin 23.5.0
gyp ERR! command "/Users/tola/.nvm/versions/node/v10.24.1/bin/node" "/Users/tola/.nvm/versions/node/v10.24.1/lib/node_modules/npm/node_modules/node-gyp/bin/node-gyp.js" "configure" "build"
gyp ERR! cwd /Users/tola/Code/benfrancis/gateway/node_modules/dbus
gyp ERR! node -v v10.24.1
gyp ERR! node-gyp -v v5.1.0
gyp ERR! not ok 
npm ERR! code ELIFECYCLE
npm ERR! errno 1
npm ERR! dbus@1.0.7 build:release: `node-gyp configure build`
npm ERR! Exit status 1
npm ERR! 
npm ERR! Failed at the dbus@1.0.7 build:release script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.
npm WARN Local package.json exists, but node_modules missing, did you mean to install?

npm ERR! A complete log of this run can be found in:
npm ERR!     /Users/tola/.npm/_cacache/_logs/2024-11-06T18_05_18_745Z-debug.log
npm ERR! code ELIFECYCLE
npm ERR! errno 1
npm ERR! dbus@1.0.7 install: `npm run build:release`
npm ERR! Exit status 1
npm ERR! 
npm ERR! Failed at the dbus@1.0.7 install script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.

npm ERR! A complete log of this run can be found in:
npm ERR!     /Users/tola/.npm/_logs/2024-11-06T18_05_18_800Z-debug.log

I tried installing glib with brew install glib since that is also recommended in the node-dbus README, but I had to remove it again because it depends on a newer version of python3 which is known not to work with the gateway build process. I had planned to try to upgrade node-gyp to a newer version that works with Python 3.12 after the upgrade to node 20.

I'm currently stuck at that point, so haven't been able to reproduce the integration test failures on Mac OS, but also it currently means if we landed this PR as-is the gateway would no longer build on Mac OS.

benfrancis commented 2 weeks ago

OK, it's not super pretty but I've managed to get jest to close properly by disconnecting the system bus after all the tests have run. That's what I think was keeping the Node process running.

I'm going to take another look at getting this to build on Mac OS before merging.

benfrancis commented 2 weeks ago

I've managed to get DBus to build on Mac OS, by installing python3 and dbus via brew, and then adding the bundled versionless binary symlinks to my path in order to symlink python to python3. I'm not sure exactly why this is necessary, I thought it might be the old version of node-gyp that we're using is looking for Python 2, but I think this version of node-gyp was already added as a dependency of sqlite.

$ brew install python@3.10 dbus
$ echo 'export PATH=/opt/homebrew/opt/python@3.10/libexec/bin:$PATH' >> ~/.zprofile
$ source ~/.zprofile

After doing this npm ci successfully installs all the dependencies, including compiling DBus using node-gyp.

However, the gateway currently fails to start with npm start, with the errors below:

Opening database: /Users/tola/.webthings/config/db.sqlite3
dyld[70906]: missing symbol called
sh: line 1: 70906 Abort trap: 6           node build/app.js
npm ERR! code ELIFECYCLE
npm ERR! errno 134
npm ERR! webthings-gateway@1.2.0-alpha.1 start: `npm run build && node build/app.js`
npm ERR! Exit status 134
npm ERR! 
npm ERR! Failed at the webthings-gateway@1.2.0-alpha.1 start script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.

npm ERR! A complete log of this run can be found in:
npm ERR!     /Users/tola/.npm/_logs/2024-11-07T18_31_28_734Z-debug.log

This is all a bit silly because DBus isn't actually needed when running on Mac OS, so I was tempted to just add it as an optional dependency and not install node-dbus on Mac OS, but then we need to add fallbacks in the code and I'm not sure what other problems that might cause.

I'll try to take another look tomorrow, but any ideas welcome.

benfrancis commented 1 week ago

I'm just guessing but after researching this a bit further I think the issue here might be that I'm trying to build it on a MacBook Pro with Apple silicon (i.e. arm64 architecture), and there isn't a native version of Node.js 10 for arm64. The missing symbols may be due to node-gyp compiling native dependencies for the x64 architecture and then trying to run them on the arm64 architecture.

I tried working around this by running Terminal using Rosetta (to emulate i386), but it didn't work.

This maybe another thing that gets fixed when we finish upgrading to Node.js 20 in #3142 (since there are arm64 builds of newer versions of Node.js).

I'm not sure whether to wait for that upgrade (which could still take a while) before landing this PR, or just go ahead and land this PR in the interests of moving forward, and accept that builds on newer MacBooks will be broken for a while.

@tim-hellhake Do you have any thoughts?

benfrancis commented 1 week ago

Some good news is that having finally got WebThings Gateway to run on Node.js 20, I've discovered that it is upgraded the architecture mismatch problem is no longer an issue.

However, even when DBus compiles for the right architecture and I start the DBus service with brew services run dbus, the gateway still fails to start, showing DBus related errors.

I think rather than trying to get DBus to run on macOS where it's not actually needed, we need to just stop the gateway trying to use DBus when running on macOS.

The reason it currently does this is that the NetworkManager class opens a (single) connection to the system bus when it is instantiated, and the way that platform implementations are currently loaded means that they are all instantiated on start. I need to try and do some re-factoring so that the gateway only tries to connect to DBus if it's actually running on Ubuntu or Ubuntu Core...

benfrancis commented 1 week ago

OK I've managed to fix the macOS builds by refactoring the platform code to only try to open a connection to the DBus system bus on demand when NetworkManager is actually using it (which I should really have done in the first place), so now it will run properly on macOS. Phewf.

I've added some notes to the README to explain how to get it to build on macOS since libdbus is still needed as a build time dependency.

@tim-hellhake Rather than kick off another review cycle I'm going to squash this and land it so I can move on to getting the Node.js 20 upgrade landed, but feel free to let me know if there's anything you think I should have done differently and I can address that in a follow-up.