bdbcat / o-charts_pi

GNU General Public License v2.0

USB Dongle not detected when many Docker containers are running #48

Closed elafargue closed 4 months ago

elafargue commented 6 months ago

I recently started to get errors with a USB license dongle not getting detected. After spending quite some time debugging this, it looks like oexserverd crashes when there are too many network interfaces present on the host, which is typically the case when many Docker containers are running.

When oexserverd crashes, it leaves a file in tmp that is slightly above 4096 bytes in length: I suspect that this process allocates a 4k buffer somewhere when scanning for all network interfaces, and crashes if the result is above 4k.

An easy way to reproduce on a system where this happens is to stop containers until oexserverd -t works again.

Would it be possible to fix this? If this is a buffer overflow, it can be fixed fairly easily...

bdbcat commented 6 months ago

The 4K file leftover is a named pipe. Fixing might or might not be trivial. Certainly not a priority. For reference, how many docker containers are required in order to provoke this error?

elafargue commented 6 months ago

It's more a matter of how many network interfaces and bridges are created by docker containers. I can see crashes with 16 or more interfaces, which on my system corresponds to 14 containers. Typically Grafana, SignalK, InfluxDB and Home Assistant supervised will get you in that region pretty quick.
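For anyone who wants to check where their own system stands, here is a minimal sketch that counts the network interfaces on a Linux host and compares the count against the figure reported above. It is not part of oexserverd. The threshold of 16 is only what was observed on one system in this thread, and the function names are made up for illustration. It assumes the usual Linux layout, where each interface appears as an entry under `/sys/class/net`.

```python
import os

# Assumption: 16 is the count at which crashes were observed on one
# reporter's system in this thread. It is not a documented limit.
OBSERVED_CRASH_THRESHOLD = 16


def list_interfaces(sysfs_dir="/sys/class/net"):
    """Return the sorted interface names found under sysfs_dir.

    Each interface is a symlink to a directory, so plain files that some
    kernel modules place here (e.g. bonding_masters) are skipped.
    """
    return sorted(
        name
        for name in os.listdir(sysfs_dir)
        if os.path.isdir(os.path.join(sysfs_dir, name))
    )


def at_or_above_threshold(names, threshold=OBSERVED_CRASH_THRESHOLD):
    """Return True when the number of interfaces reaches the threshold."""
    return len(names) >= threshold
```

Calling `at_or_above_threshold(list_interfaces())` with the full Docker workload running, and again after stopping a few containers, should show whether the interface count lines up with the failures.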

I suspect this is the root cause behind #44 as well as some of the ongoing conversations on cruiserforums where the same problem happens for some people using Docker and not others, so it would be awesome to raise the priority :)

sailing12388 commented 4 months ago

Those are the Docker containers I'm running, and I have this issue.

elafargue commented 4 months ago

With the latest Docker updates, it looks like oexserverd just fails whenever Docker is running, which is a major show stopper... Is it possible to prioritize a fix of this problem? It looks like it is affecting several others in various ways as well...

As of today, the only way I can start OpenCPN with o'charts is to stop Docker entirely, which means not running a bunch of critical services on my installation :(

Thanks!

bdbcat commented 4 months ago

Please confirm this is some (what?) variety of linux.

I want to understand the correlation between USB dongle access, and basic oexserverd operation. In the OCPN logfile, there is reporting on the oexserverd presence and response to simple communication. With no dongle detected, logfile will contain this on startup: (flatpak here)

23:51:10.711 MESSAGE o-charts_pi.cpp:648 Path to oexserverd is: /home/dsr/.var/app/org.opencpn.OpenCPN/bin/oexserverd
23:51:10.801 MESSAGE fpr.cpp:98 oexserverd results:
23:51:10.801 MESSAGE fpr.cpp:101 0
23:51:10.801 MESSAGE o-charts_pi.cpp:653 No Dongle detected

As you access charts, there may be other references to oexserverd in the logfile. Will you examine (or post) your logfile for confirmation?
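To pull the relevant entries out of a long logfile, something like the sketch below may help. It is only an illustration: the keywords are taken from the sample log lines quoted above, the function names are hypothetical, and the logfile path has to be supplied by the caller because it differs between installs.

```python
# Substrings taken from the sample log lines quoted above; adjust as needed.
KEYWORDS = ("oexserverd", "dongle", "fpr.cpp", "o-charts_pi.cpp")


def filter_log_lines(lines):
    """Return the lines that mention oexserverd, the dongle, or the plugin sources."""
    matches = []
    for line in lines:
        lowered = line.lower()  # match case-insensitively ("Dongle" vs "dongle")
        if any(keyword in lowered for keyword in KEYWORDS):
            matches.append(line.rstrip("\n"))
    return matches


def filter_log_file(path):
    """Read a logfile (path supplied by the caller) and return the matching lines."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return filter_log_lines(handle)
```

The output of `filter_log_file(...)` is short enough to paste into a comment here.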

sailing12388 commented 4 months ago

It isn't only with the USB dongle. It's the copy protection they are using because the same thing happens with their "regular" license checking plugin.

bdbcat commented 4 months ago

@sailing12388 OK, that's good information. I understand that you are not using a dongle, and see the same type of failure. What is your system configuration? Linux, I guess. Which distro, etc.? How does the failure appear on your system? What, if any, error messages do you get?

bdbcat commented 4 months ago

@sailing12388 Just so you know, I am the author of this plugin. So the more detail you can provide here, the sooner we will get to a solution.

sailing12388 commented 4 months ago

It was all detailed in this post. https://www.cruisersforum.com/forums/f134/o-charts-frustrations-282532.html

bdbcat commented 4 months ago

OK, thanks for the link. In that thread, the USB dongle access is discussed. As we can see, there are lots of ways that USB device configuration and access can fail on linux.

First rule of debugging: Work with the simplest configuration that reliably reproduces the problem.

Let's simplify and remove USB from the matrix, so we can get after the Docker container question. I am more concerned about the failures when the "standard" encryption method is used. Do I understand correctly that when you try to install and use a chart set with standard encryption, you will eventually get the "Your System has changed..." message?

sailing12388 commented 4 months ago

Yes, that's correct. If I never start the docker service, I never get that message. I don't remember the exact message I get but I have to recreate the fingerprint. Sometimes every three days and sometimes every 3 hours. There should be support tickets detailing the issue that led to the suggestion to purchase the dongle. I think I ended up purchasing 3 or 4 licenses for the same area trying to get it to work.

elafargue commented 4 months ago

It looks like we are looking at two different things here:

The underlying issue appears to be tied to the fact Docker creates a bunch of virtual network interfaces for its operations, which causes issues both in oexserverd for the USB dongle, and on the system fingerprint for software licenses. What do you think @bdbcat ? I'm happy to do any additional tests you think are relevant.

bdbcat commented 4 months ago

I think we are on the right track here. I should have some test code for you to try in the next day or so. It would be useful to me if you could startup your full docker workload, report the results of:

$ sudo ifconfig -a

Thanks

elafargue commented 4 months ago

Here are two ifconfig -a outputs:

ifconfig-a.txt ifconfig-a-good.txt

o-charts commented 4 months ago

@sailing12388, could you please send us a system fingerprint, or write to us from your o-charts account so we can take a look at your fingerprints there?

sailing12388 commented 4 months ago

Yes, give me a bit. I need to reinstall the plugin, etc.

sailing12388 commented 4 months ago

Haha. I got in a password merry go round and now can't reset my password for a while so I gave up. I moved on to a different charting/navigation solution months ago because of all the problems I was having with o-charts and OpenCPN in general so I'm no longer using these products.

I was just hoping to help solve elafargue's issue.

bdbcat commented 4 months ago

@elafargue Here is a provisional test version of oexserverd, built for x86_64 arch. https://www.dropbox.com/scl/fi/gmb7gs5i7zem2afw26kct/oexserverd?rlkey=ikrgm3r2yd1xg4g67xm08wvp9&st=02a6yhan&dl=0

Please copy this file to ~/.local/bin/opencpn, chmod +x, and test.

$ chmod +x ~/.local/bin/oexserverd
$ ~/.local/bin/oexserverd -a
 oexserverd Version 1.24

Then run OpenCPN/o-charts as normal, and report results. Thanks

elafargue commented 4 months ago

Thanks! I should have mentioned it before, sorry, I am on aarch64 (arm64), would it be possible to compile for that architecture? I don't have access to an x86 box right now :(

sailing12388 commented 4 months ago

I'll give this a test in the morning. Thank you!

sailing12388 commented 4 months ago

Still doesn't work. In the system identifier, it says the dongle is not present. The system also won't accept my password. It just says void password. It logs into the website just fine though.

o-charts commented 4 months ago

We need your username at o-charts to look into this, please contact us here: https://o-charts.org/shop/en/contactenos

bdbcat commented 4 months ago

elafargue...

Here is ARM64 build flavor: https://www.dropbox.com/scl/fi/c85vyv7peb7ejr6j92mvf/oexserverd?rlkey=6pn8ajj9vq60gn7ypjepfzo88&st=1o134v12&dl=0

Thanks Dave

elafargue commented 4 months ago

Yay, just tested with the new oexserverd and it seems to have fixed the issue on my computer 👍

On my system, running with the original oexserverd the map can't be opened (can't find the dongle) and OpenCPN complains. When switching to this new version, the map opens without any issue and the dongle is found.

Thanks for your support here, very much appreciated! May I ask what the root cause was? Just curious :)

bdbcat commented 4 months ago

@elafargue Good news indeed. Root cause: The network interfaces created by docker are transient, and arrive in unpredictable order. This confuses the DRM code, making the decryption of the charts undependable. Solution: We now detect and ignore interfaces created by docker, so the system becomes as though docker is not active, nor are any docker containers present.

Thanks for your patience. Dave
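For readers curious what "detect and ignore" could look like in practice, here is a rough sketch. It is not the oexserverd code, which is not shown in this thread. It assumes Docker's default interface naming on Linux (`docker0`, `br-...` for user-defined bridges, `veth...` for container links). Hashing the interface names stands in for whatever the real fingerprint uses, and the function names are hypothetical.

```python
import hashlib

# Assumption: default name prefixes Docker gives its interfaces on Linux.
DOCKER_PREFIXES = ("docker", "br-", "veth")


def stable_interfaces(names):
    """Drop Docker-style interfaces and sort the rest, so order no longer matters."""
    return sorted(name for name in names if not name.startswith(DOCKER_PREFIXES))


def fingerprint(names):
    """Hash the filtered, sorted names (a stand-in for the real fingerprint input)."""
    joined = "\n".join(stable_interfaces(names))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()
```

With this filtering, `fingerprint(["eth0", "lo"])` and `fingerprint(["veth1a2b", "lo", "docker0", "eth0"])` give the same value, which is the "as though docker is not active" behaviour described above.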

sailing12388 commented 4 months ago

Did you reboot after "installing" the new oexserverd?

elafargue commented 4 months ago

I did, yes, and checked that the original oexserverd failed as expected, then used the patched oexserverd. Note that I am using a dongle, not a software license.

sailing12388 commented 4 months ago

I have both the dongle and a software license and right now, neither works, but I didn't reboot after applying this fix.

elafargue commented 4 months ago

I would separate concerns here - can you remove the software license for now, and test with only the dongle?

If you are using flatpak, you will need to update oexserverd in ~/.var/app/org.opencpn.OpenCPN/bin/, not in ~/.local/bin . Also make sure it is executable (chmod +x oexserverd).

sailing12388 commented 4 months ago

I'm not running the Flatpak. The software license won't install so that's not an issue. I'll reboot later today.

sailing12388 commented 4 months ago

It's working!

bdbcat commented 4 months ago

Close, please?

elafargue commented 4 months ago

Closing as requested - do you have an ETA on uploading the fixed version on the repository and including in the next o-charts plugin release?

thanks again!