rsta2 / circle

A C++ bare metal environment for Raspberry Pi with USB (32 and 64 bit)
https://circle-rpi.readthedocs.io
GNU General Public License v3.0

Network Connections on 'localhost' #488

Open davefilip opened 1 week ago

davefilip commented 1 week ago

Rene,

It appears that, while I can readily open a CSocket (IPPROTO_TCP) connection to another network node (Circle or otherwise), I cannot connect to the local Circle node, i.e. the node that is itself calling Connect().

Also, perhaps not surprisingly, I can use the Dev branch to Ping other nodes, but not the node I'm on (not that it is very useful to ping yourself, but it seems to confirm that ICMP -- which does not use sockets -- also cannot talk to the network node you're running on).

Of course, I'm trying the actual network IP address of the node I'm on, but "traditionally" there is a loopback address of 127.0.0.1, which of course also does not work (didn't think it would, since it is not documented anywhere, but I thought I'd try nonetheless).

Is this intentional, or is there an architectural reason why connecting to the local network address of a Circle node does not work?

Just curious.

Thanks,

Dave.

rsta2 commented 1 week ago

There can be only one active network interface in Circle at a time. localhost would require another (loopback) interface. The address 127.0.0.1 does not exist. It was implemented for simplicity that way, and there was no use case so far, that I know, which required to have localhost.

davefilip commented 1 week ago

Thanks for the feedback

I understand the one interface limitation in Circle. Thanks for explaining.

There would be no need for localhost, if there were a way to connect to the local IP of the machine you’re running on. But it appears that this does not work.

And without that, I can probably come up with another way around it within Circle. But even before I heard the John Gage (1) quote (“The Network is the Computer”), I have always thought about computing resources as network nodes (2).

So a common design model I have used is to have services distributed across a network, and to access those regardless of where they run, either a local node or a remote node.

Most of what I have done over the past 25+ years has also been around writing and consuming APIs.

So how does this relate to a network of Raspberry Pis? I’ll tell you my thinking, as it might not be what something like Circle was originally designed for, although again, there are always workarounds in software if you are creative enough.

My vision is to be able to have a bunch of cheap (Zero W or Zero W 2) RPis each providing a “service”, like monitoring temperature and humidity, or monitoring water flow, or monitoring motion and ambient light for automatically turning lights (plural) on and off (all of which I currently have), etc. So the way I have addressed this in the past is to provide a network API (usually HTTP) to query the status of the sensor(s) on the IoT device (RPi).

The glitch is that if I can’t make a network connection to the node I’m on, then I need a different (non-network) interface to communicate with services / sensors when they are local than when they are remote. That is a deviation from the model that everything is a network service that can reside anywhere and can be moved anywhere; it doesn’t matter, because it’s just another network resource.

So my current model, the generic model I have been developing, is that I can have a “cluster” of cheap $15 RPi devices that discover each other (they currently send a ‘Hello’ message with their IP and the list of services they provide every minute, and each node builds a list), with the goal that any service can be called on any node. But with Circle that model breaks if the service is local.
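A minimal sketch of the discovery model described above, assuming each 'Hello' message carries the sender's IP and its list of services. The class name `ServiceRegistry` and its methods are hypothetical and are not part of Circle or of the project discussed here; the point is only that the caller gets back an address and connects to it the same way whether it is local or remote.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical registry: maps a service name to the IPs of nodes offering it.
class ServiceRegistry {
public:
    explicit ServiceRegistry(std::string ownIp) : m_ownIp(std::move(ownIp)) {
        assert(!m_ownIp.empty());
    }

    // Record one 'Hello' message: the sender's IP and the services it offers.
    void OnHello(const std::string &ip, const std::vector<std::string> &services) {
        for (const auto &name : services) {
            m_providers[name].insert(ip);
        }
    }

    // Return the address of a node offering the service, or "" if none is known.
    // The local node is preferred when it offers the service itself.
    std::string Lookup(const std::string &service) const {
        auto it = m_providers.find(service);
        if (it == m_providers.end() || it->second.empty()) {
            return "";
        }
        if (it->second.count(m_ownIp) != 0) {
            return m_ownIp;
        }
        return *it->second.begin();
    }

private:
    std::string m_ownIp;
    std::map<std::string, std::set<std::string>> m_providers;
};
```

With a registry like this, `Lookup("waterflow")` may return the node's own address, which is exactly the case that requires the network stack to accept connections to its own IP.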

So you may be thinking that I should perhaps be using a full Linux OS instead of Circle (which I have), but that is a challenge for monitoring sensors on GPIO pins that change very quickly (such as water flow sensors), and also incredible slowness of running on smaller hardware (Zero Ws, and the handful of 1Bs and 2Bs I have, and may be out there).

So I always envisioned this project to be the “Middleware of Operating Systems” - which is perhaps a stretch of the term - in that it could be about as small and fast as something running on a cheap Arduino, but also have the infrastructure to be multitasking and “network aware” of its surroundings, have a simple and thin layer of security (for nearby WiFi connections), and be able to do things like send email and transfer files “out of the box”.

But wait, you may be thinking, once you put all that crap into the OS, you’ve made it overly complicated and added too much overhead! Because my other concept in a “Middleware of Operating Systems” is like the Build-a-Bear Workshop (3), I have a config header whereby I can configure exactly which services / tasks / functions I need, so I can quickly and easily create a custom kernel with only what I need for a particular purpose.
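One way such a config header could work is with preprocessor switches, so that disabled services are not compiled into the kernel at all. This is only a sketch: the macro names (`NODE_ENABLE_*`) and the functions are hypothetical, and the actual header in the project may look quite different.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// --- would live in a hypothetical config header, e.g. "nodeconfig.h" ---
#define NODE_ENABLE_TELNET 1   // command shell over the network
#define NODE_ENABLE_MQTT   1   // 'Hello' messages between nodes
#define NODE_ENABLE_EMAIL  0   // compiled out of this particular kernel
// -----------------------------------------------------------------------

#if NODE_ENABLE_TELNET
static void StartTelnet(std::vector<std::string> &started) { started.push_back("telnet"); }
#endif

#if NODE_ENABLE_MQTT
static void StartMqtt(std::vector<std::string> &started) { started.push_back("mqtt"); }
#endif

#if NODE_ENABLE_EMAIL
static void StartEmail(std::vector<std::string> &started) { started.push_back("email"); }
#endif

// Start only the services selected in the config header and report their names.
std::vector<std::string> StartConfiguredServices() {
    std::vector<std::string> started;
#if NODE_ENABLE_TELNET
    StartTelnet(started);
#endif
#if NODE_ENABLE_MQTT
    StartMqtt(started);
#endif
#if NODE_ENABLE_EMAIL
    StartEmail(started);
#endif
    // Sanity check: one entry per enabled switch.
    assert(started.size() ==
           static_cast<std::size_t>(NODE_ENABLE_TELNET + NODE_ENABLE_MQTT + NODE_ENABLE_EMAIL));
    return started;
}
```

Because the disabled code is removed by the preprocessor, it adds neither size nor startup overhead to the resulting kernel image.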

Sorry for being so long-winded about this, but I wanted to explain my thinking. I am not just randomly picking things that a full-blown OS like Linux / macOS / Windows has and Circle does not, because that is not the point; I am trying to explain, if not justify, my thinking and potential use cases.

All of that said, I understand that this may be directly at odds with the thinking that went into Circle and with what Circle was meant to do, and that making a network connection to the local machine may break the local network layer. And that is all good, as Circle is your thing, and you get to decide what it does and how it works, and nobody can argue that it is not incredibly awesome as-is.

Regards,

Dave.

(1)(2) I’m not actually sure when John Gage said it, although I loved it when I heard it, but when I was 17 and still in High School, I was already thinking this way. I had purchased a couple of Radio Shack Color Computers (the lowest-end and cheapest machine you could get that both played game cartridges AND could add floppy drives and write programs in something other than BASIC), hand-soldered a null modem RS-232 serial cable between them, and wrote software to send messages between them … which is why my first domain name was colornet.com (a “network” between two R/S Color Computers).

(3) Not sure if they have Build-A-Bear Workshop in the NL (where I think you are?), but it is a place where kids can “build” a custom bear / rabbit / whatever stuffed animal by choosing different parts (body, arms, legs, head, clothes, etc.), and make a stuffed toy that is uniquely theirs. No, I have never been inside one of these myself, but they tend to be in most shopping malls around here, and I like the model that you can easily build exactly and only what you want.

rsta2 commented 1 week ago

What you are describing is definitely a (sophisticated) possible use case for a loop back network interface in Circle. I'm preparing a new release now and will think about, how this can be realized afterwards.

davefilip commented 1 week ago

Thanks, and per my inquiry, loopback doesn’t need to be a new IP address (127.0.0.1 or something similar), but for my use case, even better if it is the “regular” IP address of the machine that can still talk to itself (although probably still considered a “loopback”, since the packets never reach any network device).

As per your last response, sounds like adding support for more than one network interface would add complications that this use does not justify.

And again, please don’t feel like you need to prioritize (or even do anything) for what might be a sophisticated edge use of Circle that might not affect many other users. I’m sure that I can find some sort of work-around, that’s what creative programmers do … ;-)

My (overly long) response was mainly because you had said "there was no use case so far, that I know”, so I wanted to let you know of one possible use case.

Cheers,

Dave.

rsta2 commented 1 week ago

I have found a quick solution. It's on the develop branch. IP packets sent to the node's own IP address will be looped back to the IP receiver. Tested with test/ping-client and Unitest (https://github.com/rsta2/unitest) iperf. There is still no explicit localhost address.
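The idea of looping packets back can be illustrated with a small stand-alone model. This is not Circle's implementation; `IpLayerModel`, `IpPacket` and the two queues are hypothetical names used only to show the decision made on the send path: a packet whose destination equals the node's own address goes straight to the receive queue instead of to the network device.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

using IpAddress = std::uint32_t;

struct IpPacket {
    IpAddress dest;
    std::vector<std::uint8_t> payload;
};

// Hypothetical model of an IP layer with a single interface and no 127.0.0.1.
class IpLayerModel {
public:
    explicit IpLayerModel(IpAddress own) : m_own(own) {}

    // Returns true if the packet was looped back locally,
    // false if it was queued for the (simulated) network device.
    bool Send(const IpPacket &packet) {
        if (packet.dest == m_own) {
            m_rx.push_back(packet);   // never reaches any network device
            return true;
        }
        m_tx.push_back(packet);
        return false;
    }

    // Fetch the next received packet; returns false if none is pending.
    bool Receive(IpPacket &out) {
        if (m_rx.empty()) {
            return false;
        }
        out = m_rx.front();
        m_rx.pop_front();
        return true;
    }

    std::size_t PendingTransmit() const { return m_tx.size(); }

private:
    IpAddress m_own;
    std::deque<IpPacket> m_rx;   // packets for the local IP receiver
    std::deque<IpPacket> m_tx;   // packets for the network device
};
```

Since the check compares against the interface's real address, no second (loopback) interface is needed, which matches the statement that there is still no explicit localhost address.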

davefilip commented 1 week ago

Rene,

Great, glad it was something that you could do somewhat easily without a lot of work. I’ll download it and give it a try, and let you know what I find.

Again, I don’t explicitly need or want localhost for my immediate use case … working with the real IP address is preferred, which means that I can treat services on the local computer exactly the same as any remote computer.

I suggested localhost as a possible work-around, since it is what a lot of other operating systems use, if it would have been easier. My thinking was that just creating a new static device in software might have been easier, but as you have pointed out, Circle can only handle one interface at a time, so that was a bad assumption on my part.

As always, I really appreciate your quick feedback on this, as I was starting to look into building a TCP proxy on a remote node as a way to connect to local network services, but now I can scrap that idea.

Cheers,

Dave.

rsta2 commented 1 week ago

Did this work for you?

davefilip commented 1 week ago

Rene,

Apologies for the delay … when I reached out to you I was in the middle of a bunch of networking/clustering code (which is how I actually found that I couldn’t connect to the local IP), and have been working to get that all sorted.

Sorry, it is because I’m hesitant to introduce any new bugs or instability from the Development branch; I try not to download and use new code that I don’t control until I’m happy that everything I’ve done is working correctly and is stable. That’s just a habit I picked up through the years. It is not that I don’t trust your Development code, so much as any assumptions I’ve made about integrating my code with yours.

I can commit to testing and getting back to you this weekend, if that is satisfactory?

Again, I appreciate the quick turn-around, and I am assuming that it will work as you say it will, as you are one of the cleanest open source developers I’ve ever known. I just have a process that I go through before committing code, and I am technically retired and just a part time hobbyist (well, kinda, as the driver behind this is a for-profit project a friend is building, and is hoping to make a lot of money from, and I am helping him build his prototype, as he is an inventor who knows a lot about water purification, but very little about computers and electronics).

I’ll get back to you tomorrow, or Sunday at the absolute latest. I will also do a thorough networking test, just to make sure that nothing that previously worked breaks. Again, not because I don’t trust your work; it is just a process I’ve learned to go through, because most of what I do (did professionally) is multi-threaded servers.

Cheers,

Dave.

rsta2 commented 1 week ago

No problem, Dave. I understand that you have this development process. It's right, the Circle develop branch is not as well tested as the master branch. Just let me know when you have a result. BTW, your project sounds interesting.

Rene

davefilip commented 1 week ago

Rene,

I promised I would try to get back to you today, so I am.

Pinging (ICMP) and opening sockets (TCP) on the local address works in my environment for the following hardware configurations, and is stable:

* RPi Zero 2W - WiFi
* RPi 1B - Ethernet
* RPi 2B - Ethernet
* RPi 3B - Ethernet
* RPi 3B - WiFi
* RPi 4B - Ethernet

However, when I try to build and run on an RPi 4B with WiFi, I have new WiFi connection problems (see first screenshot below), and if it does eventually connect (usually after 2 - 3 minutes of trying), I get frequent crashes with a stack trace that ends with a ’Synchronous exception’ error in red (see the second screenshot below). Note that the crashes appear fairly random: not always with the same command (which may have worked before the crash), but usually within a minute or two of running.

[On the hardware configuration where it is stable, I have no new WiFi errors or delays, and I can usually run a dozen or so commands without problem; on the RPi 4B WiFi, it takes much longer to connect to WiFi with new errors, and I get a Synchronous crash after 2 - 3 commands.]

Now, you may say that the changes you have provided have nothing to do with WiFi on a RPi 4B, so it could be something with my build environment. I struggled early on with building Circle only because I wanted some of the stdlib/newlib libraries, but once I build with stdlib/newlib (STDLIB_SUPPORT = 3), I can no longer build anything in the Circle /sample or /addon directories. So what I typically do with a new Circle distribution is:

$ vi Config.mk
$ ./makeall clean
$ ./makeall

$ cd addon/wlan
$ ./makeall clean
$ ./makeall

$ cd addon/wlan/sample/hello_wlan
$ make clean
$ make

… etc also do the same for /addon/fatfs, /addon/SDCard, and then …

$ vi Config.mk    (add back STDLIB_SUPPORT = 3)
$ ./makeall clean
$ ./makeall

So then I can use the standard Circle build scripts to build my project. I explicitly link with a few of the stdlib/newlib libraries to get things like TLS and some of the string functions (but NOT stdio, since stdlib/newlib requires the kernel inheriting from CStdlibAppStdio, which seems to break other things that I’ve done with Circle).

So I am sharing all of this because I am admitting that my build environment might be a bit weird, so if your changes would not affect RPi4 WiFi in any way, then it might be something that I need to fix on my end (I’ve been playing with various rebuilds for the past few hours and have not found a fix so far, but will keep trying).

So in summary, yes, what you have provided mostly works, with one exception that could be something weird in my environment.

Clear as mud? I will let you know if I figure out anything more.

Cheers,

Dave.

rsta2 commented 1 week ago

Dave, thanks for your report!

Unfortunately currently I do not have an idea, how to find the reason for the problem with WiFi on the RPi 4B. I tried the hello_wlan sample in Circle (develop branch) and sample mbedtls/07-mqttclient in circle-stdlib (master branch and Circle develop branch in libs/circle/) in AArch32 and AArch64 modes again without a problem.

I don't think this has to do with the recent change for local connects, because this is a local modification, and one can expect that it has no consequences in other parts of the code.

Yes, your build system makes this problem difficult to analyze. Isn't it possible to generally use circle-stdlib, copy the header file include/circle_stdlib_app.h to your own project, and modify it for your purpose? This is what other projects are doing. If you need support from circle-newlib or circle-mbedtls, you should build with circle-stdlib. Otherwise it is difficult for me to help.

BTW. Your screenshots are not displayed on GitHub.

davefilip commented 1 week ago

Rene,

Yeah, I think the problems I am seeing with WiFi / 4B are probably too deep down in the protocol stack to be affected by your recent changes. So your recent changes are probably good, but I wanted to let you know that I had some issues, which could be (but probably are?) unrelated.

I will continue on this end to try to resolve it. Since the WiFi stack on the 3B / Zero 2W works, it would seem to indicate something in my build environment for the 4B.

My problem is that I want to keep my project and build scripts clean under new releases of Circle, as stdlib/newlib tends to lag behind. Moreover, I want to stay away from CStdlibAppStdio, and also be able to test and take advantage of /addon and /sample within the Circle distribution. The stdlib/newlib stuff doesn’t build/work unless I inherit from the CStdlibAppStdio class.

That said, I am using the TLS library from newlib/stdlib (my main reason for using it), and I have taken advantage of a few of the string functions (just because they came with the package), but not any of the stdin/stdout/stderr pieces which rely on CStdlibAppStdio and a different way (hidden / automatic) of initializing the core device classes within the kernel, and therefore hinders my access to them.

I have had discussions with Stephan about newlib/stdlib, and have built and tested his samples, but I am trying to keep my project much cleaner under Circle “out of the box”.

Nonetheless, to get back to the original subject, I think your recent work is probably good, and I have to fix something in my environment, given how specific the problem is (only WiFi on 4B).

But like a mechanic who works on your car and tells you that everything is fine, but then you drive away and your front wheel falls off, and the mechanic says “Oh, I was wondering what that clunking sound was, but didn’t think it was important?” … I didn’t want to not say anything.

Regards,

Dave.

rsta2 commented 1 week ago

Dave, of course there can be and probably will be bugs in Circle, like in most other software. This is not the question. The question is, how it can be sorted out, when there is a problem. You need to have some sound understanding about the environment, in which the problem occurs, otherwise it is not possible to catch it. And this is the situation at the moment for me, because you have a very special build setup, which I do not know in detail.

Even if it's necessary to inherit from CStdlibAppStdio, it is possible to modify this class by copying the header file, where it is defined, into your own project. I have done this before (e.g. for MiniDexed). But if you have a build environment with which you are fine, of course you do not need to do this.

My problem at the moment is, that I have a ready tested (according to my test procedures) new Circle release waiting for the merge, and now a problem occurs, which is not as obvious as a wheel falling off, and which I cannot sort out (see above).

It was my mistake, that I wanted to have this local connect feature in the new release and not in the next one, because the modification was small, nothing could go wrong, and I thought, you were waiting for it. Sometimes it's better to stay hard as a maintainer. ;)

Please try to sort this out. If you have a specific question, please ask. If the problem remains beyond the end of this month, I will do the release regardless.

davefilip commented 1 week ago

Rene,

Sorry, don’t want to hold you up!

Unfortunately, we’re off to an Oktoberfest now, so I can’t focus on this immediately, however, I can share the following:

  1. I rolled back all of the changes that you have provided, and my project on RPi 4B + WiFi is now stable again (hitting it with lots of network packets, typing over a dozen commands etc., and all good, no longer crashes with Synchronous exception)

  2. The WiFi errors might be a red herring, because it always connects after about a minute and a half, and I can’t say with absolute certainty that the messages shown while it tries and fails to make a WiFi connection are new. More precisely, I haven’t done extensive testing on the RPi 4B with WiFi, it always connects within 2 minutes, and perhaps those errors have always been present; I may have been incorrectly conflating the instability and crashes with WiFi messages that don’t show up on a 3B or Zero 2W.

  3. What I want to test (and this is just a working theory): when I was doing my 4B + WiFi testing yesterday, I had, as I always do, a window on my Mac doing continuous pings. I was also pinging other nodes and pinging the node itself from within the project when I was having those crashes.

  4. Since everything is stable without your patches (but also without any ping / ICMP), I was thinking that perhaps the problem is related to me pinging Circle (Circle ping server) continuously while also pinging out (Circle ping client)? When I get back today, I will test this by putting all of the patches back and testing w/o pinging Circle from my Mac, to see if that solves the stability problem / crashes.

Why could this show up on a 4B and not a 3B? Well, there is slightly different code for 4B networking (I believe prior to the 4B, networking went through the USB host?), and the 4B is faster, so perhaps it is a timing / race condition that gets hit? Just guessing here.

Nonetheless, not sure if this is the problem and my theory is correct, but I’ll give it a try when we get back, and let you know how it goes.

And to be clear, I did not test the ICMP/ping on the 4B + WiFi prior to yesterday, which contained both the ICMP/ping and loopback.

Regards,

Dave.

rsta2 commented 6 days ago

One important thing about pinging is, that after calling EnableReceiveICMP(TRUE) you have to continuously call ReceiveICMP() to remove all received ICMP packets from the queue. If you don't do this and someone is sending ICMP packets (e.g. pings) to your machine, memory will get exhausted sooner or later. Normally this will take longer, but not with a flood ping. Perhaps this can be a reason?
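The effect described here can be shown with a small stand-alone model. The class below is not Circle's API: `IcmpReceiveQueueModel`, `OnPacketArrived` and `DrainIcmp` are hypothetical names, and only the pattern is taken from the comment above, namely that once reception is enabled every incoming ICMP packet is queued until the application fetches it, so the main loop has to drain the queue on every pass.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <utility>
#include <vector>

// Hypothetical model of an ICMP receive queue that only shrinks when polled.
class IcmpReceiveQueueModel {
public:
    void Enable(bool on) { m_enabled = on; }

    // Called by the (simulated) network stack for each incoming ICMP packet.
    void OnPacketArrived(std::vector<std::uint8_t> packet) {
        if (m_enabled) {
            m_queue.push_back(std::move(packet));   // memory grows until drained
        }
    }

    // Fetch one queued packet; returns false when the queue is empty.
    bool Receive(std::vector<std::uint8_t> &out) {
        if (m_queue.empty()) {
            return false;
        }
        out = std::move(m_queue.front());
        m_queue.pop_front();
        return true;
    }

    std::size_t QueuedPackets() const { return m_queue.size(); }

private:
    bool m_enabled = false;
    std::deque<std::vector<std::uint8_t>> m_queue;
};

// Drain everything that is currently queued; meant to be called from the
// main loop on every iteration, whether or not a ping is outstanding.
std::size_t DrainIcmp(IcmpReceiveQueueModel &queue) {
    std::vector<std::uint8_t> packet;
    std::size_t count = 0;
    while (queue.Receive(packet)) {
        ++count;   // a real application would inspect echo replies here
    }
    assert(queue.QueuedPackets() == 0);
    return count;
}
```

If `DrainIcmp` is never called while another host pings the node, `QueuedPackets()` only ever increases, which is the memory exhaustion scenario mentioned above.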

davefilip commented 6 days ago

Rene,

After I got back — and the beer wore off :-)— I did a lot more testing.

The gist is that I think your updates for ICMP / ping and loopback are good.

I found that regardless of whether I run a continuous ping from my Mac to my Circle project, I get the same results (so my prior theory was wrong).

The ’Synchronous exception’ - which is what I am concerned about - only occurs if I run commands on the “console” of my project, being the USB keyboard and HDMI display on the RPi 4B. Therefore, I believe what I am seeing is unrelated to your networking updates.

For background, I have written a command shell that runs either with HDMI+Keyboard or over the network via a telnet server (I modified your Echo server to run on port 23 and accept and run commands).
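A shell that serves both the console and a telnet session can be structured so that command execution does not know where the input line came from. The sketch below is an assumption about how such a shell might be organized, not the project's actual code; `CommandShell`, `Register` and `Execute` are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical command dispatcher shared by all input sources.
class CommandShell {
public:
    using Args = std::vector<std::string>;
    using Handler = std::function<std::string(const Args &)>;

    void Register(const std::string &name, Handler handler) {
        assert(handler != nullptr);
        m_handlers[name] = std::move(handler);
    }

    // Execute one input line and return the text to print. The caller decides
    // where the result goes: the HDMI screen or the telnet socket.
    std::string Execute(const std::string &line) const {
        std::istringstream in(line);
        std::string name;
        if (!(in >> name)) {
            return "";   // empty line
        }
        Args args;
        for (std::string arg; in >> arg;) {
            args.push_back(arg);
        }
        auto it = m_handlers.find(name);
        if (it == m_handlers.end()) {
            return "unknown command: " + name;
        }
        return it->second(args);
    }

private:
    std::map<std::string, Handler> m_handlers;
};
```

With this split, a crash that appears only with HDMI+Keyboard input and never over telnet points away from the command handlers themselves and toward the console-specific input or output path.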

What is interesting is that on a RPi 3B - which is where I do most of my development - I can run and type commands for 10+ minutes on either the HDMI+Keyboard or a telnet session, and not have any crashes.

When I run on a RPi 4B, I can run and type commands for 10+ minutes on a telnet session, and not have any crashes. But when I type commands on HDMI+Keyboard, I can usually get a Synchronous exception in 2 - 3 minutes.

When I am running various commands, I am pinging external sites (Google), pinging other Circle nodes running my project, pinging my own node, running commands remotely on other Circle nodes running my project, running commands through a network socket on the same Circle node, as well as getting external HTTP/HTTPS URLs, and running local commands like FATFS directory listings (just for good measure, and to exercise the code).

Unfortunately, RPi 4B on HDMI+Keyboard I can reliably get a Synchronous exception. RPI 4B on a network (telnet) session, I throw lots of commands at it, and run continuous pings at it, and no crashes.

Likewise, as I said, RPi 3B on either HDMI+Keyboard or a network (telnet) session, I throw lots of commands at it, and run continuous pings at it, and no crashes.

I have noticed that when I run my Circle project on a RPi 4B, that the left-hand margin is about an inch and a half in, and the font is smaller, than when I run on a RPi 1B, 2B, 3B, or Zero 2W. Which doesn’t bother me, but is a difference I’ve noticed with the display only on the 4B.

I do have the HDMI cable installed and connected to the monitor when I boot up Circle on the 4B (I know you had a note somewhere about that). My monitor is a 28.5-inch (3840 × 2160 resolution) dual HDMI monitor from Samsung that I share with my Mac Mini.

So the evidence I have so far seems to indicate the problem might be related to the RPi HDMI display (or USB keyboard)? But does not appear to have anything to do with your network updates, because when I send commands over the network (to my telnet server running on port 23) I don’t see any crashes.

Nonetheless, it is not totally uncommon to find problems unrelated to the changes you are testing while going through regression tests. And I have to admit that until now, I have not done a lot of testing specific to the 4B HDMI+Keyboard. And I accept that it might be something in my code that is affecting using the HDMI+Display on a 4B … but odd that it doesn’t seem to affect older models?

Although, didn’t I read something in your documentation saying that the HDMI display on a 4B was a bit fragile?

Nonetheless, based on my testing and experience, I think your network changes are good to be included in the next step (49?).

BTW - Apple likes taking and storing pictures in HEIC format, which I’ve found is not universal. Therefore, I am re-attaching the screenshots as PNG files. The first one shows errors I am getting connecting to WiFi, but it seems to always connect if I wait up to two minutes, although I don’t see the errors circled in red when booting my Circle project on a 3B or Zero 2W. The second image is an example of a Synchronous exception.

Regards,

Dave.



rsta2 commented 6 days ago

We should look into the file kernel8-rpi4.lst to find out in which function the exception occurs. Unfortunately I cannot see the screenshots. It seems you cannot send images by email; you must log in to GitHub. I also had problems with WiFi on the RPi 4B with Raspberry Pi OS in the past. Sometimes it was difficult to get a connection.
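
As a rough illustration of what looking up an exception address in a listing amounts to, here is a minimal sketch. It assumes the listing labels each function with its start address, so the function containing a given program counter is the one with the highest start address that is not above it. `TSymbol`, `FindFunction()` and any addresses used with them are hypothetical and are not taken from a real kernel8-rpi4.lst.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <string>
#include <vector>

// One function label from a listing (hypothetical structure for this sketch)
struct TSymbol
{
	std::uint64_t nStart;	// start address of the function
	std::string   Name;	// function name
};

// Returns the name of the function which contains nPC, or an empty string
// if nPC lies before the first symbol. Symbols must be sorted by ascending
// start address.
std::string FindFunction (const std::vector<TSymbol> &Symbols, std::uint64_t nPC)
{
	// first symbol which starts behind nPC
	auto it = std::upper_bound (Symbols.begin (), Symbols.end (), nPC,
				    [] (std::uint64_t nAddr, const TSymbol &rSym)
				    {
					    return nAddr < rSym.nStart;
				    });

	if (it == Symbols.begin ())
	{
		return "";
	}

	// the symbol before it is the one which contains nPC
	return std::prev (it)->Name;
}
```

Doing the same by hand means searching the listing for the reported address and scrolling up to the nearest function label. Note that this sketch cannot tell whether an address lies beyond the end of the last function.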

davefilip commented 6 days ago

Rene,

OK, I understand now that I can’t reply by email with images, so I will log in to GitHub and reply there the next time I need to send any.

To be honest, I had not previously looked at any of the {kernel-name}.* files that get auto-generated, but I did just now, and it is cool that I get a full list of the assembly instructions for every function listed! Next time I get a Synchronous exception, I will take a look at this. Having worked in Java for the past 25+ years, I understand how to read Java stack traces, but I agree I need to start understanding C++ stack traces.

The only real problem I’ve had with RPi 4Bs and WiFi on Raspberry Pi OS has been when using them with the Sense HAT, which causes all sorts of stability problems with WiFi, but inconsistently: sometimes it works fine for hours, sometimes it can’t get connected to the AP, sometimes it gets connected, is unstable for 5 minutes and then stable for hours, and then won’t stay connected again.

But again, the problem might be just TMI (Too Much Information), as I have kernel messages on Circle set to Notification when writing code, so I get a ton of information when the 4B boots up and tries to make the WiFi connection, but it always seems to connect successfully if I wait a couple of minutes. So I may have conflated that as being associated with my Synchronous exceptions, only because when I start getting crashes, I pay more attention to every console message. But it is possible that all of those WiFi console messages are benign.

Anyway, I left my Circle project running overnight on the 4B + WiFi (15+ hours), receiving 4 MQTT messages a minute (4 nodes that send “Hello” messages every minute with their IP address and a rotating security token). I spent 10+ minutes yesterday sending commands to it over the network and another 10 minutes or so this morning doing the same, and it is still running and still stable, so I’m feeling fairly confident about the stability of my code when not on the “console” (HDMI+Keyboard).

Next time I get a Synchronous exception on the “console”, I’ll take a look at the .lst file and let you know the function.

Nonetheless, I guess I should expect a “Step 49” release soon? :-)

Cheers,

Dave.

rsta2 commented 6 days ago

Dave, didn't you say you know how to reproduce this synchronous exception? It would be important to sort this out. With the information from the exception I could tell you where to look in the kernel8-rpi4.lst file.

Nonetheless, I guess I should expect a “Step 49” release soon? :-)

Circle 48 is upcoming. There are no plans for Step 49 yet.

davefilip commented 6 days ago

Rene,

Dave, didn't you say you know how to reproduce this synchronous exception?

Yes, but I think I am not very concerned about this, since most of what I do is over the network (and headless RPis), and most of my current goals are for a handful of RPi Zero 2Ws (and not 4Bs).

So it is a matter of priorities, and right now, I am currently focused on planning a holiday in New York City for the end of the week. ;-)

Nonetheless, I also did not want to distract you from getting out a release, and feel bad that I did, so was hoping not to sidetrack you with my crashes that I know how to get around and are not currently holding me back.

So I’m assuming that this was most likely a problem I had before and didn’t notice because I didn’t do enough testing on the 4B “console”, and that it is not related to your new work, which, again, I don’t want to hold you back from releasing.

Circle 48 is upcoming. There are no plans for Step 49 yet.

Sorry, I guess I lost count, thinking 48 was current, when it is 47. You should give interesting names to each release, like Apple does (they used to be big cats, like Tiger, Lion, Leopard, etc., but more recently places in California like Ventura and Monterey). And Debian uses characters from all of the Toy Story movies (Jessie, Stretch, Buster, etc.).

Nonetheless, I will try to make some time today to demonstrate the crash and check the .lst file for the function name and let you know.

Regards,

Dave.

davefilip commented 6 days ago

Fortunately (?!) I was able to switch to the “console” (HDMI+Keyboard) and type in 3 commands to get it to crash (after running 15+ hours without touching the keyboard).

I have opened a new thread, and provided a screen shot of the crash (in PNG) and the kernel .lst file.

So when looking at the stack trace, I assume I am looking at the last program counter address? Or the first?

rsta2 commented 6 days ago

Sorry, I guess I lost count, thinking 48 was current, when it is 47. You should give interesting names to each release, like Apple does (they used to be big cats, like Tiger, Lion, Leopard, etc., but more recently places in California like Ventura and Monterey). And Debian uses characters from all of the Toy Story movies (Jessie, Stretch, Buster, etc.).

No problem. Perhaps I was not creative enough to create sophisticated names for about 50 releases. I think, I will stay with the numbers. ;)

davefilip commented 6 days ago

You could start using Star Wars names … after Disney bought the franchise from George Lucas, there are at least 50 sequels / prequels / spin-offs! ;-)

[I have family members who are really into the whole Star Wars franchise … and got really excited when there was a baby Yoda … me, I just liked the first 3 movies from the 1970s and 1980s, and not anything that came after.]
