rsta2 / circle

A C++ bare metal environment for Raspberry Pi with USB (32 and 64 bit)
https://circle-rpi.readthedocs.io
GNU General Public License v3.0

Crash on RPi 4B + WiFi on Console #490

Open davefilip opened 1 month ago

davefilip commented 1 month ago

Don't see this crash with same build on 1B+Ethernet, 2B+Ethernet, 3B+Ethernet, 3B+WiFi, or 4B+Ethernet, but can reproduce fairly easily on 4B+WiFi.

It could be that it is just more easily reproducible on 4B+WiFi due to timing issues, as I usually don't test for more than about 10 mins on the other hardware configurations, and can usually reproduce it within 1 - 2 mins on 4B+WiFi.

I should note that when it does crash, it is not consistently on the same command, but usually on a command that touches the network (in this example getting an HTTP URL, but sometimes a ping, sometimes executing a command on another Circle node). All of these are commands that, even on the 4B+WiFi "console", will work one or more times, but then eventually cause a crash.

If I stay off the console and send commands over the network (using a telnet server I've written), I have left it running > 15 hours and sent dozens of similar commands over the network (telnet) and not had a crash.

davefilip commented 1 month ago

CircleCrash

davefilip commented 1 month ago

kernel8-rpi4.lst.zip

rsta2 commented 1 month ago

Thanks! I will analyze this and come back.

davefilip commented 1 month ago

Thanks Rene! I see it looks like it is in the USB keyboard … which I thought was the same across models? Let me know if you want me to send you my USB keyboard loop, as it is possible I am doing something wrong. Unless the exception is starting further down in the USB stack?

rsta2 commented 1 month ago

Yes, it happens down in the interrupt handler of the xHCI device driver. This is the call stack (bottom-up):

000000000009ac10 <CUSBFunction::GetHost() const>:
000000000009e098 <CUSBHIDDevice::StartRequest()>:
000000000009e188 <CUSBHIDDevice::CompletionRoutine(CUSBRequest*)>:
00000000000959a4 <CXHCIEventManager::HandleEvents()>:
00000000000938b0 <CXHCIDevice::InterruptHandler()>:

The this pointer is wrong when CUSBFunction::GetHost() is called, and the exception occurs when it is dereferenced, unfortunately for an unknown reason (EC=0). I do not understand yet why this happens. I have to think about it.
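
The failure mode described above (a completion routine running with a bad this pointer) can be illustrated with a minimal sketch. Everything here is hypothetical: DemoFunction, CompletionStub and the marker value are invented for illustration and are not Circle's CUSBFunction or CUSBHIDDevice code. The sketch assumes the callback receives its object through an untyped context pointer and shows one possible sanity check before the object is used.

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical stand-in for a USB function object; not Circle's CUSBFunction.
class DemoFunction {
public:
    static constexpr std::uint32_t kMagic = 0x55534246; // arbitrary marker value

    DemoFunction() : m_magic(kMagic), m_host(42) {}

    // True only if the marker is intact, i.e. the object looks alive.
    bool IsValid() const { return m_magic == kMagic; }
    int GetHost() const { return m_host; }

    // Overwrite the marker to simulate a stray write (demo hook only).
    void SimulateCorruption() { m_magic = 0xDEADBEEF; }

private:
    std::uint32_t m_magic;
    int m_host;
};

// Completion callback in the style of a C-like driver interface:
// the context pointer is cast back to the object before use.
// Returns the host id, or -1 if the context fails the sanity check.
int CompletionStub(void *context) {
    auto *self = static_cast<DemoFunction *>(context);
    if (self == nullptr || !self->IsValid()) {
        std::printf("bad context pointer %p\n", context);
        return -1;
    }
    return self->GetHost();
}
```

A check like this only helps when the bad pointer still refers to readable memory. If the pointer is far outside mapped memory, reading the marker would itself raise the exception, so this is a logging aid at best, not a fix.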

rsta2 commented 1 month ago

Dave, this is more complicated than I was hoping, and I cannot solve it quickly. Maybe we need to add some debug logging code to your program and see what happens. Or I can try to reproduce this here with WiFi and a keyboard attached. But this will take some time and you are about to go on a trip. Thank you so far and have a nice trip!

Rene

davefilip commented 1 month ago

Rene,

Thanks for the feedback, and yes, I’ve read about the sensitivity / stability of the RPi USB hub.

Although is the USB hub different between the 4B and earlier models?

To be honest, USB is what brought me to Circle, since as I said previously, I was working on taking bits and pieces of other “bare metal” Raspberry Pi example operating systems and combining them, but never got very far with anything that worked across all models (and none of them really did USB correctly). I believe someone ported the Circle USB driver from C++ to C, which I started taking a look at, and then decided “Hey, Circle already has drivers working for all of the hardware I need working across all the models, so maybe I should just learn C++”, which I did. Sort of, still figuring some of it out. Like when I was asking questions about sharing data and pointers across tasks, I didn’t understand C++ static functions / variables, as the syntax and use is a bit different than Java, although I get it now.

But in summary, even if there could be a bug somewhere, you’ve done much better than anything else I found for bare metal USB on a RPi!

Regards,

Dave.

rsta2 commented 1 month ago

The RPi 4 has a USB 3.0 xHCI USB host controller. USB 1.x and 2.0 devices are connected via an integrated USB hub to one of the USB root ports. RPi 1-3 has a DesignWare OTG USB 2.0 host controller. Some RPi models have an external (but on-board) USB hub.

Thanks for using Circle! I hope, we can sort this issue out soon.

Rene

davefilip commented 1 month ago

No worries, as I said, not currently on the critical path for what I hope to accomplish in the near future!

Besides, as an IT contractor (now retired), I learned that nobody wanted to pay my hourly rate for perfection, but they were willing to pay my hourly rate to solve the business problem before them. So the key was a balance between work that I felt good enough about for the minimum number of hours to reliably solve their business problem, but nothing beyond.

Now that I am retired … and the water purification project — which will be the first business use of this project — is going slower than I had expected … I can now spend more time doing what I want, but still focusing specifically on what I need over the next 6+ months.

For example, I currently have absolutely no interest in the RPi 5. To be honest, I think the RPi Foundation jumped the shark on that one, because for me, the idea of a single board computer like a RPi is something that is cheap, low power, low heat, and solidly reliable. Not adding more and more power and more cost so that it basically becomes a desktop computer. To me it is ridiculous that the RPi 5 now requires 3000 mA, and throws off so much heat that there are now liquid cooled heat sinks for it!

Otherwise, if I want a small footprint desktop, I can spend around US$100 for a 3 inch square 4 GHz Intel computer with 16 GB RAM, 512 GB SSD, multiple HDMI and USB 3 ports, and throw Linux on it! [which, actually, I did a few years ago]

davefilip commented 1 month ago

Ah got it! OK, sorry for not being up-to-date on the differences. I knew the speed on 2x of the 4B ports were faster (USB 3 vs 2), but didn’t understand the architectural changes.

Which is one of the reasons I am building on Circle, because I don’t have to know the differences! ;-)

Some RPi models have an external (but on-board) USB hub.

That I had no idea of! Perhaps one of the differences between some models with and without the plus (+)?

BTW - I have more than a dozen RPis running 24x7 (running a couple different versions of the Raspberry Pi Operating System), and all but a few are headless (no keyboard or display). Of the few that have a display (7” HDMI), they have a mouse but no keyboard. When I do use a keyboard on a RPi, I usually use a VNC console over the network, in a window on my Mac. Gist is I am (personally) not big into typing on RPi USB keyboards. I am doing so on this Circle project mostly just for regression testing between significant code changes.

rsta2 commented 1 month ago

Dave, from the synchronous exception info and your kernel8-rpi4.lst I have not been able to find the reason for the issue so far; it may be a side effect. I extended the sample mbedtls/06-webclient in the circle-stdlib project with a simple command interpreter, which reads a command (currently only "list" or "reboot") from a USB keyboard and displays the RPi revision list or reboots.

With this I found a problem in CMachineInfo, which can be fixed with this patch, but I'm afraid that's not the reason for the issue you have reported:

diff --git a/lib/machineinfo.cpp b/lib/machineinfo.cpp
index 744a150f..66fbe8ec 100644
--- a/lib/machineinfo.cpp
+++ b/lib/machineinfo.cpp
@@ -308,13 +308,13 @@ CMachineInfo::~CMachineInfo (void)
 {
        m_MachineModel = MachineModelUnknown;

+       if (s_pThis == this)
+       {
 #if RASPPI >= 4
-       delete m_pDTB;
-       m_pDTB = 0;
+               delete m_pDTB;
+               m_pDTB = 0;
 #endif

-       if (s_pThis == this)
-       {
                s_pThis = 0;
        }
 }

But perhaps you will try your program with this patch. With it I can enter "list" several times without a problem.
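
One plausible reading of the patch is that only the first CMachineInfo instance (the one recorded in s_pThis) should free m_pDTB. The sketch below illustrates that ownership rule under a stated assumption: that later instances copy the pointer from the first one. The thread does not show the constructor, so DemoInfo and DemoBlob are hypothetical stand-ins and not Circle's actual code.

```cpp
// Hypothetical stand-in for a device tree blob; counts live instances.
struct DemoBlob {
    static int s_live;
    DemoBlob() { ++s_live; }
    ~DemoBlob() { --s_live; }
};
int DemoBlob::s_live = 0;

// Simplified illustration of the ownership rule in the patch.
// This is NOT Circle's CMachineInfo; names and layout are assumptions.
class DemoInfo {
public:
    DemoInfo() : m_blob(nullptr) {
        if (s_this != nullptr) {
            // Secondary instance: borrow the primary's resource.
            m_blob = s_this->m_blob;
            return;
        }
        s_this = this;
        m_blob = new DemoBlob;   // primary instance owns the resource
    }

    ~DemoInfo() {
        // Only the primary instance may free the shared resource.
        // Freeing it from a secondary instance would leave the primary
        // with a dangling pointer.
        if (s_this == this) {
            delete m_blob;
            m_blob = nullptr;
            s_this = nullptr;
        }
    }

    bool HasBlob() const { return m_blob != nullptr; }

private:
    DemoBlob *m_blob;
    static DemoInfo *s_this;
};
DemoInfo *DemoInfo::s_this = nullptr;
```

With the delete outside the `s_this == this` check, destroying a temporary secondary instance would free the blob while the primary still points at it. With the check in place, the blob survives until the primary is destroyed.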


For example, I currently have absolutely no interest in the RPi 5. To be honest, I think the RPi Foundation jumped the shark on that one, because for me, the idea of a single board computer like a RPi is something that is cheap, low power, low heat, and solidly reliable. Not adding more and more power and more cost so that it basically becomes a desktop computer. To me it is ridiculous that the RPi 5 now requires 3000 mA, and throws off so much heat that there are now liquid cooled heat sinks for it!

I think the RPi 5 is good for very CPU-intensive applications. It has much more CPU power than the RPi 4, so for example audio synthesizers or AI apps can profit from it. But yes, for many other apps the older RPi models should be sufficient. I have a small heat sink on my RPi 5, but usually use it without a fan. I don't think that it draws 3A all the time. This is required when you have some USB devices connected.

davefilip commented 1 month ago

Rene,

Actually, one of the commands that I always try when testing is “show computer all”, which does use CMachineInfo to display the RPi model (as well as how much memory is configured, but it also lists the IP address, router IP, and DNS IP, as well as a list of running tasks, and how much memory is available). Again, I use it to hit a few different subsystems during regression testing. So machineinfo.cpp is a bit of code I am hitting during my tests, and potentially could be a side effect.

I see that machineinfo.cpp was updated about 2 months ago in the develop branch. So is this diff from Step 47 master, or develop? Or if I wait until ( … tomorrow?), can I just copy in machineinfo.cpp from the develop branch, assuming it will have this patched there?

I just don’t want to apply a patch to the wrong version of machineinfo.cpp (not sure whether that would get flagged or merely appear to work).

I am currently running Step 47, patched with the files you have updated for MQTT client, ICMP / ping client, and network loopback, as well as suppressing non-critical spurious USB warnings. So Step 47 with the following from the develop branch:

circle-develop/test/ping-client/*
circle-develop/include/circle/net/netsubsystem.h
circle-develop/include/circle/net/networklayer.h
circle-develop/lib/net/mqttclient.cpp
circle-develop/lib/net/linklayer.cpp
circle-develop/lib/net/netsubsystem.cpp
circle-develop/lib/net/networklayer.cpp
circle-develop/lib/usb/dwhcidevice.cpp

Cheers,

Dave.

rsta2 commented 1 month ago

Dave, the problem with CMachineInfo came in with the commit 8151b905 from Aug 25 2024, so in Circle 47 this problem does not exist and if you did not modify lib/machineinfo.cpp, the patch will not help.

davefilip commented 1 month ago

Great, thanks for clarifying. So I don’t need this patch since I am using that file from Step 47 master branch.

rsta2 commented 1 month ago

Dave, so what to do? When there were such difficult (possibly side-effect) problems in the past, it was possible to reproduce the issue in a small program, which I was able to debug here locally. We did this years ago, when there was a problem with the spin lock (EnterCritical()) implementation, which was also difficult to find. As it seems, this is not possible in our case. I'm afraid the reason cannot be found without JTAG debugging.

Would it be possible for you to send your kernel8-rpi4.exe file to me (e.g. by e-mail, my address is on GitHub), so that I can debug it? I need the .exe file for GDB, not the .img file. I don't know whether it is difficult to configure it so that I am able to enter some commands to trigger the exception. I think I can debug this without the sources. At the moment I do not have another idea.

Thanks,

Rene

davefilip commented 1 month ago

Rene,

Sure, that sounds like a plan. Give me some time to test / package, since I’m still developing, and don’t have any documentation (other than in the code itself), and it expects a few files on the SDHC card (which I haven’t had the time to test without).

I also have no problem sharing the source with you, if it would help. My eventual plan is to open source the project, once I have enough of what I think could be useful to someone else.

Today I’m spending the day in meetings with a non-profit that is providing resources to the water purification project I’m working on.

Tomorrow I am finalizing our holiday plans (spending a 4-day weekend in New York City, and trying to cram as much as I can in … my wife is the type of person who doesn’t do well with figuring things out once you get there).

Friday morning we’re traveling, so unless I can find enough time on Thursday, I probably won’t get back to you until early next week.

FWIW - I am less concerned about this short term because right now I am using RPi 3Bs on the water purification project, and was planning on adding a couple of Zero 2Ws for additional sensors. I have not seen any crashes on those models.

But while I do understand that you want to make Circle as resilient and bug free as possible, it is possible that the root cause is somewhere in my code, which could be corrupting an internal data structure somewhere. Not that I know where, but part of the fun in building an OS with non-protected memory is that a bug in one place can cause a crash in another. But you know that well.

I’ll get you something to test as soon as I can, but again, might not be until sometime next week.

Regards,

Dave.

rsta2 commented 1 month ago

Dave, these are good prospects, thanks. Take the time you need. It's also not a problem now if the new release is delayed one or two weeks. There is no schedule for this anyway.

Yes, sometimes it can be a problem, that there is no memory protection.

BTW. I mean the .elf file of course, not the .exe.

Rene

rsta2 commented 4 weeks ago

Dave, I hope you enjoyed your trip to NYC and you are back again in the meantime. We spoke about the possibility to JTAG debug your application here on my machine to be able to solve this issue. I want to ask for your status regarding this. I would need your kernel8-rpi4.elf file and maybe a few (?) configuration files.

Thanks,

Rene

davefilip commented 4 weeks ago

Rene,

Yes, we are back, and had a great time in NYC.

Before I left for NYC, I updated my code base to automatically create a couple of configuration files if they were not present on the SDHC card.

Yesterday and today I started creating some basic documentation. This, plus the config file auto-create, are things I would have normally done at the end of the project, but probably not a bad idea to get started now, before the project gets too far along and more complex.

I have some meetings to go to tomorrow, but I plan on working over the weekend to get the documentation good enough to give to you by Monday (maybe late my Monday, which means you might not get until your Tuesday morning).

My original plan was to strip everything down before I gave it to you — by changing one header file, I can create a minimal boot kernel with no services — but then I thought that would probably not be a good test, since stripping it down may affect whatever the problem might be.

So rather than sending you a long email with an informal brain dump and possibly forgetting some bits, I created a basic web site where I can start building the basic documentation. Not complete by Monday, and still a work in progress, but filled out with the bits that I think are relevant.

Therefore, I have not forgotten about this, and will try to get you something no later than COB Monday my time (EST), if that is acceptable. At least that was my plan.

Regards,

Dave.

rsta2 commented 4 weeks ago

Dave, this sounds like a good plan! Thanks for the report and for working on this.

Rene

davefilip commented 3 weeks ago

Rene,

Just a quick update, that I might not get you something to test until Tuesday. However, I don’t anticipate it going any later than Tuesday.

This weekend I got distracted with — again, something that I was planning on addressing towards the end of the project — doing a clean build of colorOS (my operating system) that is not convoluted.

Partly due to my confusion, and partly due to the differences between Circle as-is and Stephan’s stdlib/newlib as-is, I had a convoluted build process that required downloading and building GitHub Circle and stdlib/newlib with embedded Circle separately (for each RPi model), and building GitHub Circle twice (both with and without STDLIB_SUPPORT = 3, to get around problems with some of the /addon/ code when STDLIB_SUPPORT = 3).

Needless to say, it was a mess, so I now have a build process that works with the Circle built into stdlib/newlib, so the process is now a lot cleaner (just one download and build per RPi model).

As I was writing the documentation, I thought about what would happen if you (or someone else further down the line) wanted to build my project for testing purposes, and I was loath to document my original convoluted build process.

So I am back to documenting / testing / packaging something for you, which you might not get until Tuesday. But if you or someone else ever does need to do a test build of my project, it is now a lot easier (and hopefully doesn’t leave the person scratching their head in confusion).

Regards,

Dave.

rsta2 commented 3 weeks ago

Dave, thanks for the update and for making the build process cleaner. I'm looking forward to getting access to your program, and I hope I can find the reason for the synchronous exception.

Rene

davefilip commented 3 weeks ago

Rene,

How can I get you files? I think there are restrictions / limits in terms of what can be sent via this channel?

You have my e-mail address (@.***); let me know if there is an out-of-band way to send you files.

Thanks,

Dave.

rsta2 commented 3 weeks ago

Dave, unfortunately your e-mail address was removed by GitHub, but you can send me an e-mail. My address is in the header of most Circle source files, or on my GitHub front page.

Thanks,

Rene

davefilip commented 3 weeks ago

Rene,

Apologies - yes your email address is in all of your source files - which I have probably seen a million times but never noticed! I blame the filter in my brain that automatically filters out legal stuff and focuses only on the code! ;-)

I will send you some files today. To be honest, I have no background in JTAG debugging, but I think you said you needed the kernel8-rpi4.elf file? Do you also need the related .lst or .map files? Or just the .elf?

Can I also send you the RPi3 (kernel8.*) file(s) as well, for comparison, or does that not help? I’m thinking since it seems to work better on a RPi3?

Let me know and I’ll email you files later today. As I said previously, although we know the crash is happening in the keyboard USB driver, we don’t know whether the bug is in Circle core itself, or something that my code is doing to corrupt a data structure in Circle core. Therefore I am sending you an image built with all services enabled, so that hopefully you can replicate the problem and let me know if it is something that I am doing to corrupt Circle.

Cheers,

Dave

rsta2 commented 3 weeks ago

Dave, no problem, focus is essential! So that's OK.

Yes, I need just the kernel8-rpi4.elf file. I hope that's enough; I will get back to you if not. ;) I think the RPi 3 kernel image is not necessary at the moment. My plan for the beginning is to set a breakpoint on the exception handler, and when it is hit, I can examine some data structures to see whether something was overwritten.
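
Checking whether a data structure was overwritten can also be approximated in software, without a debugger. The following is a minimal sketch under stated assumptions: OverwriteWatch and DemoTable are hypothetical names, and the FNV-1a hash is chosen here only for brevity. Nothing in it comes from Circle or from the program being debugged. It records a checksum of a memory region that is not supposed to change and compares it again later.

```cpp
#include <cstddef>
#include <cstdint>

// 32-bit FNV-1a hash over a raw memory region.
std::uint32_t Fnv1a(const void *data, std::size_t size) {
    const auto *bytes = static_cast<const unsigned char *>(data);
    std::uint32_t hash = 2166136261u;      // FNV offset basis
    for (std::size_t i = 0; i < size; ++i) {
        hash ^= bytes[i];
        hash *= 16777619u;                 // FNV prime
    }
    return hash;
}

// Hypothetical structure that should stay constant after setup.
struct DemoTable {
    std::uint32_t entries[8];
};

// Remembers the checksum of a region at construction time and reports
// whether the region still hashes to the same value later.
class OverwriteWatch {
public:
    OverwriteWatch(const void *data, std::size_t size)
        : m_data(data), m_size(size), m_expected(Fnv1a(data, size)) {}

    bool Intact() const { return Fnv1a(m_data, m_size) == m_expected; }

private:
    const void *m_data;
    std::size_t m_size;
    std::uint32_t m_expected;
};
```

Calling Intact() at a few points along a suspect code path narrows down when a stray write happens. It only works for data that is meant to stay constant, and a checksum can in principle miss a change, so it complements a debugger rather than replacing it.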

Thanks,

Rene

rsta2 commented 3 weeks ago

For the record: We were not able to solve this issue so far. While @davefilip can reproduce the exception, I was not able to do this on my local RPi 4B. Therefore I'm also not able to JTAG debug the issue to find the reason.