mcci-catena / Arduino_Core_STM32

STM32 core support for Arduino

Catena4612 Won't Run Until Connected to PC #189

Closed mhmayyan closed 2 years ago

mhmayyan commented 2 years ago

When I select USB or USB+Hardware Serial for the serial interface and power the board externally via the Vbus pin, some Catena4612 boards won't run the sketch until I connect a USB cable to a computer.

Is there a way to set up the board so that it runs my code and enables the USB serial regardless of the Vbus voltage? Why does it wait for me to connect a USB cable? I need to supply power through the Vbus pin, but the board thinks I am connected to a computer and suspends the sketch until I actually connect it to one; sometimes I even have to open a Serial monitor.

mhmayyan commented 2 years ago

I think the problem is in the Catena_Arduino_Platform library.

I found this

    // the setup function. The enabler function is used for streams like
    // USB Serial, which hang if you talk to them when they're not plugged
    // in...
    void begin(
        Stream *pStream,            // the stream to poll.
        cStreamReady *pReady = nullptr      // optional enabler.
        );
dhineshkumarmcci commented 2 years ago

hi @mhmayyan I believe this is because the operating flag is not set. Can you please verify it using the command system configure operatingflags and check whether it is set to 1 or 0?

If it is set to 0, we suggest you set it to 1 using the command below, which helps the board run on power-up.

mhmayyan commented 2 years ago

Yes, it is set to 1. I still have the same problem. My sketch does not use the operating flags.

dhineshkumarmcci commented 2 years ago

Thank you for letting us know, can you please let us know about the power provided to Vbus pin and also the power source?

Is the PWR (green) LED on while powering from the VBUS pin?

mhmayyan commented 2 years ago

It is a 5V power supply. The green LED is on. The board works when I disable the USB Serial, but sometimes I need it for debugging and to access some data while the system is running.

What makes it weird is that this problem does not show up on all Catena 4612 boards even when using the very same firmware.

I think this problem belongs to the Catena Arduino Platform library. The library monitors the voltage level at the Vbus pin to tell whether the USB is connected to a computer. If so, it blocks until I connect the board to a computer and the USB connection is established.

dhineshkumarmcci commented 2 years ago

hi @mhmayyan we would like to know what sketch is being used at your end. we would also like to know if you are checking USBCON in your sketch and waiting for Serial to enumerate.

can you please confirm if your sketch has similar lines:

    #ifdef USBCON
        // if running unattended, don't wait for USB connect.
        while (!Serial) {
            /* wait for USB attach */
            yield();
        }
    #endif

If these lines are present, can you please remove them and check whether it works?
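For sketches that do want to wait for a host but must also run unattended, a bounded wait is one alternative to an open-ended `while (!Serial)` loop. The following is only a sketch under assumptions: `waitUntilReady` and `checkWaitUntilReady` are hypothetical names, not part of the Arduino core or the Catena library, and the clock and readiness test are passed in so the logic can be tried on a host. It does not help if `Serial.begin()` itself never returns.

```cpp
#include <cassert>
#include <cstdint>

// Poll ready() until it returns true or timeoutMs have elapsed on now().
// Returns true if ready() became true in time, false on timeout.
template <typename ReadyFn, typename ClockFn>
bool waitUntilReady(ReadyFn ready, ClockFn now, std::uint32_t timeoutMs) {
    const std::uint32_t start = static_cast<std::uint32_t>(now());
    while (!ready()) {
        // Unsigned subtraction stays correct if the counter wraps around.
        const std::uint32_t elapsed = static_cast<std::uint32_t>(now()) - start;
        if (elapsed >= timeoutMs) {
            return false;  // give up and let the sketch run unattended
        }
    }
    return true;
}

// Host-side check using a fake clock that advances 1 ms per call.
inline void checkWaitUntilReady() {
    std::uint32_t t = 0;
    auto fakeClock = [&t]() { return t++; };
    assert(!waitUntilReady([]() { return false; }, fakeClock, 50));  // times out
    assert(waitUntilReady([]() { return true; }, fakeClock, 50));    // ready at once
}
```

In a sketch, the assumption is that `ready` would be something like `[] { return bool(Serial); }` and `now` would be `millis`.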

terrillmoore commented 2 years ago

hi @mhmayyan -- you said the library monitors the voltage -- please indicate the location in the library? Thanks, --Terry

mhmayyan commented 2 years ago

Hi @terrillmoore. Sorry, I was confused. I think the library just overrides the function USBD_LL_ConnectionState. The library is awesome ^^. Thanks.

@dhineshkumarmcci I tested the code below to investigate the problem with the four Serial interface settings selectable in Arduino Tools menu. The two options that have USB (i.e., "USB + HW Serial" or "USB Serial") cause the sketch to block at Serial.begin(115200) while the USB cable is disconnected and the board is powered externally via Vbus with 5V. I tested multiple Catena 4612 boards. Only some of them get stuck but not all of them. They all, however, pass the function Serial.begin(115200) once I reconnect the USB cable. Weird!

void flashFast(int halfPeriod)
{
  bool state=1;
  for (size_t rpt = 0; rpt < 10; rpt++) {
      digitalWrite(D13, state^=1);
      delay(halfPeriod);
  }
  digitalWrite(D13, 0);
  delay(1000);
}

void setup() {
  pinMode(D13,OUTPUT);
  /*************************************/
  flashFast(50);
  Serial.begin(115200); /*<<< It gets stuck here*/
  flashFast(100);
  Serial.println("Hello!");
  flashFast(150);
}
void loop() {
  flashFast(1000);
}
terrillmoore commented 2 years ago

Aha. We may owe you big time. This is probably the root cause of a problem we've been having in general with 4612s not recovering from sleep. Let me think about the best way to use this to diagnose this problem remotely.

dhineshkumarmcci commented 2 years ago

@mhmayyan thank you for letting us know about this issue. We will let you know about the software changes required to get this working for you.

dhineshkumarmcci commented 2 years ago

hi @mhmayyan we have updated the API USBD_LL_ConnectionState() in the library to return true only if the vBus value is greater than 4.35V instead of 3.0V. I have pushed the changes to the branch vbus-check-issue in the forked repo:

Can you please use the library with these changes and let us know.

mhmayyan commented 2 years ago

Thank you @dhineshkumarmcci for the help.

I tested the modifications you made. I downloaded that fork and compiled my code using the forked library but it did not solve the problem. Please note that the input voltage I am providing via the pin Vbus is 5V, not even 4.95V.

What we need is that the USB connection should not wait/block until an actual USB cable is connected to the board. We deploy these boards into forests and rivers and we supply them with 5V. It is really annoying when they stop reporting just because they want an actual USB connection to establish.

I don't understand how USB works but I think the MCU can just transmit data when Serial.print() is used and should never care whether or not there is an actual USB connection to a PC.

terrillmoore commented 2 years ago

I agree @mhmayyan; we were just trying to see if the problem was due to the analog input. What's odd is that it should happen on all your boards, not just some of them. Thanks for testing. (On your boards that are not working, the code is the same as on the boards that are working. So there's some hardware difference.)

mhmayyan commented 2 years ago

Thanks @terrillmoore

Yes, I agree, it is really weird that this problem shows up on some boards but not on all of them. Even the boards that have the problem occasionally work. And yes, they are all the same model, i.e., Catena 4612. It happens even among boards from the same production batch, which are supposed to have identical parts and were manufactured at the same time.

terrillmoore commented 2 years ago

@mhmayyan: Can you send me one of the boards that malfunctions reliably, along with repro instructions? I'd like to see if I can figure out what's going on, and it's not obvious.

terrillmoore commented 2 years ago

@mhmayyan Alternately, I can send you a STLINK, and we can do remote debug sessions with Zoom. You'd have to solder some pins on the board, but it's pretty easy -- as long as you're not using D6 and D9 for GPIOs already.

mhmayyan commented 2 years ago

Catena 4612 boards are now like gold. We don't give away our treasures ^_^. Unless you want to trade for replacements.

I will reach out to you soon.

Thank you @terrillmoore

mhmayyan commented 2 years ago

@terrillmoore I just checked my tools and found an ST-LINK V2. Please let me know what I need to prepare for debugging, such as any software I need to install or any other hardware preparations.

terrillmoore commented 2 years ago

Oh, I just realized something. Can you please use the command system configure operatingflags to see the flag values on the "good" and "bad" boards? Thanks. Let's make sure that bit 0 is set in both cases.

terrillmoore commented 2 years ago

For ST-LINK setup, you can refer to this document: HOWTO-DEBUG-WITH-GDB.md. The clip-leads shown in the document definitely work, but it's a lot easier to solder pins onto the Catena and use jumpers like the Adafruit F-F jumpers.

mhmayyan commented 2 years ago

The flag values are 00000001 on both good and bad boards. BTW, we don't use these flags in our deployments.

To the best of my understanding, the issue is in the Arduino core and the system drivers, not in the Catena Arduino Platform library.

I was able to debug but so far I am unable to make a final conclusion. However, I can say that it gets stuck at the following lines in file stm32/3.0.4/cores/arduino/stm32/usb_serial.cpp


  if (USBD_Init(&hUSBD_Device_CDC, &CDC_Desc, DEVICE_FS) == USBD_OK)
  {

    /* Add Supported Class */
    if (USBD_RegisterClass(&hUSBD_Device_CDC, USBD_CDC_CLASS) == USBD_OK)
    {

      /* Add CDC Interface Class */
      if (USBD_CDC_RegisterInterface(&hUSBD_Device_CDC, &USBD_Interface_fops_FS) == USBD_OK)
      {

        /* Start Device Process */
        USBD_Start(&hUSBD_Device_CDC);
        m_Started = true;
      }
    }
  }

I followed the function USBD_Init(&hUSBD_Device_CDC, &CDC_Desc, DEVICE_FS) and it got stuck at each of the following lines in file stm32/3.0.4/cores/arduino/stm32/USB/usbd_conf.c:

  HAL_PCDEx_PMAConfig(&g_hpcd, 0x00, PCD_SNG_BUF, 0x10);
  HAL_PCDEx_PMAConfig(&g_hpcd, 0x80, PCD_SNG_BUF, 0x50);

//  HAL_PCDEx_PMAConfig(&g_hpcd, 0x01, PCD_DBL_BUF, 0x01400100);
//  HAL_PCDEx_PMAConfig(&g_hpcd, 0x81, PCD_DBL_BUF, 0x01C00180);
  HAL_PCDEx_PMAConfig(&g_hpcd, 0x01, PCD_SNG_BUF, 0x0100);
  HAL_PCDEx_PMAConfig(&g_hpcd, 0x81, PCD_SNG_BUF, 0x0180);

  HAL_PCDEx_PMAConfig(&g_hpcd, 0x02, PCD_SNG_BUF, 0x0200);
  HAL_PCDEx_PMAConfig(&g_hpcd, 0x82, PCD_SNG_BUF, 0x0280);
terrillmoore commented 2 years ago

If it's getting stuck, it's not a loop; there are no loops in any of this code (or in the code called) [as far as I can see].

When you say "it gets stuck", can you be more specific?

mhmayyan commented 2 years ago

So, what I mean is that when I use the gdb command "s" at the line HAL_PCDEx_PMAConfig(&g_hpcd, 0x00, PCD_SNG_BUF, 0x10); for example, gdb won't show anything and even the MCU does nothing until I hit ctrl+c, after which gdb shows that the MCU is running some other code in a different file. That code is usually an interrupt routine such as USB_IRQHandler or HAL_PCD_IRQHandler.

If I just connect the USB cable and disconnect it the MCU runs all those lines smoothly without getting stuck.

mhmayyan commented 2 years ago

I was running the following code while I was debugging. This code does not use the library Catena Arduino Platform, and that is why I believe the issue is in the Arduino core or in the system drivers.

void flashFast(int halfPeriod)
{
  bool state=1;
  for (size_t rpt = 0; rpt < 10; rpt++) {
      digitalWrite(D13, state^=1);
      delay(halfPeriod);
  }
  digitalWrite(D13, 0);
  delay(1000);
}

void setup() {
  pinMode(D13,OUTPUT);
  /*****/
  flashFast(50);
  Serial.begin(115200); /*<<< It gets stuck here*/
  flashFast(100);
  Serial.println("Hello!");
  flashFast(150);
}
void loop() {
  flashFast(1000);
}

terrillmoore commented 2 years ago

Ah! This means it's a hot interrupt. (Single step in GDB can "get away" if an interrupt occurs.)

Can you obtain a stack backtrace (gdb info stack) when you break in after s gets away from you?

mhmayyan commented 2 years ago

Do you mean right after I hit ctrl+c to break out when the single step in GDB gets stuck?

terrillmoore commented 2 years ago

Yes, that's right. If the ctrl+C drops you back to the GDB prompt, and you see that you're in a different routine, info stack will show the back trace and will help me figure out how you got there (and maybe help me figure out which interrupt is stuck hot, if my guess is correct).

The guess is that an interrupt source is unmasked and interrupts are enabled, but the ISR that is called does not do anything to actually clear the interrupt source. So the ISR is called over and over and the processor never makes forward progress in the background.
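The hot-interrupt guess can be illustrated with a toy host-side model. This is only a sketch: `IrqModel`, `runSlots` and the two handlers are hypothetical names invented for illustration and do not correspond to any code in the BSP or HAL.

```cpp
#include <cassert>

// Toy model of one interrupt source: a pending flag and a mask bit.
struct IrqModel {
    bool pending;
    bool unmasked;
};

// Run maxSteps "CPU slots". When the interrupt is pending and unmasked the
// handler gets the slot; otherwise the main program does.
// Returns how many slots the main program received.
template <typename Handler>
unsigned runSlots(IrqModel &irq, unsigned maxSteps, Handler handler) {
    unsigned mainSteps = 0;
    for (unsigned i = 0; i < maxSteps; ++i) {
        if (irq.pending && irq.unmasked) {
            handler(irq);  // handler preempts the main program
        } else {
            ++mainSteps;   // main program makes forward progress
        }
    }
    return mainSteps;
}

// A handler that services the source clears the pending flag.
inline void clearingHandler(IrqModel &irq) { irq.pending = false; }

// A handler that returns without clearing leaves the interrupt "hot".
inline void nonClearingHandler(IrqModel &) {}

inline void checkHotInterruptModel() {
    IrqModel hot{true, true};
    assert(runSlots(hot, 100, nonClearingHandler) == 0);       // main code starved
    IrqModel serviced{true, true};
    assert(runSlots(serviced, 100, clearingHandler) == 99);    // one slot for the ISR
    IrqModel masked{true, false};
    assert(runSlots(masked, 100, nonClearingHandler) == 100);  // masked source is ignored
}
```

In the model, either clearing the source in the handler or masking the source lets the main program run again.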

mhmayyan commented 2 years ago

Here is the stack:

#0  HAL_PCD_IRQHandler (hpcd=hpcd@entry=0x20000e38 <g_hpcd>)
    at /home/USER/.arduino15/packages/mcci/hardware/stm32/3.0.4/system/Drivers/STM32L0xx_HAL_Driver/Src/stm32l0xx_hal_pcd.c:349
#1  0x08008d24 in USB_IRQHandler ()
    at /home/USER/.arduino15/packages/mcci/hardware/stm32/3.0.4/cores/arduino/stm32/USB/usbd_conf.c:315
#2  <signal handler called>
#3  HAL_PCD_Init (hpcd=hpcd@entry=0x20000e38 <g_hpcd>)
    at /home/USER/.arduino15/packages/mcci/hardware/stm32/3.0.4/system/Drivers/STM32L0xx_HAL_Driver/Src/stm32l0xx_hal_pcd.c:214
#4  0x08008d5c in USBD_LL_Init (pdev=pdev@entry=0x2000080c <hUSBD_Device_CDC>)
    at /home/USER/.arduino15/packages/mcci/hardware/stm32/3.0.4/cores/arduino/stm32/USB/usbd_conf.c:348
#5  0x08009786 in USBD_Init (pdev=pdev@entry=0x2000080c <hUSBD_Device_CDC>, 
    pdesc=<optimized out>, id=id@entry=0 '\000')
    at /home/USER/.arduino15/packages/mcci/hardware/stm32/3.0.4/system/Middlewares/ST/STM32_USB_Device_Library/Core/Src/usbd_core.c:122
#6  0x0800a4f6 in USBSerial::begin (this=this@entry=0x200007f8 <SerialUSB>)
    at /home/USER/.arduino15/packages/mcci/hardware/stm32/3.0.4/cores/arduino/stm32/usb_serial.cpp:65
#7  0x080052fc in USBSerial::begin (baud=115200, this=<optimized out>)
    at /home/USER/.arduino15/packages/mcci/hardware/stm32/3.0.4/cores/arduino/stm32/usb_serial.h:48
#8  setup ()
    at /home/USER/debugUSB_problem/debugUSB_problem.ino:23
#9  0x0800a1ce in main ()
    at /home/USER/.arduino15/packages/mcci/hardware/stm32/3.0.4/cores/arduino/main.cpp:52
terrillmoore commented 2 years ago

OK. So it certainly must be looping on a hot interrupt. I imagine that it's getting a continuous suspend interrupt. We can look at the registers from GDB after you press ctrl+C:

x/4xw 0x40005c40

This will dump the USB CNTR, ISTR, FNR and DADDR registers.
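As a reading aid for the dump, the four words can be mapped to register names and addresses. This is a sketch that assumes the STM32L0 USB peripheral base address of 0x40005C00 and the usual register offsets; `kUsbBase`, `kDumpedRegisters` and `registerAddress` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Assumed STM32L0 USB full-speed device peripheral base address.
constexpr std::uint32_t kUsbBase = 0x40005C00u;

struct UsbRegister {
    const char *name;
    std::uint32_t offset;  // byte offset from kUsbBase
};

// The four consecutive 32-bit words shown by "x/4xw 0x40005c40", in order.
constexpr UsbRegister kDumpedRegisters[4] = {
    {"CNTR", 0x40u}, {"ISTR", 0x44u}, {"FNR", 0x48u}, {"DADDR", 0x4Cu},
};

// Address of the wordIndex-th word in the dump (0..3).
inline std::uint32_t registerAddress(unsigned wordIndex) {
    assert(wordIndex < 4);
    return kUsbBase + kDumpedRegisters[wordIndex].offset;
}
```

So the first column of the dump is CNTR at 0x40005C40, followed by ISTR, FNR and DADDR at 4-byte steps.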

mhmayyan commented 2 years ago

After it got stuck when running HAL_PCD_Init(&g_hpcd);

(gdb) x/4xw 0x40005c40
0x40005c40: 0x0000bf00  0x00002100  0x0000d000  0x00000000

And this is what I got after it was running HAL_PCDEx_PMAConfig(&g_hpcd, 0x00, PCD_SNG_BUF, 0x10);

(gdb) x/4xw 0x40005c40
0x40005c40: 0x0000bf00  0x00002100  0x0000d800  0x00000000
terrillmoore commented 2 years ago

checking.

terrillmoore commented 2 years ago

The interrupt pending flags are 0x2100.

Both those bits are enabled in the control mask, (0xBF00).

The "USB_FNR" register has some bits indicating external state. 0xD800 means "RXDP is high, RXDM is high, LCK is false, and LSOF is 11".

RXDP and RXDM both high is an error -- that's why ERR is hot.

Now need to look at the schematics.
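The register reading above can be reproduced with a small host-side decoder. This is a sketch that assumes the STM32L0 reference-manual bit layout (ERR at bit 13 and ESOF at bit 8 in both ISTR and CNTR; RXDP, RXDM, LCK, LSOF and FN in FNR); the names `decodeFnr`, `activeInterrupts` and `checkDumpedValues` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Bits 15..8 of ISTR (pending) and CNTR (mask) line up one-to-one.
constexpr std::uint32_t kErr  = 1u << 13;  // ISTR.ERR  / CNTR.ERRM
constexpr std::uint32_t kEsof = 1u << 8;   // ISTR.ESOF / CNTR.ESOFM

// Fields of USB_FNR that reflect the state of the bus.
struct FnrState {
    bool rxdp;      // bit 15: level seen on D+
    bool rxdm;      // bit 14: level seen on D-
    bool lck;       // bit 13: frame-number lock
    unsigned lsof;  // bits 12..11: lost-SOF counter
    unsigned fn;    // bits 10..0: frame number
};

inline FnrState decodeFnr(std::uint32_t fnr) {
    FnrState s;
    s.rxdp = (fnr & (1u << 15)) != 0;
    s.rxdm = (fnr & (1u << 14)) != 0;
    s.lck  = (fnr & (1u << 13)) != 0;
    s.lsof = static_cast<unsigned>((fnr >> 11) & 0x3u);
    s.fn   = static_cast<unsigned>(fnr & 0x7FFu);
    return s;
}

// Interrupt sources that are both pending (ISTR) and unmasked (CNTR).
inline std::uint32_t activeInterrupts(std::uint32_t cntr, std::uint32_t istr) {
    return cntr & istr & 0xFF00u;
}

// Check against the values from the GDB dump in this thread.
inline void checkDumpedValues() {
    assert(activeInterrupts(0xBF00u, 0x2100u) == (kErr | kEsof));
    const FnrState s = decodeFnr(0xD800u);
    assert(s.rxdp && s.rxdm && !s.lck);  // both data lines read high
}
```

Decoding on a host like this avoids counting bits by hand when comparing dumps from different boards.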

terrillmoore commented 2 years ago

Nothing obvious. The thing to do is manually disable the ERR interrupt and see if the problem clears up. You do this by breaking in while it's looping, then saying:

p ((unsigned *)0x40005c40)[0] = 0x9F00

That will clear the ERRM mask bit in USB_CNTR, so the ERR interrupt should not recur. Or so I think, anyway...

Then say c to continue, and the program might come up. If not, ctrl+C again, and find out where you are with a backtrace, and check the USB registers again.
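The value written from GDB is simply the dumped CNTR value with bit 13 cleared. A minimal sketch of that arithmetic, assuming ERRM is bit 13 of USB_CNTR; `maskErrInterrupt` and `checkMaskValue` are hypothetical helper names, not BSP functions.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kUsbCntrErrm = 1u << 13;  // USB_CNTR.ERRM (assumed bit 13)

// Return the CNTR value with the error interrupt masked, other bits unchanged.
inline std::uint32_t maskErrInterrupt(std::uint32_t cntr) {
    return cntr & ~kUsbCntrErrm;
}

// The dumped CNTR value was 0xBF00; masking ERR gives the value used in GDB.
inline void checkMaskValue() {
    assert(maskErrInterrupt(0xBF00u) == 0x9F00u);
}
```

All other interrupt enables in CNTR are left as they were, so only the ERR source stops raising interrupts.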

mhmayyan commented 2 years ago

Yes, it worked. It did not get stuck again after using the command p ((unsigned *)0x40005c40)[0] = 0x9F00

terrillmoore commented 2 years ago

OK, I will prepare a patch for the BSP. This is just an error in the BSP code we inherited.