MartyMacGyver / ESP32-Digital-RGB-LED-Drivers

ESP32 Digital RGB(W) LED Drivers
MIT License

RMT conflict with esp32-owb #22

Open lukecyca opened 6 years ago

lukecyca commented 6 years ago

If I use this library in the same project as esp32-owb by @DavidAntliff, I get the following reboot loop as soon as I hit digitalLeds_initStrands.

Guru Meditation Error: Core  1 panic'ed (LoadProhibited)
. Exception was unhandled.
Core 1 register dump:
PC      : 0x40084c86  PS      : 0x00060033  A0      : 0x80081ab4  A1      : 0x3ffb0bc0
0x40084c86: rmt_driver_isr_default at /Users/luke/Code/esp32/esp-idf/components/driver/./rmt.c:769 (discriminator 1)

A2      : 0x0000001a  A3      : 0x04000000  A4      : 0x00000000  A5      : 0x00000002
A6      : 0x04000000  A7      : 0x3ffba450  A8      : 0x3ff56000  A9      : 0x00000001
A10     : 0x00000001  A11     : 0x00000000  A12     : 0x80085fb9  A13     : 0x00000001
A14     : 0x00060021  A15     : 0x00060f23  SAR     : 0x00000006  EXCCAUSE: 0x0000001c
EXCVADDR: 0x00000010  LBEG    : 0x4000c2e0  LEND    : 0x4000c2f6  LCOUNT  : 0xffffffff
Core 1 was running in ISR context:
EPC1    : 0x40084c86  EPC2    : 0x00000000  EPC3    : 0x00000000  EPC4    : 0x400d2120
0x40084c86: rmt_driver_isr_default at /Users/luke/Code/esp32/esp-idf/components/driver/./rmt.c:769 (discriminator 1)

0x400d2120: esp_vApplicationIdleHook at /Users/luke/Code/esp32/esp-idf/components/esp32/./freertos_hooks.c:85

Backtrace: 0x40084c86:0x3ffb0bc0 0x40081ab1:0x3ffb0bf0 0x400824ad:0x3ffb0c10 0x400e4c60:0x00000000
0x40084c86: rmt_driver_isr_default at /Users/luke/Code/esp32/esp-idf/components/driver/./rmt.c:769 (discriminator 1)

0x40081ab1: shared_intr_isr at /Users/luke/Code/esp32/esp-idf/components/esp32/./intr_alloc.c:773

0x400824ad: _xt_lowint1 at /Users/luke/Code/esp32/esp-idf/components/freertos/./xtensa_vectors.S:1105

0x400e4c60: i2c_master_cmd_begin at /Users/luke/Code/esp32/esp-idf/components/driver/./i2c.c:1187

Rebooting...

The relevant sections of code are:

#define WS2812_PIN (27)
#define WS2812_PIN_MASK (1ULL<<WS2812_PIN)
void ws2812_task(void *pvParameter)
{

    gpio_config_t io_conf;
    io_conf.intr_type = GPIO_PIN_INTR_DISABLE;
    io_conf.mode = GPIO_MODE_OUTPUT;
    io_conf.pin_bit_mask = GPIO_SEL_27;
    io_conf.pull_down_en = 0;
    io_conf.pull_up_en = 0;
    gpio_config(&io_conf);

    strand_t strands[] = {
        {.rmtChannel = RMT_CHANNEL_2, .gpioNum = WS2812_PIN, .ledType = LED_WS2812B_V3, .brightLimit = 255, .numPixels =  3,
        .pixels = NULL, ._stateVars = NULL}
    };

    if (digitalLeds_initStrands(strands, 1)) {
        ESP_LOGE(CM02_TAG, "Failed to initialize WS2812");
        vTaskDelete(NULL);
        return;
    }

    uint8_t x = 0;
    while (true) {
        strands[0].pixels[0] = pixelFromRGB(255-x, x, 0);
        strands[0].pixels[1] = pixelFromRGB(0, 255-x, x);
        strands[0].pixels[2] = pixelFromRGB(x, 0, 255-x);
        x++;
        digitalLeds_updatePixels(&strands[0]);
        vTaskDelay(30/portTICK_PERIOD_MS);
    }
}

#define DS18B20_PROBE_PIN 16
static float temp_probe_value = -1;
static void temp_probe_task(void *pvParameter) {
    DS18B20_ERROR err;

    OneWireBus * owb;
    owb_rmt_driver_info rmt_driver_info;
    owb = owb_rmt_initialize(&rmt_driver_info, DS18B20_PROBE_PIN, RMT_CHANNEL_1, RMT_CHANNEL_0);
    owb_use_crc(owb, true);

    DS18B20_Info * ds18b20_probe;
    ds18b20_probe = ds18b20_malloc();
    ds18b20_init_solo(ds18b20_probe, owb);
    ds18b20_use_crc(ds18b20_probe, true);
    ds18b20_set_resolution(ds18b20_probe, DS18B20_RESOLUTION_12_BIT);

    while (1) {
        ds18b20_convert_all(owb);

        // Wait 1s, and then wait longer if the conversion isn't done
        vTaskDelay(1000/portTICK_PERIOD_MS);
        ds18b20_wait_for_conversion(ds18b20_probe); // block

        err = ds18b20_read_temp(ds18b20_probe, &temp_probe_value);
        printf("  %.1f    %d errors\n", temp_probe_value, err);
        // FIXME: check errors
    }
}

I have taken care to specify different RMT channels, and I think the pins I'm using for each peripheral are valid. If I disable one, the other works perfectly.
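As a sanity check on the channel assignments above, here is a minimal host-side sketch in C. It assumes the original ESP32 has 8 RMT channels, each with its own 64-word memory block, and that a channel configured with more than one memory block borrows the blocks of the channels after it. `rmt_claim_t` and `rmt_claims_are_disjoint` are hypothetical names for illustration and are not part of either library.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RMT_NUM_CHANNELS 8  /* assumption: original ESP32, 8 RMT channels */

/* Hypothetical description of one RMT user; not a type from either library. */
typedef struct {
    const char *owner;   /* label for diagnostics only */
    uint8_t channel;     /* first RMT channel used */
    uint8_t mem_blocks;  /* consecutive memory blocks claimed, at least 1 */
} rmt_claim_t;

/* Returns true if no two claims touch the same channel or memory block. */
bool rmt_claims_are_disjoint(const rmt_claim_t *claims, size_t count)
{
    uint8_t used = 0;  /* bit n set means block n is already claimed */

    for (size_t i = 0; i < count; i++) {
        uint8_t blocks = claims[i].mem_blocks;

        /* Reject claims that are empty or run past the last channel. */
        if (blocks == 0 || claims[i].channel >= RMT_NUM_CHANNELS ||
            claims[i].channel + blocks > RMT_NUM_CHANNELS) {
            return false;
        }

        uint8_t mask = (uint8_t)(((1u << blocks) - 1u) << claims[i].channel);
        if (used & mask) {
            return false;  /* overlaps an earlier claim */
        }
        used |= mask;
    }
    return true;
}
```

With the assignments from the code above (owb on channels 1 and 0, the LED strand on channel 2), and assuming each library uses a single memory block per channel, the check passes. That would be consistent with the crash not being a simple channel clash.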

DavidAntliff commented 6 years ago

Interesting that you see this - I've been seeing a similar problem with my set of five DS18B20 devices on a single one-wire bus, but only when they are on long cables (a few metres each). It'll run fine for a day or two, then crash, but then continue to crash unless manually reset. The thing is, I'm also using a second RMT channel - to gate a signal for frequency counting. I wonder if there's something in the driver that comes to light when more than one channel is in use?

lukecyca commented 6 years ago

@DavidAntliff thanks for the quick reply! Awesome libraries BTW (I'm using your I2c-lcd1602 one too).

I am using a single DS18B20 on a 3m cable. For me it is not intermittent. It crashes reliably in the same place. I tried adjusting the RMT channels used by each library, to no effect.

Let me know if there are any troubleshooting steps that come to mind. In the meantime I switched to the GPIO version of owb and everything works great.

DavidAntliff commented 6 years ago

@chmorgan actually wrote the RMT driver, so he might be interested in helping us out with this. If you're able to crash it 100% then that could be useful. Does it still crash if the LED strip(s) are not physically connected? If so, perhaps I can run your code on my board and see if I see the same thing?

I wonder if the physical length of the DS18B20 cable causes an issue. Does it crash if you put your DS18B20 on a really short cable (or use a discrete 3-pin device rather than the sealed version)? What if you modify your code a bit to make it attempt to communicate with the DS18B20 even if it's disconnected? What OWB resistor value are you using?

chmorgan commented 6 years ago

@lukecyca I've seen that those panics don't always point out the appropriate line that caused the panic. In this case rmt.c:769 is the end of rmt_wait_tx_done().
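Since the decoded source lines can be misleading, it can help to resolve the raw addresses yourself. The following is a minimal host-side sketch in C that pulls the PC values out of a `Backtrace:` line, assuming each frame is printed as `PC:SP` in hex as in the dump above. `backtrace_pcs` is a hypothetical helper name, not part of ESP-IDF.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Extract the PC values from an ESP32 "Backtrace:" line, where each frame
 * is printed as PC:SP in hex. Returns the number of PCs stored in pcs[].
 */
size_t backtrace_pcs(const char *line, uint32_t *pcs, size_t max_pcs)
{
    static const char prefix[] = "Backtrace:";
    size_t n = 0;

    const char *p = strstr(line, prefix);
    if (p == NULL) {
        return 0;  /* not a backtrace line */
    }
    p += sizeof(prefix) - 1;

    while (n < max_pcs) {
        unsigned int pc, sp;
        int consumed = 0;

        /* %x accepts an optional 0x prefix; %n records how far we read. */
        if (sscanf(p, " %x:%x%n", &pc, &sp, &consumed) != 2) {
            break;
        }
        pcs[n++] = (uint32_t)pc;
        p += consumed;
    }
    return n;
}
```

The extracted addresses can then be passed to `xtensa-esp32-elf-addr2line -pfiaC -e` followed by your project's ELF file, and compared against the lines the monitor printed.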

It's certainly possible there is something up in the library. I did take a significant amount of care with the rmt changes, especially around ensuring the structures and structure offset approach was correct but again that doesn't mean there isn't an issue there.

I don't see any usage of rmt_wait_tx_done() in @DavidAntliff's library. Are you calling this function from another section of code not included in your original post? I'm not familiar with that lcd library.

lukecyca commented 6 years ago

> Does it still crash if the LED strip(s) are not physically connected? Does it crash if you put your DS18B20 on a really short cable?

I've actually just got 3 discrete WS2812 packages soldered to the PCB (to be used as status indicators). So unfortunately it's not too easy to disconnect them. However, I flashed my code onto a devboard with nothing connected, and I get identical results.

> I don't see any usage of rmt_wait_tx_done() in @DavidAntliff's library. Are you calling this function from another section of code not included in your original post? I'm not familiar with that lcd library.

Based on a quick grep, there isn't any call to rmt_wait_tx_done() from anything in my source tree (either my code or any components I'm using). To my knowledge the only things in my project using RMT at all are this library (ESP32-Digital-RGB-LED-Drivers) and esp32-owb.

If either of you want a copy of my project, I can email it to you. You should be able to flash it to any ESP32 and reproduce this issue. You can find my email on my profile page.

MartyMacGyver commented 6 years ago

I'm sorry to hear of this problem. I'm not sure what to add to this yet since I've not encountered the problem and I don't have any deep debugging equipment for my devices handy. However, if two different libraries are attempting to manage the RMT peripheral it might lead to a conflict.

DavidAntliff commented 6 years ago

I haven't had a chance to test Luke's code yet. However, I noticed today that if some other part of my code crashes (I was causing crashes by deliberately breaking the TCP stack), then quite often, though not always, the RMT peripheral driver fails to initialise properly on the next reboot. That causes further reboots until I intervene manually.

So this makes me wonder if it's possible for the RMT peripheral to get caught in a bad state that the init code isn't properly dealing with. Perhaps for this particular issue, one of these two RMT-utilising tasks causes such a state, which the other task then bumps into when it tries to initialise.

Just speculation at this point though.

DavidAntliff commented 6 years ago

I'm currently investigating an issue that affects programs that use more than one RMT channel (which esp32-owb does), and it might be related:

https://github.com/espressif/esp-idf/issues/1815

Try adding this to the very start of the program:

periph_module_disable(PERIPH_RMT_MODULE);
periph_module_enable(PERIPH_RMT_MODULE);

If that prevents the repeated crashing, then it suggests you may be seeing the same problem I'm currently chasing. It's probably not a very good long-term fix though, as it's a bit of a sledgehammer. I'm working on a PR to fix it properly, once I get some feedback from Espressif on the best direction to take.

pkruger commented 5 years ago

@MartyMacGyver Has this issue been resolved? I'm running into exactly the same thing...

DavidAntliff commented 5 years ago

@pkruger the issue I mentioned on March 28 2018 was fixed by Espressif in 3.0 but may have reared its head again in newer versions. What IDF version are you using? Have you tried the periph_module_disable/enable workaround?

MartyMacGyver commented 5 years ago

I was cleaning out old tickets - I've reopened this.

I'm not sure if this was ever resolved in the ESP-IDF. From the perspective of my library there's not much to be done. It is possible to use the system interrupt handler, and that might have a positive effect, but it incurs a significant performance penalty. FastLED has implemented that alternative, so give it a try, though that assumes it would have any effect on the problem seen here.

The specific setting is described here: https://github.com/FastLED/FastLED/blob/bbcbb4017ced2f63e521bc1cf7e7b9669da1b1cb/platforms/esp/32/clockless_rmt_esp32.h#L43-L59

Personally, I'd use a DS2482 or similar and let it manage the protocol conversion - using the RMT channels for various purposes at once is tricky at best, especially when doing high-bandwidth operations like driving LEDs.