LIFsCode / ELOC-3.0

Firmware for ELOC 3.0 Hardware
MIT License
3 stars 3 forks source link

Merge bugs #50

Closed EDsteve closed 9 months ago

EDsteve commented 10 months ago

@OOHehir @LIFsCode

After a few tests i found a few bugs which got introduced during the merge. I compared it with the FW before the merge which does not have the following bugs.

  1. Power consumption rose from 18mA to 30mA I have done a few measurements with different devices.
  2. Every time i connect with the app to the ELOC. A folder is created on the SD card. A folder should only be created when i start recording
  3. Recordings stop randomly after a few hours (due to restart?)
  4. Stand by mode is gone. Before the merge. The ELOC went into a stand by mode (5mA) after a while (60 sec?) of doing nothing

EDIT: Hmm. Somehow i can't delete a issue when i accidentaly created one (with Strg+ Enter). Wasn't finished writing but i guess i have to continue with it in a comment.

EDsteve commented 10 months ago

@OOHehir @LIFsCode Okay. More details:

  1. Power consumption I will add some logs and graphs about this soon.
  2. Empty folder Correction: This folder is actuallly created with every restart. Doesn't matter if i connect with the app or not.
  3. I have done three tests so far. The recording stoped between 1 and 3 hours. This problem is difficult to "debug" without correct time stamps. That seems not implemented yet. All the recordings always get the time from the day i flashed the firmware (I think?). The ELOC should get the time from the app every time the app connects to an ELOC. Is that something we can add soon? Joseph is ready to implement the command to send the time from app to the ELOC. Is there any tests you want me to do about it?

Hope that doesn't become a too big problem. Cheers ED

EDsteve commented 10 months ago

I added the missing stand by mode in my first post.

OOHehir commented 10 months ago

@EDsteve @LIFsCode I should have a bit of time to look at these issues this week.

  1. Regarding the power consumption is that with or without the inference running? I haven't resolved the sdk config differences & I think there was a change in the power config in main.cpp which might explain the difference.
  2. Folder, hopefully straightforward to resolve.
  3. Random crash, the idf-monitor can automatically parse the backtrace report against the firmware.elf generated during the build process. It can really help pinpoint the cause of the crash. If it doesn't work for you if you can include the firmware.elf corresponding to the crash it would help.
EDsteve commented 10 months ago

@OOHehir

  1. I measured 16KHz recording without AI running.
  2. Hope so too :)
  3. I have no clue how to use the idf-monitor. But will try to read about it. In the meantime: I have catched the log file when the recordings stops. That should give some insights: Rec_stop_log.txt
OOHehir commented 10 months ago

@EDsteve looks like you have the backtrace decode already setup (using Arduino IDE?), your log points to

7 0x400e1df0:0x3ffbc000 in app_main at src/main.cpp:1088

Could prove very useful!

LIFsCode commented 10 months ago

@OOHehir @EDsteve

Regarding the power consumption is that with or without the inference running? I haven't resolved the sdk config differences & I think there was a change in the power config in main.cpp which might explain the difference.

I notices the following differences:

https://github.com/LIFsCode/ELOC-3.0/issues/7 0x400e1df0:0x3ffbc000 in app_main at src/main.cpp:1088

Could prove very useful!

The WavFile Writer mode is continously created using new in the while(true) loop

    while (true)
    {
        auto new_mode = new WAVFileWriter::Mode;

And it is newer destroyed as the pointer will just be overwritten, but not the instance. This will cause the heap to explode.

Better keeping it on the stack and just remove the new. that should solve it :)

OOHehir commented 10 months ago

@LIFsCode @EDsteve The bug should hopefully be resolved now. Impressive how powerful backtrace can be sometimes to track down bugs..

I should have more in the next day or two

OOHehir commented 10 months ago

To do:

OOHehir commented 10 months ago

@LIFsCode

Do you have a link for this code? Can't seem to find it.

  • ElocSystem::pm_check_ForRecording() is not called is not called anymore prior to recording. This is required to fix the >32 kHz noise bug
OOHehir commented 10 months ago

To do:

* [ ]  ElocSystem::pm_check_ForRecording() is not called is not called anymore prior to recording. This is required to fix [the >32 kHz noise bug](https://github.com/LIFsCode/ELOC-3.0/issues/pm_check_ForRecording) @OOHehir will resolve.

* [ ]  Since the tasks (EI & WAV Writer) are not blocked via Queue, they will be frequently waked up from FreeRTOS this could affect the CPU clocking scheme. @OOHehir will investigate using the FreeRTOS [event group ](www.freertos.org/FreeRTOS-Event-Groups.html)to more efficiently wake tasks

* [x]  The Buffer of the WAV Writer is [allocated in PSRAM](https://github.com/LIFsCode/ELOC-3.0/blob/84e9cff7429ac961effb74d5ac46b6502d974c45/eloc610LowPowerPartition/lib/wav_file/src/WAVFileWriter.cpp#L28) @OOHehir do you see any problems in moving this to internal RAM Heap? PSRAM has significant higher power footprint than the internal due to the additional SPI interface involved. I believe internal RAM should be quicker also, @OOHehir will investigate.

@EDsteve @LIFsCode

This commit adds options in 'project_config.h':

/////////////////////////////////// Memory Related configurations ///////////////////////////////////

// Place buffers in PSRAM, otherwise in RAM
// Possible power savings & performance improvement
#define I2S_BUFFER_IN_PSRAM
#define WAV_BUFFER_IN_PSRAM     // too big to fit in RAM??
#define EI_BUFFER_IN_PSRAM

Moving the buffers for I2S & EI works OK (I haven't done any power measurements).

However moving the WAV buffer to RAM currently fails. I suspect its because the wav buffers are are currently allocated near the end of the setup code & at that point there isn't sufficient chunks of memory big enough (although there may be more than required memory but in fragmented state). I'll work task 2 & come back to this if it looks like it'll be a possible power saving option.

OOHehir commented 10 months ago

@EDsteve @LIFsCode This commit changes the wav writer from a 'polling' delay to one 'woken' from the I2S task. I don't see a huge improvement (just looking at the performance monitor output) but hopefully power consumption will be reduced somewhat.

LIFsCode commented 10 months ago

@OOHehir moving the HEap from PSRAM to RAM will be the biggest effect on power consumption keeping the CPU idle ~25% (best achieved by keeping all tasks in blocked state most of the time). Will definetely help but with a lower effect.

Since the EI will increace power consumption anyway due to higher computational performance I would suggest:

While the EI buffer should remain in PSRAM.

Thus we can have a low power mode with EI disabled and a high power mode with EI enabled.

EDsteve commented 10 months ago

@OOHehir @LIFsCode After a bit of testing with this commit some results:

No problems with long term recordings any more. I have recorded several times more than 12 hours without issues.

Bad news: The power consumption didn't change much. All measurements with 16K recordings and BT off.

Before Merge -> 19mA

Before_Merge_1s Before_Merge_200ms

After merge -> 30mA:

After_Merge_1s After_Merge_200ms

LIFsCode commented 10 months ago

Can you try with disabled logging to SD card just to keep this out.

EDsteve commented 10 months ago

@LIFsCode Good point. I just disabled it. But can't see a difference so didn't upload the new pictures. If you need. Let me know. Config i used:

{
  "device": {
    "fileHeader": "not_set",
    "locationCode": "unknown",
    "locationAccuracy": 99,
    "nodeName": "ELOC_SD_CONFIG_Samsung"
  },
  "config": {
    "secondsPerFile": 60,
    "cpuMaxFrequencyMHZ": 80,
    "cpuMinFrequencyMHZ": 10,
    "cpuEnableLightSleep": true,
    "bluetoothEnableAtStart": true,
    "bluetoothEnableOnTapping": true,
    "bluetoothEnableDuringRecord": true,
    "bluetoothOffTimeoutSeconds": 60,
    "logConfig": {
      "logToSdCard": false,
      "filename": "/sdcard/log/eloc.log",
      "maxFiles": 10,
      "maxFileSize": 5120
    }
  },
  "mic": {
    "MicType": "ns",
    "MicBitShift": 11,
    "MicSampleRate": 16000,
    "MicUseAPLL": true,
    "MicUseTimingFix": true
  }
}
OOHehir commented 10 months ago

@OOHehir moving the HEap from PSRAM to RAM will be the biggest effect on power consumption keeping the CPU idle ~25% (best achieved by keeping all tasks in blocked state most of the time). Will definetely help but with a lower effect.

Since the EI will increace power consumption anyway due to higher computational performance I would suggest:

* I2S_BUFFER_IN_PSRAM --> to RAM

* WAV_BUFFER_IN_PSRAM --> to RAM

While the EI buffer should remain in PSRAM.

Thus we can have a low power mode with EI disabled and a high power mode with EI enabled.

@LIFsCode Sounds like a good approach, I'll work on it today.

@EDsteve To implement this change the wav buffer will (very) likely needed to be allocated once only at boot time. This means that it won't be possible to change the wav file recording time without rebooting the device. I don't think this will be an issue, but need to confirm?

EDsteve commented 10 months ago

@OOHehir Power consumption has the highest priority. If the user must reboot to change recording time in order to achieve lower power consumption. Yes. That's ok :)

OOHehir commented 10 months ago

@EDsteve @LIFsCode I've just pushed a new commit to master which now, by default (set in project_config.h) places the I2S & WAV buffers in RAM. Looking at the performance monitor output the change looks promising, the wav_writer task has moved from approx 7 to 10% to approx 1%. I've only tried it using the eps32dev build, rather that the task including the EI feature.

The downside is that currently the WAV buffer appears close to its maximum size, increasing it by any significant amount generates errors about data not fitting in particular sections.

OOHehir commented 10 months ago

@LIFsCode @EDsteve It looks like the wav buffer size is limited to approx 9700 (int16_t), anymore & the inference appears to run out of memory & fails to run.

LIFsCode commented 10 months ago

@OOHehir that sounds very promising and matches the theorie that the PSRAM Spi access has a major impact.

Concerning the WavFile block size: I would guess we can reduce the size even further without major impact as long as it stays within these constraints:

Increasing buffer size will reduce context switches, thus reducing load & power, but I wouldn't expect much effect beyond 1024-2048 buffer size.

Actually I think the optimal size would be the DMA buffer size (assuming it's multiple of 512).

Concerning the heap usage: I would suggest allocating wavfilewriter etc statically, to get memory checks at compile time. @OOHehir any concerns about that? It's not urgent and I'm free to do it, just didn't want to interfer with your current work and synchronize with you.

OOHehir commented 9 months ago

@LIFsCode The current DMA buffer length is 1000 bytes for some reason & the current DMA buffer count is 18 which seems, well, strange also. Not sure if there's a particular reason for that? Perhaps, as you suggest, 2 or 3 DMA buffers of 2028 & matched in the WAVFileWriter would be a better fit? The only other factor I can think of is if we want the core to enter sleep during writes & the min amount of time for that to occur.

With regards the changes you suggested I've started with the wav_writer.

LIFsCode commented 9 months ago

@OOHehir sleep mode (deep & light) is not available when I2S is active, because the I2S module requires the system clock.

So the only effect should the reduced context switches.

EDsteve commented 9 months ago

@OOHehir @LIFsCode Due to not enough time i was only able to do power consumption tests so far. And YES. We are back on track. It is on pair with the consumption before the merge. Awesome. :)

About stand by (sleep) - which is not working after the merge yet When i measure the stand by consumption (using the FW before the merge). It shows around 6mA. From what i have read here. The power consumption can be less than 1mA if the MCU is paused. Is that something we can achieve as well?

OOHehir commented 9 months ago

@EDsteve @LIFsCode ED, It's great to hear the power consumption looks better.

With regards sleep, from @LIFsCode previous comment, it looks like no sleep is possible while I2S sampling is occurring. This would mean that sleep is possible if both the wav writer & EI inference are NOT occurring. Is this a situation you see possible? I see code for Deep sleep in main.cpp but haven't tried it.

EDsteve commented 9 months ago

@OOHehir Yes. That is fine. The sleep mode is used in situations if someone forgets to turn off the device, carries the ELOC days through the jungle and arrives with an empty battery :)

OOHehir commented 9 months ago

@EDsteve OK, understandable.

On what circumstances should the ELOC enter & exit light and/or deep sleep? E.g. 30 secs after startup & wav writer & EI inference disabled? Bluetooth status?

How should it exit this state? I think GPIO0 button press or reboot is the only option from deep sleep but @LIFsCode may know more..

EDsteve commented 9 months ago

@OOHehir Yes. You got it right. If the device is ON, and not doing anything (No BT connection, no recording, no EI) it should go to sleep. Let's say after 5 minutes or so for now. With the firmware before the merge. The device wakes up from the knock sensor on GPIO12 (This might not help in the walking-through-jungle scenario). But let's keep it for now like that.

If i remember correctly some pins are able to wake up the ESP32 from deep sleep. Maybe that's the reason the ESP32 still uses 6mA instead of <1mA in deep sleep? @LIFsCode

LIFsCode commented 9 months ago

Hi @EDsteve @OOHehir Let me intervene at this point ;-)

deep sleep @EDsteve asked for a intruder alarm based on accelerometer repeated action that will cause the buzzer to constantly beep. If that's still the case, I doubt that anyone would carry around the ELOC with forgetting to turn off.

Anyway if the deep sleep option remains a requirement we should be very careful with condition for entering deep sleep to avoid other problems, e.g. a ranger won't notice that the device entered deep sleep and wandering why it won't respond.

Concerning wakeup source: I assume a simple power cycle by the on/off switch would be the more stable (I know and less elegant ^^)variant. It avoids accidentally turning it on by something pressing on the button in the backpack. The on/off switch is a bit more robust.

Light sleep This is already enabled by the power config. The freeRTOS is automatically entering light sleep, once no wake locks are taken. Light sleep means that the main system clocks are stopped and only the low power RTC clock is running.

That means the following modules prevent light sleep:

Besides that there is an effect of the PSRAM on power during light sleep, but I would treat this as negligible as it's in the range of a few 10s uA.

Sorry it has been a long post, but the power management is quite a topic. I hope I've been comprehensible

OOHehir commented 9 months ago

@EDsteve Based on the above do you still think this feature is a requirement?

@LIFsCode If this is a requirement perhaps this could be implemented by a simple task that:

OOHehir commented 9 months ago

@OOHehir @LIFsCode

After a few tests i found a few bugs which got introduced during the merge. I compared it with the FW before the merge which does not have the following bugs. .........

  1. Every time i connect with the app to the ELOC. A folder is created on the SD card. A folder should only be created when i start recording .......

This should now be resolved in the latest commit.

EDsteve commented 9 months ago

@LIFsCode @OOHehir

Intruder alarm:

I doubt that anyone would carry around the ELOC with forgetting to turn off.

I had the same thought when making the buzzer table :) Therefore I was already thinking about a setting in the app "Arm device during record. ON/OFF". That would only have the device armed while recording after it's already set up and not while setting up or carrying around for example. It's not a mandatory feature. But I would love to test that because i am sure it will help with monkeys and humans.

Sleep: I have been out in the field with rangers. They will deal with 50 and more devices. Forgetting to turn some devices off is a common mistake. Then the battery gets drained while sitting in the basecamp or on the way in the jungle. Its no fun to realize that you carry around a few KG of empty batteries for a few days. (Not all ELOCs use solar panels)

But your doubts are true too. It can be confusing for the rangers when the device doesn't respond. It's similar with the BT time-out. Easy to get confused when i don't find the ELOC in the BT list of the app any more. And wonder what happened. (Yes. The app is online, but still work in progress.) That's because BT timout is now set to 60 sec just for testing. But later will be set to 30 minutes or more to give enough time setting up the device. (BLE could be always on and would make life easier in that regard. But that's a huge task if i see that right.) Also the sleep mode should happen after one hour or so and not after 5 minutes as i sugested earlier for testing. That should reduce the chances of confusion.

Waking up the device from the ON/OFF switch sounds good to me. I think that is the first thing everybody does when something doesn't respond: Restart :)

Deep sleep or intruder alarm are handy features which would make a difference in my opinion, but this feature might be more impartant which also uses deep sleep.

@LIFsCode Now i feel more sorry for a even longer post :) @OOHehir Thanks for the new commit. I will check it out soon.

LIFsCode commented 9 months ago

@EDsteve @OOHehir ok let's keep this out of this issue it's getting too mix up and actually has nothing to do with the AI merge.

Let's keep the deep sleep discussion all together here

OOHehir commented 9 months ago

I think this task is complete, there are some power/sleep issues but now in a different issue so I'll close this to tidy things up