zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0
10.82k stars 6.6k forks source link

spi_nor: enter_dpd() called twice when CONFIG_SPI_NOR_IDLE_IN_DPD=y #70146

Closed lxbrz closed 5 months ago

lxbrz commented 7 months ago

Description

In spi_nor_configure at the very end enter_dpd()is called twice when CONFIG_SPI_NOR_IDLE_IN_DPD=y which causes the flash chip to exit DPD on the second call and stay in stand-by mode, thus consuming way more current.

This will be fixed by https://github.com/zephyrproject-rtos/zephyr/pull/69640 and I create this issue only for reference for others that might search issues for this problem.

I have a board here with nothing on it but an nrf52840 and a flash chip (macronix MX25R8035F) where I am testing/measuring the current consumption with a high accuracy power supply and ampere meter. With the firmware just sleeping and all peripherals powered off I get a current draw of ~3 uA at 3V which aligns with the expected value of 2,35 uA from the nrf52840 product specification. This is with CONFIG_SPI_NOR_IDLE_IN_DPD=n and manually suspending the flash and the spi bus (pm_device_action_run(device, PM_DEVICE_ACTION_SUSPEND)).

I noticed that with CONFIG_SPI_NOR_IDLE_IN_DPD=y the current draw is around 10 uA, so the flash apparently is not in deep power-down. I was able to trace this down the root cause: enter_dpd() is called twice, once by the last release_device() from spi_nor_configure() and a second directly from the bottom of spi_nor_configure(). Sending theSPI_NOR_CMD_DPDtwice causes the flash to exit deep power-down. Funny enough, sending the command 3 times will let the flash go into deep-power down again. Sending an even number ofSPI_NOR_CMD_DPDlet's the device be in stand-by mode, odd number of SPI_NOR_CMD_DPD` puts it in deep power-down πŸ˜„

To Reproduce

Expected behavior

Impact

Logs and console output

Environment (please complete the following information):

Additional context

github-actions[bot] commented 7 months ago

Hi @lxbrz! We appreciate you submitting your first issue for our open-source project. 🌟

Even though I'm a bot, I can assure you that the whole community is genuinely grateful for your time and effort. πŸ€–πŸ’™

github-actions[bot] commented 5 months ago

This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this issue will automatically be closed in 14 days. Note, that you can always re-open a closed issue at any time.

lxbrz commented 5 months ago

Since https://github.com/zephyrproject-rtos/zephyr/pull/69640 is still open, this issue should also be kept open and the stale label removed please.

krbvroc1 commented 5 months ago

@mjchen0 @de-nordic

After spending multiple days trying to figure out why current consumption was too high on a custom board, I ran into this same issue.

I also found this was reported as a problem and apparently ignored on the Nordic forums over 3 1/2 years ago! See https://devzone.nordicsemi.com/f/nordic-q-a/68469/zephyr-spi-nor-flash-driver-incorrectly-sends-dpd-command-twice-at-end-of-init

@aescolar - I am not sure why this issue has been mark as low priority. Due to this bug, battery powered devices that believe they have put the Macronix family of flash devices into Deep Power Down state will consume much higher current - according to the datasheet the low power mode current is 5uA typical / 24uA max. The Deep Power-down Current is 7nA typical / 350nA max. So I don't think this is low priority at all.

@lxbrz Did you find this fix gave you the advertised savings? After applying the pull request, I did verify on a logic analyzer that only a single Deep Power Down (DBh) was issued, rather than twice, but I found the flash device is still consuming about 8uA. That was down from the 14uA without the fix. I don't know if there are other issues with Zephyr / Nordic not powering off the SPI or what, but is not near the advertised savings. What are you doing with your other flash pins - do you have WP# pulled high or are you controlling it from the 52840? Are you doing anything else special to make sure the SPI bus is actually powered off or something?

lxbrz commented 5 months ago

@krbvroc1 IIRC the current draw went down indeed, but was still higher than expected. We ended up switching to QSPI, which is when we found the expected current savings. I could indeed have something to do with the SPI bus not being powered down or something, but unfortunately I can't look into this problem further. Good luck!

I also found this was reported as a problem and apparently ignored on the Nordic forums over 3 1/2 years ago!

Ooof, that reduces my hope that the various bugs I reported there will ever be fixed.

de-nordic commented 5 months ago

Ooof, that reduces my hope that the various bugs I reported there will ever be fixed.

Sorry, but we often lack capacity to cover every problem, we do what we can In open source everyone is welcome to help. It would be really nice the details you have and expertise could be used in helping in areas you have experience in.

lxbrz commented 5 months ago

Thank you for your effort and sorry for the careless comment above. I definitely try to contribute back in the areas I'm working on. Unfortunately the issues I reported in the devzone are definitely out of my working area (nrf connect vscode plugin). Sorry again for the exaggeration, maybe someone will be able to pick up the issues. Thanks!

de-nordic commented 5 months ago

Thank you for your effort and sorry for the careless comment above. I definitely try to contribute back in the areas I'm working on. Unfortunately the issues I reported in the devzone are definitely out of my working area (nrf connect vscode plugin). Sorry again for the exaggeration, maybe someone will be able to pick up the issues. Thanks!

No problem, I get paid for most of stuff I do here (it is my hobby too), so on the same note: Thanks for reporting the issues you have found, a lot of people do not do that, sometimes even if we help them find them first.

What I rather mean is that the system and number of users is growing and it is very easy to miss something. Also note that the Zephyr is just a base that is being utilized with devices that you have expertise and access we do not, and often extend what Zephyr base support in devices we have, like spi nor things, and you will may get completely different corner cases from what previous contributors have predicted or got into.

de-nordic commented 5 months ago

Btw. we have PR escalation process, however scary that sounds, here https://docs.zephyrproject.org/latest/contribute/contributor_expectations.html#pr-review-escalation that may help when you do not see things moving.

krbvroc1 commented 5 months ago

After additional testing with the associated pull request, I found that the still high current consumption is due to the spi_nrfx_spim device. Even though CONFIG_PM_DEVICE=y is set, Zephyr does not appear to ever turn off the SPI peripheral when it is not in use. The documentation related to the PM is all over the place so I have no idea the proper method to manage this. When I manually insert a call to

  const struct device *cons = DEVICE_DT_GET(DT_NODELABEL(spi2));
  pm_device_action_run(cons, PM_DEVICE_ACTION_SUSPEND);

the current consumption drops down to expected deep power down levels, I assume because the pins become tri-stated.

de-nordic commented 5 months ago

After additional testing with the associated pull request, I found that the still high current consumption is due to the spi_nrfx_spim device. Even though CONFIG_PM_DEVICE=y is set, Zephyr does not appear to ever turn off the SPI peripheral when it is not in use. The documentation related to the PM is all over the place so I have no idea the proper method to manage this. When I manually insert a call to

  const struct device *cons = DEVICE_DT_GET(DT_NODELABEL(spi2));
  pm_device_action_run(cons, PM_DEVICE_ACTION_SUSPEND);

the current consumption drops down to expected deep power down levels, I assume because the pins become tri-stated.

Yes, the SPI NOR controls the power of external device only, it does not control peripheral nor bus controller.

Can you open another issue for this?

I would like to close this issue since the PR that fixes the topic of the issue has already been merged.

krbvroc1 commented 5 months ago

I wasn't really raising a new issue above, I was documenting what I found so when other developers stumble across this same issue, there is some helpful info. Just like how I stumbled across this same issue.

de-nordic commented 5 months ago

@lxbrz Can you close the issue if you find it fixed?

lxbrz commented 5 months ago

Since #69640 was merged, this issue can be closed. Thanks!