Open Vamshigopal opened 1 month ago
I've taken all the recent fixes that went in under sound/core/memalloc.c, but the issue still occurs.
5365727b399276 (HEAD) ALSA: memalloc: Workaround for Xen PV
fe074ccf1d6035 ALSA: memalloc: don't use GFP_COMP for non-coherent dma allocations
ce7ba60e2f4f8f ALSA: memalloc: don't pass bogus GFP_ flags to dma_alloc_*
809ca3aec74894 ALSA: memalloc: Allocate more contiguous pages for fallback case
0165554146733c ALSA: memalloc: Try dma_alloc_noncontiguous() at first
2c69c6c6950659 ALSA: memalloc: Don't fall back for SG-buffer with IOMMU
7c62355c56949d ALSA: memalloc: use __GFP_RETRY_MAYFAIL for DMA mem allocs
5bb7d534b7bec1 ALSA: hda: Once again fix regression of page allocations with IOMMU
594e13a86ff750 ALSA: doc: Drop snd_dma_continuous_data() usages
fb786627247b77 ALSA: memalloc: Drop special handling of GFP for CONTINUOUS allocation
4c6fdc8ad281a0 ASoC: Intel: sst: Switch to standard device pages
a5dd134ee8a619 ALSA: pdaudiocf: Drop superfluous GFP setup
ace17309a5c63d ALSA: vx: Drop superfluous GFP setup
74f65b9821734e ALSA: memalloc: Revive x86-specific WC page allocations again
36e977f79ab0a9 ALSA: memalloc: Fix missing return value comments for kernel docs
c3ec3d3224e6cc ALSA: memalloc: Drop x86-specific hack for WC allocations
cc: @kv2019i @plbossart @bardliao @sathya-nujella
Can we try with a non-Chrome kernel to make sure this platform works first, before diving into the backport issues?
BTW this is issue number FIVE THOUSAND. I don't know if I should cry or laugh.
> Can we try with a non-Chrome kernel to make sure this platform works first, before diving in the backport issues?
I have used kernel 6.9.0-rc7 from https://chromium.googlesource.com/chromiumos/third_party/kernel/+/refs/heads/merge/continuous/chromeos-kernelupstream-6.9-rc7. It is the same as the upstream kernel, with only a few additional Chrome-specific patches to support Chrome boot. With this kernel I see the issue reproduce with the same signature:
[ 1976.217652] perf: page allocation failure: order:4, mode:0xdc0(GFP_KERNEL|GFP_ZERO), nodemask=(null),cpuset=/,mems_allowed=0
[ 1976.217668] CPU: 3 PID: 11754 Comm: perf Not tainted 6.9.0-rc7-g4157e5c9501e-dirty #1 47154153e3152498d7551147b11cdd2fdbee3ec5
[ 1976.217673] Hardware name: Dell Inc. Drallion/Drallion, BIOS Google_Drallion.12930.48.0 04/21/2020
[ 1976.217675] Call Trace:
[ 1976.217678]
Thanks @Vamshigopal, this is helpful in that it's obviously not a backport issue, but the trace does not really point to a specific audio driver doing bad things. It's not even the SOF driver used but snd-hda-intel.
It seems to be a problem with memory management, notifying @tiwai @kv2019i since that's changed a lot since initial CML Chromebooks came out.
Those are memory allocation failures of higher order (4) by other code, which implies that the system memory is highly fragmented. The only concern is whether this fragmentation was caused by memory leaks. If so, the leaks have to be fixed.
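For readers less familiar with the "order" notation in the log: an order-n request asks the page allocator for 2^n physically contiguous pages. A minimal sketch of that arithmetic, assuming 4 KiB pages (typical on x86-64, but an assumption here; the helper names are made up for illustration):

```python
# Assumption: 4 KiB base pages, as is typical on x86-64.
PAGE_SIZE = 4096


def order_to_pages(order: int) -> int:
    """Number of contiguous pages in one block of the given order."""
    return 1 << order


def order_to_bytes(order: int, page_size: int = PAGE_SIZE) -> int:
    """Size in bytes of one contiguous block of the given order."""
    return order_to_pages(order) * page_size


if __name__ == "__main__":
    for order in range(5):
        pages = order_to_pages(order)
        kib = order_to_bytes(order) // 1024
        print(f"order {order}: {pages} pages, {kib} KiB contiguous")
```

So the order:4 failure in the log means the kernel could not find 16 contiguous free pages (64 KiB with 4 KiB pages), even if plenty of smaller free blocks existed.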
@tiwai we see this issue on devices in the field, but I'm not sure of the exact environment. To reproduce it faster I'm using https://github.com/stressapptest/stressapptest https://chromium.googlesource.com/chromiumos/platform/factory/+/HEAD/py/test/pytests/stressapptest.py
Can you suggest any experiments or debug prints to narrow down the issue further?
It's no bug, per se, if it's really the result of a highly fragmented system. The allocation failure of higher order pages is no fatal error in general.
You can try to check whether there are memory leaks, e.g. by examining the actual free pages. Or try some kernel configs for debugging memory leaks.
Thanks @tiwai for the suggestions. I've added the kernel config CONFIG_DEBUG_KMEMLEAK=y to check for memory leaks, but I see this warning:
kmemleak: Memory pool empty, consider increasing CONFIG_DEBUG_KMEMLEAK_MEM_POOL_SIZE
However much we increase the pool size, we still get the same warning.
Can you please suggest any other kernel configs for memory leaks, and can you suggest how we can examine the actual free pages?
While the test is running, top shows 60-80 MB free, but I'm not sure how many free pages we have.
Also, I see CONFIG_COMPACTION is enabled. This option enables memory compaction in the kernel, which attempts to reduce fragmentation by merging smaller free blocks into larger ones.
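One possible way to look at free pages per order (rather than the total that top reports) is /proc/buddyinfo, which lists the number of free blocks of each order per zone. The following is a rough sketch, not something suggested in this thread: the function names are hypothetical, and it assumes the usual "Node N, zone NAME c0 c1 c2 ..." line layout.

```python
import os


def parse_buddyinfo(text):
    """Map (node, zone) -> list of free-block counts, indexed by order."""
    zones = {}
    for line in text.splitlines():
        parts = line.split()
        # Expected layout: "Node", "0,", "zone", "Normal", count0, count1, ...
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue
        node = parts[1].rstrip(",")
        zones[(node, parts[3])] = [int(x) for x in parts[4:]]
    return zones


def free_pages(counts):
    """Total free pages: each order-o block holds 2**o pages."""
    return sum(count << order for order, count in enumerate(counts))


def free_blocks_at_or_above(counts, order):
    """Free blocks that could satisfy a request of the given order."""
    return sum(counts[order:])


if __name__ == "__main__" and os.path.exists("/proc/buddyinfo"):
    with open("/proc/buddyinfo") as f:
        for (node, zone), counts in parse_buddyinfo(f.read()).items():
            print(f"node {node} zone {zone}: {free_pages(counts)} free pages, "
                  f"{free_blocks_at_or_above(counts, 4)} blocks of order >= 4")
```

If the total free page count stays reasonable while the count of order >= 4 blocks drops to zero during the stress run, that would point to fragmentation rather than a leak; a steadily shrinking total would be more suspicious.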
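As an experiment, compaction can also be requested manually on kernels built with CONFIG_COMPACTION by writing 1 to /proc/sys/vm/compact_memory (root required). A small sketch of that, with a hypothetical helper name; whether it helps under this particular workload is untested here:

```python
# Path of the sysctl file that exists when CONFIG_COMPACTION is enabled.
COMPACT_MEMORY_PATH = "/proc/sys/vm/compact_memory"


def trigger_compaction(path: str = COMPACT_MEMORY_PATH) -> None:
    """Ask the kernel to compact all zones by writing "1" to the sysctl file.

    Needs root on a real system. The path is a parameter so the helper
    can be pointed at an ordinary file when trying it out.
    """
    with open(path, "w") as f:
        f.write("1")


# Example (as root), e.g. right before starting audio playback:
#     trigger_compaction()
```

Comparing /proc/buddyinfo before and after the call would show whether any higher-order blocks were recovered.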
Describe the bug
On a CML Chromebook with the legacy HDA driver, when the system runs low on memory, we see page allocation failures for audio, and audio stops working. We also see a kernel crash after the page allocation failures.
To Reproduce
1. Boot the Chromebook.
2. Restrict the system memory to 4 GB.
3. Run memory-intensive workloads.
4. In parallel, run YouTube audio playback.
Environment
Kernel Branch: https://chromium.googlesource.com/chromiumos/third_party/kernel/+/refs/tags/v5.15.152
Platform: CML
Logs
dmesg.log
Screenshots or console output