guanchar closed this 8 years ago
Analysis:
[submit_iso_transfer] submiturb failed error -1 errno=12 tells us that ENOMEM is passed up from the kernel. This time the cause is "page allocation failure: order:7" in proc_do_submiturb, which requested a large 512KB memory block when there were none left (0*512kB).
[ 1417.271361] Node 0 DMA32: 243*4kB (UEM) 62*8kB (UEM) 195*16kB (UEM) 120*32kB (UEM) 47*64kB (UE) 36*128kB (UEM) 16*256kB (E) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 20140kB
[ 1417.271381] Node 0 Normal: 44*4kB (UEM) 138*8kB (UE) 101*16kB (UEM) 52*32kB (UEM) 18*64kB (UEM) 5*128kB (UEM) 1*256kB (U) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 6608kB
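To make the free-list dump above easier to read, here is a minimal sketch that parses one zone line and reports the largest block size that still has free entries. The function name parse_free_blocks is made up for illustration; only the "count*sizekB" layout is taken from the log.

```python
import re

def parse_free_blocks(line):
    """Map block size in kB to the number of free blocks of that size."""
    # Matches entries such as "44*4kB"; the trailing "= 6608kB" total has no
    # "*" and is therefore skipped.
    return {int(size): int(count)
            for count, size in re.findall(r"(\d+)\*(\d+)kB", line)}

# The "Node 0 Normal" line from the dmesg output above.
normal = ("Node 0 Normal: 44*4kB (UEM) 138*8kB (UE) 101*16kB (UEM) "
          "52*32kB (UEM) 18*64kB (UEM) 5*128kB (UEM) 1*256kB (U) "
          "0*512kB 0*1024kB 0*2048kB 0*4096kB = 6608kB")

blocks = parse_free_blocks(normal)
total_kb = sum(size * count for size, count in blocks.items())
largest_free_kb = max(size for size, count in blocks.items() if count > 0)

print("free memory in zone:", total_kb, "kB")
print("largest free block:", largest_free_kb, "kB")
print("free 512kB blocks:", blocks[512])
```

The point is that the zone still has several MB free in total, but nothing at 512kB or above, so an order:7 request cannot be satisfied without compaction or reclaim.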
Each iso transfer needs a contiguous block of kernel memory. Our transfers request 8 packets of 0x8400 bytes, which is 264KB, and the kernel rounds that up to the next power-of-two block, hence the order:7 (512KB) request. Allocating such a large block can fail at any time, which means libusb_submit_transfer() can fail at any time for this reason, and once it fails, it is very likely to fail again. This is unavoidable as long as the kernel allocates memory on each transfer. It can be mitigated to some extent by using smaller transfers, but 8 packets per transfer is already small.
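The arithmetic behind those numbers can be checked with a short sketch. It assumes 4 KiB pages and that the buffer is served by a single power-of-two page allocation, which is what the order:7 in the log suggests; allocation_order is a hypothetical helper name, not a kernel or libusb function.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, the usual x86-64 default

def allocation_order(nbytes, page_size=PAGE_SIZE):
    """Smallest order such that 2**order pages hold nbytes."""
    pages = -(-nbytes // page_size)  # ceiling division
    order = 0
    while (1 << order) < pages:
        order += 1
    return order

def block_kib(order, page_size=PAGE_SIZE):
    """Size in KiB of a block of the given order."""
    return (page_size << order) // 1024

transfer_bytes = 8 * 0x8400            # 8 packets of 0x8400 bytes each
order = allocation_order(transfer_bytes)

print(transfer_bytes // 1024, "KiB requested")
print("order", order, "=", block_kib(order), "KiB contiguous")

# Under the same assumptions, halving to 4 packets only drops one order.
half_order = allocation_order(4 * 0x8400)
print("4 packets: order", half_order, "=", block_kib(half_order), "KiB")
```

Under these assumptions, 264 KiB needs 66 pages and therefore a 128-page block, which is why almost half of each allocation is padding.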
The same issue has been reported on the linux-usb mailing list: http://thread.gmane.org/gmane.linux.usb.general/132939. There it was proposed to introduce a zerocopy IO mechanism in usbfs, which removes the need to allocate memory on each new transfer: http://thread.gmane.org/gmane.linux.usb.general/133549. A patch has been posted: http://thread.gmane.org/gmane.linux.usb.general/134707.
For 0.1, we can document some mitigation measures, but it's too late to integrate this feature.
0.2 can probably use this zero-copy feature to implement a much more efficient transfer pool, also discussed in #447.
I've been using cat /proc/buddyinfo to monitor free blocks.
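For anyone who wants to watch this automatically, here is a rough sketch that parses /proc/buddyinfo text and counts free blocks at or above a given order. The function names are made up, and the sample text is reconstructed from the dmesg dump earlier in this thread for illustration; it is not a captured buddyinfo reading.

```python
def parse_buddyinfo(text):
    """Return {(node, zone): [free block count for order 0, 1, 2, ...]}."""
    zones = {}
    for line in text.splitlines():
        parts = line.split()
        # Expected layout: "Node 0, zone Normal 44 138 101 ..."
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue
        node = int(parts[1].rstrip(","))
        zones[(node, parts[3])] = [int(n) for n in parts[4:]]
    return zones

def free_at_or_above(counts, order):
    """Number of free blocks that could satisfy a request of this order."""
    return sum(counts[order:])

# Illustrative sample only, rebuilt from the free-list dump in dmesg.
sample = """\
Node 0, zone    DMA32    243     62    195    120     47     36     16      0      0      0      0
Node 0, zone   Normal     44    138    101     52     18      5      1      0      0      0      0
"""

zones = parse_buddyinfo(sample)
for (node, zone), counts in zones.items():
    print(node, zone, "order>=7 free blocks:", free_at_or_above(counts, 7))
```

On a live system the same functions can be fed with open("/proc/buddyinfo").read() in a loop, to see how quickly the order 7 and higher columns drain while the streams are running.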
I've attempted the following mitigations, to little or no effect:
echo 1 > /proc/sys/vm/zone_reclaim_mode
echo 1 > /proc/sys/vm/compact_memory
For reference, my system has 4 GB of RAM.
Potential mitigation strategy: increasing min_free_kbytes seems to preserve enough higher order blocks to significantly delay memory allocation failure. My system is able to run 20+ minutes with this "fix".
echo 65536 > /proc/sys/vm/min_free_kbytes
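If you want to apply this from a startup script rather than by hand, something like the sketch below might do. It only raises the value and never lowers it. raise_min_free_kbytes is a hypothetical helper, writing to the real /proc path needs root, and the demo at the bottom runs against a temporary file with a made-up starting value so it can be tried safely.

```python
import os
import tempfile

def raise_min_free_kbytes(target_kb, path="/proc/sys/vm/min_free_kbytes"):
    """Raise the setting to target_kb if it is lower; return the value in effect."""
    with open(path) as f:
        current = int(f.read().strip())
    if current >= target_kb:
        return current  # already high enough, leave it alone
    with open(path, "w") as f:
        f.write(f"{target_kb}\n")
    return target_kb

# Demo against a temporary file instead of /proc (no root needed).
with tempfile.TemporaryDirectory() as tmp:
    fake = os.path.join(tmp, "min_free_kbytes")
    with open(fake, "w") as f:
        f.write("45056\n")  # made-up starting value for the demo
    first = raise_min_free_kbytes(65536, path=fake)   # raises the value
    second = raise_min_free_kbytes(16384, path=fake)  # leaves it unchanged
    with open(fake) as f:
        stored = int(f.read())

print(first, second, stored)
```

Like the echo command, this does not persist across reboots, and it only delays the allocation failure rather than removing the cause.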
Thanks to @xlz for the fix!
Hi all, we are using libfreenect2 as part of IAI_kinect2 to run RTAB-map. However, after ten minutes of normal operation, we encounter the following error (LIBUSB_DEBUG=3, from iai_kinect2 console), and the depth stream stops.
However, this bug does not occur when running only Protonect (tested for ~1 hour) and also does not occur when running IAI_kinect2 alone.
We have reproduced this on both the 4.2 and 4.3.3 kernels, running Ubuntu 14.04 with ROS Indigo.
dmesg output:
lspci -nn output:
Thanks to @xlz for helping so far. Any additional assistance would be greatly appreciated!