Open Rashpil93 opened 6 years ago
Hi Rashpil,
Does this library mmap memory at fixed addresses? Please share /proc/$PID/maps contents at the time of crash.
Hi Dmitry, DPDK mmaps hugepage memory with MAP_PRIVATE and MAP_ANONYMOUS.
These 100100000000-100100200000 rw-s 00000000 00:11 65809 /mnt/huge/rtemap_0
are probably mapped at fixed addresses and conflict with asan/tsan expectations for virtual address space layout.
If the library allows changing the fixed address (perhaps some env var), then it can help.
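To make the maps line quoted above easier to inspect, here is a minimal sketch of a parser for one `/proc/<pid>/maps` line. The struct and function names (`map_entry`, `parse_maps_line`, `is_under_dir`) are hypothetical and not part of DPDK or the sanitizer runtimes. Paths containing spaces are cut at the first space.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One parsed line of /proc/<pid>/maps. Hypothetical helper type. */
struct map_entry {
    unsigned long long start;  /* first address of the mapping */
    unsigned long long end;    /* one past the last address */
    char perms[5];             /* e.g. "rw-s"; last char: 's' shared, 'p' private */
    char path[256];            /* backing file; empty for anonymous mappings */
};

/* Parse a single maps line. Returns 1 on success, 0 on malformed input. */
int parse_maps_line(const char *line, struct map_entry *out)
{
    unsigned long long offset, inode;
    unsigned int dev_major, dev_minor;

    assert(line != NULL && out != NULL);
    out->path[0] = '\0';
    int n = sscanf(line, "%llx-%llx %4s %llx %x:%x %llu %255s",
                   &out->start, &out->end, out->perms,
                   &offset, &dev_major, &dev_minor, &inode, out->path);
    /* The path field is optional, so 7 converted fields is still valid. */
    return n >= 7;
}

/* Returns nonzero if the mapping is backed by a file under `dir`. */
int is_under_dir(const struct map_entry *e, const char *dir)
{
    return strncmp(e->path, dir, strlen(dir)) == 0;
}
```

Applied to the line above, this yields start `0x100100000000`, end `0x100100200000` (a 2 MiB range), permissions `rw-s`, and a path under `/mnt/huge/`.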
For tsan we have expectations about address space here:
https://github.com/llvm-mirror/compiler-rt/blob/master/lib/tsan/rtl/tsan_platform.h#L31
Do we have something similar for asan? But try various addresses, something should work.
I was able to change the fixed addresses for DPDK, but my app still failed with SIGILL. :( Address space for asan: https://github.com/llvm-mirror/compiler-rt/blob/master/lib/asan/asan_allocator.h#L124
What address and tool did you use?
Unless somebody else sees the problem from the provided info, a reproducer would be useful.
In DPDK, you can change the address using the --base-virtaddr option in EAL. I used addresses from 0x200000000000 to 0x700000000000.
If I add the no_sanitize_address attribute, the application still fails with SIGILL.
Address space for asan: https://github.com/google/sanitizers/wiki/AddressSanitizerAlgorithm#64-bit
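As a rough aid for checking candidate addresses, the 64-bit mapping on that page can be sketched in C. The scale, offset and region bounds are copied from the verbose ASan output in this report (`SHADOW_SCALE: 3`, `SHADOW_OFFSET: 7fff8000`); other platforms or builds may use a different layout. The names `mem_to_shadow` and `asan_region_of` are hypothetical, not part of the ASan runtime.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from this report's verbose output; assumed, not universal. */
#define SHADOW_SCALE  3
#define SHADOW_OFFSET 0x7fff8000ULL
#define USER_MAX      0x7fffffffffffULL

enum asan_region {
    LOW_MEM, LOW_SHADOW, SHADOW_GAP, HIGH_SHADOW, HIGH_MEM, OUT_OF_RANGE
};

/* Shadow address for an application address: (addr >> scale) + offset. */
uint64_t mem_to_shadow(uint64_t addr)
{
    assert(addr <= USER_MAX);
    return (addr >> SHADOW_SCALE) + SHADOW_OFFSET;
}

/* Classify an address against the region table printed in the log. */
enum asan_region asan_region_of(uint64_t addr)
{
    if (addr <= 0x00007fff7fffULL) return LOW_MEM;
    if (addr <= 0x00008fff6fffULL) return LOW_SHADOW;
    if (addr <= 0x02008fff6fffULL) return SHADOW_GAP;
    if (addr <= 0x10007fff7fffULL) return HIGH_SHADOW;
    if (addr <= USER_MAX)          return HIGH_MEM;
    return OUT_OF_RANGE;
}
```

Under this layout, `0x100100000000` (the hugepage mapping shown earlier) and both ends of the `0x200000000000` to `0x700000000000` range classify as HighMem, which is application memory from ASan's point of view. This only checks the layout; it does not by itself explain the SIGILL.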
Perhaps there are other solutions to this problem?
It's still unclear to me why changing address does not work. A reproducer would help sanitizer developers to understand the problem better and hopefully propose some solution.
Hi. Do you need any debug output, or an application that reproduces the error?
After enabling ASan or TSan, my app fails with SIGILL. The app uses DPDK with hugepages.
After testing, it turned out that the application fails with SIGILL when it accesses memory allocated in the mbuf pool. If the value of arp_hdr->arp_data.arp_tip is first assigned to a stack variable and a pointer to that variable is passed to ip_format_addr, the application works.
Why does this happen? How can I solve this problem?
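The workaround described above, copying the field to a stack variable before passing its address, can be sketched without DPDK as follows. `struct demo_arp_data` is a hypothetical packed stand-in, not DPDK's `struct arp_hdr`, and `format_tip_via_stack_copy` is an illustrative name. The sketch shows only the shape of the workaround and makes no claim about the cause of the SIGILL.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the ARP payload; NOT DPDK's definition. */
struct demo_arp_data {
    uint8_t  sha[6];
    uint32_t sip;
    uint8_t  tha[6];
    uint32_t tip;
} __attribute__((packed));

/* Same formatting as the report's ip_format_addr, using %u for unsigned. */
void ip_format_addr(char *buf, size_t size, const uint32_t *ip)
{
    assert(buf != NULL && ip != NULL);
    snprintf(buf, size, "%u.%u.%u.%u",
             (unsigned)(*ip & 0xff),
             (unsigned)((*ip >> 8) & 0xff),
             (unsigned)((*ip >> 16) & 0xff),
             (unsigned)((*ip >> 24) & 0xff));
}

/* Copy the field into a local first, then pass the local's address. */
void format_tip_via_stack_copy(char *buf, size_t size,
                               const struct demo_arp_data *d)
{
    uint32_t tip;                        /* lives on the caller-side stack */
    memcpy(&tip, &d->tip, sizeof(tip));  /* byte copy out of the packed struct */
    ip_format_addr(buf, size, &tip);
}
```

With `tip` set to 1551734832, the value used in the app code, the formatted string is `48.152.125.92`, matching the `ip addr` lines in the debug dump.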
Core dump
```
Failed to read a valid object file image from memory.
Core was generated by helloworld -l 0-3 -n 1.
Program terminated with signal SIGILL, Illegal instruction.
#0 0x000000000044611b in ip_format_addr ( buf=
```
Debug dump
```
ASAN_OPTIONS=verbosity=1::disable_coredump=0::unmap_shadow_on_exit=1 helloworld -l 0-3 -n 1
==390==Parsed ASAN_OPTIONS: verbosity=1::disable_coredump=0::unmap_shadow_on_exit=1
==390==AddressSanitizer: failed to intercept '__isoc99_printf'
==390==AddressSanitizer: failed to intercept '__isoc99_sprintf'
==390==AddressSanitizer: failed to intercept '__isoc99_snprintf'
==390==AddressSanitizer: failed to intercept '__isoc99_fprintf'
==390==AddressSanitizer: failed to intercept '__isoc99_vprintf'
==390==AddressSanitizer: failed to intercept '__isoc99_vsprintf'
==390==AddressSanitizer: failed to intercept '__isoc99_vsnprintf'
==390==AddressSanitizer: failed to intercept '__isoc99_vfprintf'
==390==AddressSanitizer: failed to intercept 'memcmp'
==390==AddressSanitizer: libc interceptors initialized
|| `[0x10007fff8000, 0x7fffffffffff]` || HighMem ||
|| `[0x02008fff7000, 0x10007fff7fff]` || HighShadow ||
|| `[0x00008fff7000, 0x02008fff6fff]` || ShadowGap ||
|| `[0x00007fff8000, 0x00008fff6fff]` || LowShadow ||
|| `[0x000000000000, 0x00007fff7fff]` || LowMem ||
MemToShadow(shadow): 0x00008fff7000 0x000091ff6dff 0x004091ff6e00 0x02008fff6fff
redzone=16
max_redzone=2048
quarantine_size=256M
malloc_context_size=30
SHADOW_SCALE: 3
SHADOW_GRANULARITY: 8
SHADOW_OFFSET: 7fff8000
==390==Installed the sigaction for signal 11
==390==T0: stack [0x7ffe6d5f5000,0x7ffe6ddf5000) size 0x800000; local=0x7ffe6ddf439c
==390==AddressSanitizer Init done
EAL: Detected 4 lcore(s)
EAL: Probing VFIO support...
==390==T4: stack [0x7f396dde2000,0x7f396e5e1dc0) size 0x7ffdc0; local=0x7f396e5e1cec
==390==T3: stack [0x7f396e5e3000,0x7f396ede2dc0) size 0x7ffdc0; local=0x7f396ede2cec
==390==T2: stack [0x7f396ede4000,0x7f396f5e3dc0) size 0x7ffdc0; local=0x7f396f5e3cec
EAL: PCI device 0000:01:00.0 on NUMA socket -1
EAL: Invalid NUMA socket, default to 0
EAL: probe driver: 8086:1572 net_i40e
PMD: Global register is changed during enable FDIR flexible payload
PMD: Global register is changed during support QinQ parser
PMD: Global register is changed during configure hash input set
PMD: Global register is changed during configure fdir mask
PMD: Global register is changed during configure hash mask
==390==T1: stack [0x7f396f5e5000,0x7f396fde4dc0) size 0x7ffdc0; local=0x7f396fde4cec
PMD: Global register is changed during support QinQ cloud filter
PMD: Global register is changed during support TPID configuration
EAL: PCI device 0000:01:00.1 on NUMA socket -1
EAL: Invalid NUMA socket, default to 0
EAL: probe driver: 8086:1572 net_i40e
PMD: Global register is changed during enable FDIR flexible payload
PMD: Global register is changed during support QinQ parser
PMD: Global register is changed during configure hash input set
PMD: Global register is changed during configure fdir mask
PMD: Global register is changed during configure hash mask
PMD: Global register is changed during support QinQ cloud filter
PMD: Global register is changed during support TPID configuration
EAL: PCI device 0000:01:00.2 on NUMA socket -1
EAL: Invalid NUMA socket, default to 0
EAL: probe driver: 8086:1572 net_i40e
PMD: Global register is changed during enable FDIR flexible payload
PMD: Global register is changed during support QinQ parser
PMD: Global register is changed during configure hash input set
PMD: Global register is changed during configure fdir mask
PMD: Global register is changed during configure hash mask
PMD: Global register is changed during support QinQ cloud filter
PMD: Global register is changed during support TPID configuration
EAL: PCI device 0000:01:00.3 on NUMA socket -1
EAL: Invalid NUMA socket, default to 0
EAL: probe driver: 8086:1572 net_i40e
PMD: Global register is changed during enable FDIR flexible payload
PMD: Global register is changed during support QinQ parser
PMD: Global register is changed during configure hash input set
PMD: Global register is changed during configure fdir mask
PMD: Global register is changed during configure hash mask
PMD: Global register is changed during support QinQ cloud filter
PMD: Global register is changed during support TPID configuration
hello from core 1
ip addr 48.152.125.92
hello from core 2
ip addr 48.152.125.92
hello from core 3
ip addr 48.152.125.92
hello from core 0
ip addr 48.152.125.92
Illegal instruction
```
App code
```c
static struct rte_mempool *mbuf_pool;

void __attribute__ ((noinline))
ip_format_addr(char *buf, uint16_t size, uint32_t *ip)
{
    snprintf(buf, size, "%d.%d.%d.%d",
             ((uint32_t)(*ip & 0xff)),
             ((uint32_t)(*ip & 0x0000ff00) >> 8),
             ((uint32_t)(*ip & 0x00ff0000) >> 16),
             ((uint32_t)(*ip & 0xff000000) >> 24)
    );
}

static int
lcore_hello(__attribute__((unused)) void *arg)
{
    unsigned lcore_id;
    uint32_t * ip;
    char buf[32];
    struct rte_mbuf *created_pkt;
    struct ether_hdr *eth_hdr;
    struct arp_hdr *arp_hdr;
    size_t pkt_size;

    lcore_id = rte_lcore_id();
    printf("hello from core %u\n", lcore_id);

    created_pkt = rte_pktmbuf_alloc(mbuf_pool);
    if (created_pkt == NULL) {
        printf("Failed to allocate mbuf\n");
        return -1;
    }

    pkt_size = sizeof(struct ether_hdr) + sizeof(struct arp_hdr);
    created_pkt->data_len = pkt_size;
    created_pkt->pkt_len = pkt_size;

    eth_hdr = rte_pktmbuf_mtod(created_pkt, struct ether_hdr *);
    eth_hdr->ether_type = rte_cpu_to_be_16(ETHER_TYPE_ARP);

    arp_hdr = (struct arp_hdr *)((char *)eth_hdr + sizeof(struct ether_hdr));
    arp_hdr->arp_hrd = rte_cpu_to_be_16(ARP_HRD_ETHER);
    arp_hdr->arp_pro = rte_cpu_to_be_16(ETHER_TYPE_IPv4);
    arp_hdr->arp_hln = ETHER_ADDR_LEN;
    arp_hdr->arp_pln = sizeof(uint32_t);
    arp_hdr->arp_op = rte_cpu_to_be_16(ARP_OP_REQUEST);
    arp_hdr->arp_data.arp_sip = 1551734841;
    memset(&arp_hdr->arp_data.arp_tha, 0, ETHER_ADDR_LEN);
    arp_hdr->arp_data.arp_tip = 1551734842;

    ip = rte_malloc(NULL, sizeof(uint32_t), 0);
    if (ip == NULL) {
        printf("Failed to allocate ip\n");
        return -1;
    }
    *ip = 1551734832;

    ip_format_addr(buf, 32, ip);
    printf("ip addr %s\n", buf);

    ip_format_addr(buf, 32, &arp_hdr->arp_data.arp_tip);
    printf("ip addr tip %s\n", buf);

    ip_format_addr(buf, 32, &arp_hdr->arp_data.arp_sip);
    printf("ip addr sip %s\n", buf);

    return 0;
}

int
main(int argc, char **argv)
{
    int ret;
    unsigned lcore_id;

    ret = rte_eal_init(argc, argv);
    if (ret < 0)
        rte_panic("Cannot init EAL\n");

    mbuf_pool = rte_pktmbuf_pool_create("MBUF_POOL", NB_MBUF, 32, 0,
                                        RTE_MBUF_DEFAULT_BUF_SIZE,
                                        rte_socket_id());

    /* call lcore_hello() on every slave lcore */
    RTE_LCORE_FOREACH_SLAVE(lcore_id) {
        rte_eal_remote_launch(lcore_hello, NULL, lcore_id);
    }

    /* call it on master lcore too */
    lcore_hello(NULL);

    rte_eal_mp_wait_lcore();
    return 0;
}
```