open-power-host-os / qemu

OpenPOWER Host OS qemu repository

qemu-io crashes with SIGABRT and Assertion `c->entries[i].offset != 0' failed #25

Closed nasastry closed 6 years ago

nasastry commented 6 years ago
Mirrored with LTC bug https://bugzilla.linux.ibm.com/show_bug.cgi?id=160749

Reproduction steps:

1. Copy the attached files `backing_img.file.txt` and `test.img.txt` to a directory.
2. Rename them:

```
mv backing_img.file.txt backing_img.file
mv test.img.txt test.img
```

P.S. Files with the extensions `.file` and `.img` could not be attached here, so they were renamed to `.txt`.

3. Customize the following command to point to the above directory and run it:

`/usr/bin/qemu-io /test.img -c "write 1352192 1707520"`

Output of the above command:

```
qemu-io: block/qcow2-cache.c:411: qcow2_cache_entry_mark_dirty: Assertion `c->entries[i].offset != 0' failed.
Aborted (core dumped)
```

From gdb:

```
(gdb) bt
#0  0x00007fff9d29eff0 in raise () from /lib64/libc.so.6
#1  0x00007fff9d2a136c in abort () from /lib64/libc.so.6
#2  0x00007fff9d294c44 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007fff9d294d34 in __assert_fail () from /lib64/libc.so.6
#4  0x000000013f520ca8 in qcow2_cache_entry_mark_dirty (bs=, c=, table=) at block/qcow2-cache.c:411
#5  0x000000013f5161a0 in alloc_refcount_block (refcount_block=0x7fff9996f798, cluster_index=2, bs=0x153a8e7a0) at block/qcow2-refcount.c:416
#6  update_refcount (bs=0x153a8e7a0, offset=1048576, length=, addend=1, decrease=false, type=QCOW2_DISCARD_NEVER) at block/qcow2-refcount.c:833
#7  0x000000013f516b90 in qcow2_alloc_clusters_at (bs=0x153a8e7a0, offset=1048576, nb_clusters=1) at block/qcow2-refcount.c:1015
#8  0x000000013f51cda4 in do_alloc_cluster_offset (nb_clusters=, host_offset=, guest_offset=2097152, bs=0x153a8e7a0) at block/qcow2-cluster.c:1173
#9  handle_alloc (m=, bytes=, host_offset=, guest_offset=2097152, bs=0x153a8e7a0) at block/qcow2-cluster.c:1276
#10 qcow2_alloc_cluster_offset (bs=0x153a8e7a0, offset=1572864, bytes=0x7fff9996faac, host_offset=0x7fff9996fab0, m=0x7fff9996fab8) at block/qcow2-cluster.c:1463
#11 0x000000013f50d588 in qcow2_co_pwritev (bs=0x153a8e7a0, offset=1572864, bytes=1486848, qiov=0x7fffea17ab20, flags=) at block/qcow2.c:1933
#12 0x000000013f54ac20 in bdrv_driver_pwritev (bs=, offset=, bytes=, qiov=, flags=16) at block/io.c:877
#13 0x000000013f54c580 in bdrv_aligned_pwritev (req=0x7fff9996fd48, offset=1352192, bytes=1707520, align=1, qiov=0x7fffea17ab20, flags=, child=0x153a9ff20, child=0x153a9ff20) at block/io.c:1382
#14 0x000000013f54d46c in bdrv_co_pwritev (child=0x153a9ff20, offset=1352192, bytes=, qiov=0x7fffea17ab20, flags=) at block/io.c:1633
#15 0x000000013f537808 in blk_co_pwritev (blk=0x153a7e4a0, offset=1352192, bytes=, qiov=0x7fffea17ab20, flags=BDRV_REQ_FUA) at block/block-backend.c:1083
#16 0x000000013f537944 in blk_write_entry (opaque=0x7fffea17ab38) at block/block-backend.c:1108
#17 0x000000013f60bc38 in coroutine_trampoline (i0=, i1=) at util/coroutine-ucontext.c:79
#18 0x00007fff9d2b2b9c in makecontext () from /lib64/libc.so.6
#19 0x0000000000000000 in ?? ()
(gdb) bt full
#0  0x00007fff9d29eff0 in raise () from /lib64/libc.so.6
No symbol table info available.
#1  0x00007fff9d2a136c in abort () from /lib64/libc.so.6
No symbol table info available.
#2  0x00007fff9d294c44 in __assert_fail_base () from /lib64/libc.so.6
No symbol table info available.
#3  0x00007fff9d294d34 in __assert_fail () from /lib64/libc.so.6
No symbol table info available.
#4  0x000000013f520ca8 in qcow2_cache_entry_mark_dirty (bs=, c=, table=) at block/qcow2-cache.c:411
        i =
        __PRETTY_FUNCTION__ = "qcow2_cache_entry_mark_dirty"
#5  0x000000013f5161a0 in alloc_refcount_block (refcount_block=0x7fff9996f798, cluster_index=2, bs=0x153a8e7a0) at block/qcow2-refcount.c:416
        s = 0x153a9aa90
        refcount_table_index = 0
        ret = 0
        meta_offset =
        new_block = 0
        blocks_used =
#6  update_refcount (bs=0x153a8e7a0, offset=1048576, length=, addend=1, decrease=false, type=QCOW2_DISCARD_NEVER) at block/qcow2-refcount.c:833
        block_index =
        refcount =
        cluster_index = 2
        table_index = 0
        s = 0x153a9aa90
        start =
        last = 1048576
        cluster_offset = 1048576
        refcount_block = 0x7fff9af20000
        old_table_index =
        ret =
#7  0x000000013f516b90 in qcow2_alloc_clusters_at (bs=0x153a8e7a0, offset=1048576, nb_clusters=1) at block/qcow2-refcount.c:1015
        s = 0x153a9aa90
        cluster_index =
        refcount = 0
        i = 1
        ret =
        __PRETTY_FUNCTION__ = "qcow2_alloc_clusters_at"
#8  0x000000013f51cda4 in do_alloc_cluster_offset (nb_clusters=, host_offset=, guest_offset=2097152, bs=0x153a8e7a0) at block/qcow2-cluster.c:1173
        ret =
        s = 0x153a9aa90
#9  handle_alloc (m=, bytes=, host_offset=, guest_offset=2097152, bs=0x153a8e7a0) at block/qcow2-cluster.c:1276
        l2_index = 4
        keep_old_clusters = false
        requested_bytes =
        avail_bytes =
        nb_bytes =
        entry =
        ret =
---Type to continue, or q to quit---
        old_m =
        s = 0x153a9aa90
        l2_table = 0x0
        nb_clusters = 1
        alloc_cluster_offset = 1048576
#10 qcow2_alloc_cluster_offset (bs=0x153a8e7a0, offset=1572864, bytes=0x7fff9996faac, host_offset=0x7fff9996fab0, m=0x7fff9996fab8) at block/qcow2-cluster.c:1463
        s = 0x153a9aa90
        start = 2097152
        remaining =
        cluster_offset =
        cur_bytes = 962560
        __PRETTY_FUNCTION__ = "qcow2_alloc_cluster_offset"
#11 0x000000013f50d588 in qcow2_co_pwritev (bs=0x153a8e7a0, offset=1572864, bytes=1486848, qiov=0x7fffea17ab20, flags=) at block/qcow2.c:1933
        s = 0x153a9aa90
        offset_in_cluster = 0
        ret =
        cur_bytes = 1486848
        cluster_offset = 524288
        hd_qiov = {iov = 0x153a57d50, niov = 1, nalloc = 1, size = 220672}
        bytes_done = 220672
        cluster_data = 0x0
        l2meta = 0x0
        __PRETTY_FUNCTION__ = "qcow2_co_pwritev"
#12 0x000000013f54ac20 in bdrv_driver_pwritev (bs=, offset=, bytes=, qiov=, flags=16) at block/io.c:877
        drv =
        sector_num =
        nb_sectors =
        ret =
        __PRETTY_FUNCTION__ = "bdrv_driver_pwritev"
#13 0x000000013f54c580 in bdrv_aligned_pwritev (req=0x7fff9996fd48, offset=1352192, bytes=1707520, align=1, qiov=0x7fffea17ab20, flags=, child=0x153a9ff20, child=0x153a9ff20) at block/io.c:1382
        bs = 0x153a8e7a0
        drv = 0x13f692a10
        waited =
        ret =
        start_sector = 2641
        end_sector = 5976
        bytes_remaining = 1707520
        max_transfer =
#14 0x000000013f54d46c in bdrv_co_pwritev (child=0x153a9ff20, offset=1352192, bytes=, qiov=0x7fffea17ab20, flags=) at block/io.c:1633
        bs = 0x153a8e7a0
        req = {bs = 0x153a8e7a0, offset = 1352192, bytes = 1707520, type = BDRV_TRACKED_WRITE, serialising = false, overlap_offset = 1352192, overlap_bytes = 1707520, list = {le_next = 0x0, le_prev = 0x153a91a18}, co = 0x153a9f6b0, wait_queue = {entries = {sqh_first = 0x0, sqh_last = 0x7fff9996fd90}}, waiting_for = 0x0}
        align = 1
        head_buf = 0x0
        tail_buf =
---Type to continue, or q to quit---
        local_qiov = {iov = 0x0, niov = 1062110004, nalloc = 1, size = 0}
        use_local_qiov =
        ret =
        __PRETTY_FUNCTION__ = "bdrv_co_pwritev"
#15 0x000000013f537808 in blk_co_pwritev (blk=0x153a7e4a0, offset=1352192, bytes=, qiov=0x7fffea17ab20, flags=BDRV_REQ_FUA) at block/block-backend.c:1083
        ret =
        bs = 0x153a8e7a0
#16 0x000000013f537944 in blk_write_entry (opaque=0x7fffea17ab38) at block/block-backend.c:1108
        rwco = 0x7fffea17ab38
#17 0x000000013f60bc38 in coroutine_trampoline (i0=, i1=) at util/coroutine-ucontext.c:79
        arg = {p = 0x153a9f6b0, i = {1403647664, 1}}
        self = 0x153a9f6b0
        co = 0x153a9f6b0
#18 0x00007fff9d2b2b9c in makecontext () from /lib64/libc.so.6
No symbol table info available.
#19 0x0000000000000000 in ?? ()
No symbol table info available.
```

Qemu version: `qemu-2.10.0-2.rel.gitc334a4e.el7.centos.ppc64le`

Will attach the required image files.

[backing_img.file.txt](https://github.com/open-power-host-os/qemu/files/1426609/backing_img.file.txt)
[test.img.txt](https://github.com/open-power-host-os/qemu/files/1426610/test.img.txt)

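The numbers in the backtrace can be cross-checked against the `write 1352192 1707520` request. The following is only a rough sketch: every constant is copied from the gdb output, except the cluster size, which is an assumption inferred from `cluster_index=2` at `offset=1048576` in frame #6 and is not stated anywhere in the report. The function name `summarize` is made up for this illustration.

```python
# Values copied from the gdb backtrace in this report.
REQ_OFFSET = 1352192       # write offset (frames #13 to #15)
REQ_BYTES = 1707520        # write length (frame #13)
BYTES_DONE = 220672        # bytes_done in qcow2_co_pwritev (frame #11)
QCOW2_OFFSET = 1572864     # offset in qcow2_co_pwritev (frame #11)
QCOW2_BYTES = 1486848      # bytes in qcow2_co_pwritev (frame #11)
ALLOC_START = 2097152      # start in qcow2_alloc_cluster_offset (frame #10)
ALLOC_CUR_BYTES = 962560   # cur_bytes in frame #10
L2_INDEX = 4               # l2_index in handle_alloc (frame #9)

# Assumption, not stated in the report: frame #6 shows cluster_index=2 for
# offset=1048576, which suggests 512 KiB clusters.
CLUSTER_SIZE = 1048576 // 2


def summarize():
    """Derive a few values that can be compared with the backtrace."""
    return {
        "cluster_size": CLUSTER_SIZE,
        # Where the request stands after the first chunk was written.
        "offset_after_first_chunk": REQ_OFFSET + BYTES_DONE,
        "bytes_after_first_chunk": REQ_BYTES - BYTES_DONE,
        # Guest cluster being allocated when the assertion fired.
        "failing_guest_cluster": ALLOC_START // CLUSTER_SIZE,
        # Bytes still to allocate once the preceding cluster is skipped.
        "bytes_left_at_failure": QCOW2_BYTES - (ALLOC_START - QCOW2_OFFSET),
    }


for name, value in summarize().items():
    print(f"{name} = {value}")
```

Under that assumption, the derived values line up with `offset=1572864`, `bytes=1486848`, `l2_index = 4` and `cur_bytes = 962560` in the backtrace, which suggests the abort happened while allocating the guest cluster that starts at 2097152.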
nasastry commented 6 years ago

These image files were created by the tests/image_fuzzer code from the QEMU source tree.

cdeadmin commented 6 years ago

------- Comment From muriloo@br.ibm.com 2017-12-27 07:51:15 EDT-------

I can't reproduce this bug with HostOS QEMU 2.11.0 (commit e7153e020f) from branch hostos-release. The qemu-io program terminates gracefully:

$ ./qemu-io test.img -c "write 1352192 1707520"
can't open device test.img: Image does not contain a reference count table

Apparently, this was fixed by:

commit 951053a9ec1c47edf4b2549ef58d82aee8a42a7f
Author: Alberto Garcia <berto@igalia.com>
Date:   Fri Nov 3 16:18:53 2017 +0200

    qcow2: Don't open images with header.refcount_table_clusters == 0

This commit checks whether `refcount_table_clusters` in the image header is 0 and, if so, refuses to open the image. The program therefore exits before qcow2_cache_entry_mark_dirty() is called and reaches assert(c->entries[i].offset != 0), which is the assertion that caused the abort (SIGABRT) reported here:

Thread 1 "qemu-io" hit Breakpoint 1, qcow2_cache_entry_mark_dirty (bs=0x10680ee0, c=0x10662350, table=0x7ffff5f70000) at /home/muriloo/hostos/qemu/block/qcow2-cache.c:410
410     int i = qcow2_cache_get_table_idx(bs, c, table);
(gdb) n
411     assert(c->entries[i].offset != 0);
(gdb) p i
$1 = 0
(gdb) p c->entries[i].offset != 0
$2 = 0
(gdb) n
qemu-io: /home/muriloo/hostos/qemu/block/qcow2-cache.c:411: qcow2_cache_entry_mark_dirty: Assertion `c->entries[i].offset != 0' failed.

Thread 1 "qemu-io" received signal SIGABRT, Aborted.
0x00007ffff79b91d8 in raise () from /lib64/libc.so.6

The commit 951053a9ec is present in both hostos-devel and -release branches.

Cheers Murilo
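The header condition described in the comment above can also be inspected outside QEMU. This is a minimal sketch, assuming the standard qcow2 header layout (big-endian fields, with `refcount_table_clusters` as a 32-bit value at byte offset 56). The helper names `parse_qcow2_header`, `has_refcount_table` and `make_header` are hypothetical, and the demo uses a synthetic header rather than the attached `test.img`, whose contents are not shown in this report.

```python
import struct

QCOW2_MAGIC = b"QFI\xfb"
# First 72 bytes of a qcow2 header (the part shared by versions 2 and 3).
HEADER_FMT = ">4sIQIIQIIQQIIQ"
HEADER_FIELDS = (
    "magic", "version", "backing_file_offset", "backing_file_size",
    "cluster_bits", "size", "crypt_method", "l1_size", "l1_table_offset",
    "refcount_table_offset", "refcount_table_clusters", "nb_snapshots",
    "snapshots_offset",
)
HEADER_LEN = struct.calcsize(HEADER_FMT)


def parse_qcow2_header(data):
    """Decode the fixed part of a qcow2 header into a dict."""
    if len(data) < HEADER_LEN:
        raise ValueError("need at least %d bytes" % HEADER_LEN)
    header = dict(zip(HEADER_FIELDS, struct.unpack_from(HEADER_FMT, data)))
    if header["magic"] != QCOW2_MAGIC:
        raise ValueError("not a qcow2 image")
    return header


def has_refcount_table(header):
    """False when the header claims a refcount table of zero clusters."""
    return header["refcount_table_clusters"] != 0


def make_header(refcount_table_clusters):
    """Build a synthetic header for the demo; all values are made up."""
    return struct.pack(
        HEADER_FMT, QCOW2_MAGIC, 3, 0, 0, 19, 4 << 20, 0, 1,
        3 << 19, 1 << 19, refcount_table_clusters, 0, 0,
    )


for clusters in (1, 0):
    header = parse_qcow2_header(make_header(clusters))
    print(clusters, "->", "ok" if has_refcount_table(header) else "no refcount table")
```

To apply it to a real image, read the first 72 bytes of the file in binary mode and pass them to `parse_qcow2_header`. This only mirrors the condition named in the commit subject; it is not QEMU's own validation code.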

cdeadmin commented 6 years ago

------- Comment From muriloo@br.ibm.com 2017-12-27 14:13:03 EDT-------

The actual commit that fixes this issue is:

https://git.qemu.org/?p=qemu.git;a=commitdiff;h=6bf45d59f98c898b7d79

commit 6bf45d59f98c898b7d7997a333765c8ee41236ea
Author: Alberto Garcia <berto@igalia.com>
Date:   Fri Nov 3 16:18:50 2017 +0200

    qcow2: Prevent allocating refcount blocks at offset 0

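Why a refcount block at offset 0 would trip the assertion can be illustrated with a toy model. The sketch below is not QEMU code: it assumes, as the assertion `c->entries[i].offset != 0` suggests, that the table cache treats an offset of 0 as "entry unused", and the class and function names are invented for the illustration. The value `new_block = 0` in frame #5 of the backtrace is what motivates the second call.

```python
class ToyTableCache:
    """Toy cache in which offset 0 doubles as the 'unused entry' marker."""

    def __init__(self, num_entries):
        self.offsets = [0] * num_entries

    def load(self, offset):
        """Place a table in the first unused entry and return its index."""
        for index, current in enumerate(self.offsets):
            if current == 0:
                self.offsets[index] = offset
                return index
        raise RuntimeError("no free cache entry")

    def mark_dirty(self, index):
        """Mimic the failing assertion: a dirty entry must be in use."""
        if self.offsets[index] == 0:
            raise AssertionError("entry %d looks unused (offset 0)" % index)


def can_cache_refcount_block(offset):
    """Return True if a block at this offset can be cached and dirtied."""
    cache = ToyTableCache(num_entries=4)
    index = cache.load(offset)
    try:
        cache.mark_dirty(index)
    except AssertionError:
        return False
    return True


print(can_cache_refcount_block(1048576))  # a non-zero cluster offset
print(can_cache_refcount_block(0))        # offset 0, as with new_block = 0
```

In this model a table stored at offset 0 is indistinguishable from an empty slot, so refusing to allocate a refcount block there, as the commit subject describes, avoids the ambiguity.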
cdeadmin commented 6 years ago

------- Comment From nasastry@in.ibm.com 2018-01-01 01:02:56 EDT-------

Tested with qemu-img-2.11.0-1.rel.gite7153e0.el7.centos.ppc64le

/usr/bin/qemu-io /tmp/test.img -c "write 1352192 1707520"

can't open device /tmp/test.img: Image does not contain a reference count table

/usr/bin/qemu-io --version

qemu-io version 2.11.0
Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers

The reported crash (assertion failure with SIGABRT) is no longer seen. This bugzilla can be closed.