HarryWei / cloudxy

Automatically exported from code.google.com/p/cloudxy

test_hlfs_tree_snapshots error #11

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
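Bug summary
=======
Today I tested building a tree of snapshots, but an error occurred while building a branch.
The root cause is not yet clear; I am investigating.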

Reproducing the bug
=======
1. Check out the snapshot branch:
   svn checkout http://cloudxy.googlecode.com/svn/branches/snapshot/ hlfs
2. cd hlfs/build && cmake ../src && make all
3. cd ../src/snapshot/unittest/build && cmake ..
4. make all
5. ./test_hlfs_tree_snapshots

Test source
========
http://cloudxy.googlecode.com/svn/branches/snapshot/src/snapshot/unittest/test_hlfs_tree_snapshots.c

Results
=========
Expected result
---------------
Everything works and no assertion fires.

Actual result
--------------
jiawei@jiawei-laptop:~/workshop15/snapshot/src/snapshot/unittest/build$ ./test_hlfs_tree_snapshots
/misc/hlfs_take_snapshot: test env dir is /home/jiawei/workshop15/snapshot/src/snapshot/unittest/build
uri is local:///home/jiawei/workshop15/snapshot/src/snapshot/unittest/build/testfs
** Message: cmd is [../mkfs.hlfs -u local:///home/jiawei/workshop15/snapshot/src/snapshot/unittest/build/testfs -b 8192 -s 67108864 -m 1024]
** Message: key file data :
[METADATA]
uri=local:///home/jiawei/workshop15/snapshot/src/snapshot/unittest/build/testfs
block_size=8192
segment_size=67108864
max_fs_size=1024

** Message: sb file path /home/jiawei/workshop15/snapshot/src/snapshot/unittest/build/testfs/superblock
** Message: sb file path 1/home/jiawei/workshop15/snapshot/src/snapshot/unittest/build/testfs/superblock
** Message: sb file path 2/home/jiawei/workshop15/snapshot/src/snapshot/unittest/build/testfs/superblock
** Message: append size:148
fixture->uri is local:///home/jiawei/workshop15/snapshot/src/snapshot/unittest/build/testfs
** Message: get_current_time is 1326271479821
** Message: length is 4096
** Message: get_current_time is 1326271479873
** Message: length is 8192
** Message: 1 buffer is [T1]
** Message: get_current_time is 1326271479919
** Message: length is 12288
** Message: 2 buffer is [T2]
** Message: get_current_time is 1326271479963
** Message: length is 16384
** Message: 3 buffer is [T3]
** Message: get_current_time is 1326271480007
** Message: length is 20480
** Message: 4 buffer is [T4]
** Message: get_current_time is 1326271480052
** Message: length is 24576
** Message: 5 buffer is [T5]
** Message: get_current_time is 1326271480109
** Message: length is 28672
** Message: 6 buffer is [T6]
** Message: get_current_time is 1326271480154
** Message: length is 32768
** Message: 7 buffer is [T7]
** Message: get_current_time is 1326271480209
** Message: length is 36864
** Message: 8 buffer is [T8]
** Message: get_current_time is 1326271480254
** Message: length is 40960
** Message: 9 buffer is [T9]
** Message: get_current_time is 1326271480298
** Message: length is 45056
** Message: 10 buffer is [T10]
** Message: get_current_time is 1326271481349
** Message: length is 24576
** Message: get_current_time is 1326271481389
** Message: length is 24576
** Message: 1 buffer is [T11]
**
ERROR:/home/jiawei/workshop15/snapshot/src/snapshot/unittest/test_hlfs_tree_snapshots.c:219:test_hlfs_tree_snapshots: assertion failed (ret1 == REQ_SIZE): (-1 == 4096)
Aborted

Bug analysis
=======
Here is part of the log:
20120111 08:44:41.423 DEBUG    hlfslog- [/home/jiawei/workshop15/snapshot/src/backend/local_storage.c][local_file_pread][137]local -- leave func local_file_pread
20120111 08:44:41.423 ERROR    hlfslog- [/home/jiawei/workshop15/snapshot/src/utils/misc.c][read_block][104]bs_file_pread's size is not equal to block_size
20120111 08:44:41.423 DEBUG    hlfslog- [/home/jiawei/workshop15/snapshot/src/backend/local_storage.c][local_file_close][94]local -- enter func local_file_close
20120111 08:44:41.423 DEBUG    hlfslog- [/home/jiawei/workshop15/snapshot/src/backend/local_storage.c][local_file_close][98]local -- leave func local_file_close
20120111 08:44:41.423 DEBUG    hlfslog- [/home/jiawei/workshop15/snapshot/src/utils/misc.c][read_block][112]leave func read_block
20120111 08:44:41.424 ERROR    hlfslog- [/home/jiawei/workshop15/snapshot/src/common/logger.c][load_block_by_no][228]can not read block for storage address 25308
20120111 08:44:41.424 ERROR    hlfslog- [/home/jiawei/workshop15/snapshot/src/storage/hlfs_write.c][hlfs_write][70]can not read logic block: 1

The source of read_block is as follows:
char *read_block(struct back_storage *storage, uint64_t storage_address,
                 uint32_t block_size)
{
    HLOG_DEBUG("enter func %s", __func__);
    int ret = 0;
    uint32_t offset = get_offset(storage_address);
    /* Not const: build_segfile_name() writes the segment file name here. */
    char segfile_name[SEGMENT_FILE_NAME_MAX];
    ret = build_segfile_name(get_segno(storage_address), segfile_name);
    if (-1 == ret) {
        HLOG_ERROR("build_segfile_name error");
        return NULL;
    }

    bs_file_t file = storage->bs_file_open(storage, segfile_name, BS_READONLY);
    if (file == NULL) {
        HLOG_ERROR("can not open segment file %s", segfile_name);
        return NULL;
    }

    char *block = (char *)g_malloc0(block_size);
    if (NULL == block) {
        HLOG_ERROR("Allocate Error!");
        goto out;
    }
    /* A read of fewer than block_size bytes is treated as a failure. */
    if (block_size != storage->bs_file_pread(storage, file, block, block_size, offset)) {
        HLOG_ERROR("bs_file_pread's size is not equal to block_size");
        g_free(block);
        block = NULL;
        goto out;
    }

out:
    storage->bs_file_close(storage, file);
    HLOG_DEBUG("leave func %s", __func__);
    return block;
}

Clearly something went wrong while reading this segment file: fewer than block_size bytes
were read, which caused the error.
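The check in read_block cannot tell an I/O error from a read that simply ran past the end of the segment file. As a rough illustration only, the sketch below shows one way to separate the two cases with plain POSIX pread. The helper name pread_full and its return convention are hypothetical and are not part of HLFS, whose storage backends go through bs_file_pread instead.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not HLFS code): try to read exactly `len` bytes
 * starting at `offset`.
 * Returns 0 if all bytes were read, 1 if end of file was reached first,
 * and -1 on an I/O error. *got always holds the number of bytes read. */
static int pread_full(int fd, void *buf, size_t len, off_t offset, size_t *got)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = pread(fd, (char *)buf + done, len - done,
                          offset + (off_t)done);
        if (n < 0) {
            if (errno == EINTR)
                continue;      /* interrupted by a signal: retry */
            *got = done;
            return -1;         /* genuine I/O error */
        }
        if (n == 0)
            break;             /* end of file: the block is not fully there */
        done += (size_t)n;
    }
    *got = done;
    return done == len ? 0 : 1;
}
```

With a helper like this, a return value of 1 together with a small *got would suggest that the storage address points beyond the data actually written to the segment, rather than a failing disk.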

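Current status of the bug
==========
This bug is not fixed yet. I will find the cause and fix it as soon as possible.
Anyone interested is welcome to help fix it ;-)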

Original issue reported on code.google.com by harryxi...@gmail.com on 11 Jan 2012 at 12:26

GoogleCodeExporter commented 9 years ago
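I found the exact spot: our append_log has a bug. Our design was a bit careless at the
time, and append_log goes wrong in the following situation.
Suppose we append a log 10 times (4096 B each time) and take 10 snapshots (T1 through T10),
then unmount hlfs.
We start hlfs again and roll back to T5,
then continue appending logs and taking the corresponding snapshots. The problem appears
when we reach T12. The detailed analysis follows: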

Test source
========
http://cloudxy.googlecode.com/svn/branches/snapshot/src/snapshot/unittest/test_hlfs_tree_snapshots.c

Results
=========
Expected result
--------------
No assertion fires; everything works.

Actual result
---------------
jiawei@jiawei-laptop:~/workshop15/snapshot/src/snapshot/unittest/build$ ./test_hlfs_tree_snapshots
/misc/hlfs_take_snapshot: ** Message: key file data :
[METADATA]
uri=local:///home/jiawei/workshop15/snapshot/src/snapshot/unittest/build/testfs
block_size=8192
segment_size=67108864
max_fs_size=1024

** Message: sb file path /home/jiawei/workshop15/snapshot/src/snapshot/unittest/build/testfs/superblock
** Message: sb file path 1/home/jiawei/workshop15/snapshot/src/snapshot/unittest/build/testfs/superblock
** Message: sb file path 2/home/jiawei/workshop15/snapshot/src/snapshot/unittest/build/testfs/superblock
** Message: append size:148
**
ERROR:/home/jiawei/workshop15/snapshot/src/snapshot/unittest/test_hlfs_tree_snapshots.c:248:test_hlfs_tree_snapshots: assertion failed (ret1 == REQ_SIZE): (-1 == 4096)
Aborted

Bug analysis
=========
Part of the log:
[....]
20120111 18:37:10.491 DEBUG    hlfslog- [/home/jiawei/workshop15/snapshot/src/common/logger.c][append_log][436]entry func: append_log
20120111 18:37:10.491 DEBUG    hlfslog- [/home/jiawei/workshop15/snapshot/src/utils/address.c][ib_amount][78]ib_amount db_start is 0
20120111 18:37:10.491 DEBUG    hlfslog- [/home/jiawei/workshop15/snapshot/src/utils/address.c][ib_amount][79]ib_amount db_end is 0
20120111 18:37:10.491 DEBUG    hlfslog- [/home/jiawei/workshop15/snapshot/src/utils/address.c][ib_amount][106]ib_amount we need 0 ibs
20120111 18:37:10.491 DEBUG    hlfslog- [/home/jiawei/workshop15/snapshot/src/common/logger.c][append_log][466] db_data_len:8192 ib_data_len:0 BLOCKSIZE:8192
20120111 18:37:10.492 DEBUG    hlfslog- [/home/jiawei/workshop15/snapshot/src/common/logger.c][append_log][471] db_cur_no:0 db_offset:48
20120111 18:37:10.492 DEBUG    hlfslog- [/home/jiawei/workshop15/snapshot/src/common/logger.c][append_log][473] is level1 -- db_cur_no:0 db_offset:48
20120111 18:37:10.492 DEBUG    hlfslog- [/home/jiawei/workshop15/snapshot/src/common/logger.c][append_log][476] idx:0
20120111 18:37:10.492 DEBUG    hlfslog- [/home/jiawei/workshop15/snapshot/src/utils/address.c][set_segno][19]harry dbg *address is 8468
20120111 18:37:10.492 DEBUG    hlfslog- [/home/jiawei/workshop15/snapshot/src/utils/address.c][set_segno][20]harry dbg segno is 0
20120111 18:37:10.492 DEBUG    hlfslog- [/home/jiawei/workshop15/snapshot/src/utils/address.c][set_segno][24]harry dbg after set segno *address is 8468
20120111 18:37:10.492 DEBUG    hlfslog- [/home/jiawei/workshop15/snapshot/src/common/logger.c][append_log][479]inode.block[0]'s storage addr:48
[......]

The inode information for snapshot T5 is as follows:
20120111 18:37:10.490 DEBUG    hlfslog- [/home/jiawei/workshop15/snapshot/src/snapshot/unittest/test_hlfs_tree_snapshots.c][test_hlfs_tree_snapshots][237]fixture->ctrl->imap_entry.inode_addr is 41920, inode_no is 77,          iblock is 0, doubly_iblock is 0, triply_iblock is 0
20120111 18:37:10.490 DEBUG    hlfslog- [/home/jiawei/workshop15/snapshot/src/snapshot/unittest/test_hlfs_tree_snapshots.c][test_hlfs_tree_snapshots][240]fixture->ctrl->inode.blocks[0] is 8468
20120111 18:37:10.490 DEBUG    hlfslog- [/home/jiawei/workshop15/snapshot/src/snapshot/unittest/test_hlfs_tree_snapshots.c][test_hlfs_tree_snapshots][240]fixture->ctrl->inode.blocks[1] is 25308
20120111 18:37:10.490 DEBUG    hlfslog- [/home/jiawei/workshop15/snapshot/src/snapshot/unittest/test_hlfs_tree_snapshots.c][test_hlfs_tree_snapshots][240]fixture->ctrl->inode.blocks[2] is 33728
20120111 18:37:10.490 DEBUG    hlfslog- [/home/jiawei/workshop15/snapshot/src/snapshot/unittest/test_hlfs_tree_snapshots.c][test_hlfs_tree_snapshots][240]fixture->ctrl->inode.blocks[3] is 0
20120111 18:37:10.490 DEBUG    hlfslog- [/home/jiawei/workshop15/snapshot/src/snapshot/unittest/test_hlfs_tree_snapshots.c][test_hlfs_tree_snapshots][240]fixture->ctrl->inode.blocks[4] is 0
20120111 18:37:10.490 DEBUG    hlfslog- [/home/jiawei/workshop15/snapshot/src/snapshot/unittest/test_hlfs_tree_snapshots.c][test_hlfs_tree_snapshots][240]fixture->ctrl->inode.blocks[5] is 0
20120111 18:37:10.490 DEBUG    hlfslog- [/home/jiawei/workshop15/snapshot/src/snapshot/unittest/test_hlfs_tree_snapshots.c][test_hlfs_tree_snapshots][240]fixture->ctrl->inode.blocks[6] is 0
20120111 18:37:10.490 DEBUG    hlfslog- [/home/jiawei/workshop15/snapshot/src/snapshot/unittest/test_hlfs_tree_snapshots.c][test_hlfs_tree_snapshots][240]fixture->ctrl->inode.blocks[7] is 0
20120111 18:37:10.490 DEBUG    hlfslog- [/home/jiawei/workshop15/snapshot/src/snapshot/unittest/test_hlfs_tree_snapshots.c][test_hlfs_tree_snapshots][240]fixture->ctrl->inode.blocks[8] is 0
20120111 18:37:10.490 DEBUG    hlfslog- [/home/jiawei/workshop15/snapshot/src/snapshot/unittest/test_hlfs_tree_snapshots.c][test_hlfs_tree_snapshots][240]fixture->ctrl->inode.blocks[9] is 0
20120111 18:37:10.490 DEBUG    hlfslog- [/home/jiawei/workshop15/snapshot/src/snapshot/unittest/test_hlfs_tree_snapshots.c][test_hlfs_tree_snapshots][240]fixture->ctrl->inode.blocks[10] is 0
20120111 18:37:10.490 DEBUG    hlfslog- [/home/jiawei/workshop15/snapshot/src/snapshot/unittest/test_hlfs_tree_snapshots.c][test_hlfs_tree_snapshots][240]fixture->ctrl->inode.blocks[11] is 0

This shows that the direct index entries of the inode for snapshot T5 are
blocks[0] = 8468, blocks[1] = 25308 and blocks[2] = 33728.
After the first set_offset, however, blocks[0] became 48,
which caused the later error.
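To make these addresses easier to read, here is a minimal sketch of splitting a storage address into a segment number and an offset. It assumes the offset occupies the low 26 bits, chosen only because segment_size is 67108864 (1 << 26). The names sketch_get_segno, sketch_get_offset and sketch_make_addr are hypothetical, and the real get_segno and get_offset in address.c may use a different layout.

```c
#include <stdint.h>

/* ASSUMPTION: offset in the low 26 bits (segment_size 67108864 == 1 << 26),
 * segment number in the bits above. The real HLFS layout may differ. */
#define SKETCH_SEG_SHIFT 26
#define SKETCH_SEG_MASK  ((UINT64_C(1) << SKETCH_SEG_SHIFT) - 1)

/* Segment number: the bits above the offset field. */
static uint32_t sketch_get_segno(uint64_t addr)
{
    return (uint32_t)(addr >> SKETCH_SEG_SHIFT);
}

/* Byte offset inside the segment file: the low bits. */
static uint32_t sketch_get_offset(uint64_t addr)
{
    return (uint32_t)(addr & SKETCH_SEG_MASK);
}

/* Combine a segment number and an offset into one storage address. */
static uint64_t sketch_make_addr(uint32_t segno, uint32_t offset)
{
    return ((uint64_t)segno << SKETCH_SEG_SHIFT) |
           ((uint64_t)offset & SKETCH_SEG_MASK);
}
```

Under this assumption, 8468, 25308 and 33728 all decode to segment 0 at increasing offsets, while 48 decodes to segment 0 at offset 48, the same value as db_offset:48 in the append_log output.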

Current status of the bug
===========
I have not yet thought of a good way to handle this bug ;-)

Original comment by harryxi...@gmail.com on 11 Jan 2012 at 7:14

GoogleCodeExporter commented 9 years ago
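All of the problems above are now solved. To summarize:

1. The original test case was flawed. The following should be removed:
    deinit_hlfs(fixture->ctrl);
    fixture->ctrl = NULL;
    fixture->ctrl = init_hlfs(fixture->uri);
    g_assert(fixture->ctrl != NULL);
To unmount hlfs we only need hlfs_close, and to remount
we only need hlfs_open_by_inode, so
deinit_hlfs and init_hlfs are unnecessary.

2. hlfs_close closes cur_write_file and cur_read_file;
the corresponding handles should also be set to NULL.

3. hlfs_close closes the "valve" of the background thread, so hlfs_open_by_inode
needs to open the "valve" again. See the code for details.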
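Points 2 and 3 can be sketched roughly as follows. This is not the HLFS code: struct sketch_ctrl, sketch_close and sketch_reopen are hypothetical stand-ins, plain FILE pointers replace the real storage handles, and a simple boolean replaces the background thread's "valve". Real multi-threaded code would also need to synchronize access to that flag.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the HLFS control structure. */
struct sketch_ctrl {
    FILE *cur_write_file;
    FILE *cur_read_file;
    bool  writer_gate_open;   /* the "valve" for the background thread */
};

/* Close both handles and reset them to NULL so that a later open does not
 * reuse a stale handle, then shut the gate. Safe to call more than once. */
static void sketch_close(struct sketch_ctrl *ctrl)
{
    if (ctrl->cur_write_file != NULL) {
        fclose(ctrl->cur_write_file);
        ctrl->cur_write_file = NULL;
    }
    if (ctrl->cur_read_file != NULL) {
        fclose(ctrl->cur_read_file);
        ctrl->cur_read_file = NULL;
    }
    ctrl->writer_gate_open = false;
}

/* Reopening must undo what close did to the gate. */
static void sketch_reopen(struct sketch_ctrl *ctrl)
{
    ctrl->writer_gate_open = true;
}
```

Resetting the handles to NULL makes a second close harmless and lets the open path detect that the files have to be opened again.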

All of the bugs above are fixed (see the diffs), and tree snapshots can now be built
normally ;-)

Original comment by harryxi...@gmail.com on 12 Jan 2012 at 2:18