Open-CAS / ocf

Open CAS Framework
BSD 3-Clause "New" or "Revised" License

make simple example will fail if I enable the metadata cache hash debug #370

Open gwnet opened 4 years ago

gwnet commented 4 years ago

If I change OCF_METADATA_HASH_DEBUG to 1:

#define OCF_METADATA_HASH_DEBUG 0

#if 1 == OCF_METADATA_HASH_DEBUG
#define OCF_DEBUG_TRACE(cache) \
	ocf_cache_log(cache, log_info, "[Metadata][Hash] %s\n", __func__)

#define OCF_DEBUG_PARAM(cache, format, ...) \
	ocf_cache_log(cache, log_info, "[Metadata][Hash] %s - "format"\n", \
			__func__, ##__VA_ARGS__)
#else

Then I get the build error below. Could you please help? Also, can you show me how to understand and debug the metadata hash map easily? I also see there is an RB-tree in the code; what is it for?

gcc -g -Wall -Iinclude/ -Isrc//ocf/env/ -c -o src/ocf/eviction/eviction.o src/ocf/eviction/eviction.c
gcc -g -Wall -Iinclude/ -Isrc//ocf/env/ -c -o src/ocf/eviction/lru.o src/ocf/eviction/lru.c
gcc -g -Wall -Iinclude/ -Isrc//ocf/env/ -c -o src/ocf/metadata/metadata.o src/ocf/metadata/metadata.c
gcc -g -Wall -Iinclude/ -Isrc//ocf/env/ -c -o src/ocf/metadata/metadata_collision.o src/ocf/metadata/metadata_collision.c
gcc -g -Wall -Iinclude/ -Isrc//ocf/env/ -c -o src/ocf/metadata/metadata_hash.o src/ocf/metadata/metadata_hash.c
src/ocf/metadata/metadata_hash.c:2:8: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘-’ token
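
For readers unfamiliar with the pattern used by the debug macros quoted in this comment, here is a minimal, self-contained sketch of a compile-time trace switch. It is not OCF code: `demo_log`, `DEMO_DEBUG`, `DEMO_TRACE`, `DEMO_PARAM` and `demo_init` are hypothetical names, and the logger writes into a caller-supplied buffer instead of calling `ocf_cache_log`. It relies on the GNU `##__VA_ARGS__` extension, as the original macros do.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for a logger: formats the message into buf. */
static int demo_log(char *buf, size_t len, const char *fmt, ...)
{
	va_list args;
	int n;

	va_start(args, fmt);
	n = vsnprintf(buf, len, fmt, args);
	va_end(args);
	return n;
}

/* Set to 0 to compile the trace calls out. */
#define DEMO_DEBUG 1

#if 1 == DEMO_DEBUG
/* Logs only the name of the calling function. */
#define DEMO_TRACE(buf, len) \
	demo_log(buf, len, "[Metadata][Hash] %s\n", __func__)
/* Logs the calling function plus a caller-supplied format and arguments.
 * "##__VA_ARGS__" (GNU extension) drops the trailing comma when no
 * extra arguments are given. */
#define DEMO_PARAM(buf, len, format, ...) \
	demo_log(buf, len, "[Metadata][Hash] %s - " format "\n", \
			__func__, ##__VA_ARGS__)
#else
#define DEMO_TRACE(buf, len) 0
#define DEMO_PARAM(buf, len, format, ...) 0
#endif

/* Hypothetical caller: the macro picks up "demo_init" via __func__. */
static int demo_init(char *buf, size_t len, unsigned lines)
{
	return DEMO_PARAM(buf, len, "lines = %u", lines);
}
```

Because `format` is pasted between two string literals, it has to be a string literal itself; passing a variable there would not compile.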


gwnet commented 4 years ago

I have fixed the build error; it was my typo. But after enabling the metadata hash debug, the simple example crashes with a core dump. It is very easy to reproduce. Could you please help?

include  Makefile  simple  src
[root@localhost simple]# ./simple
Inserting cache cache1
Segmentation fault (core dumped)

robertbaldyga commented 4 years ago

Thank you for reporting this problem. We will investigate this.

gwnet commented 4 years ago

Thank you so much. Here is the stack trace for the core dump.

Program received signal SIGSEGV, Segmentation fault.
__strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:62
62	../sysdeps/x86_64/multiarch/strlen-avx2.S: No such file or directory.
(gdb) bt
#0  __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:62
#1  0x00007ffff76054d3 in _IO_vfprintf_internal (s=0x7ffff7994760 <_IO_2_1_stdout_>, format=0x5555555a6b07 "%s: [Metadata][Hash] %s\n", ap=0x7fffffffdba0) at vfprintf.c:1643
#2  0x0000555555558ad4 in ctx_logger_print (logger=0x5555557c3b40, lvl=log_info, fmt=0x5555555a6b07 "%s: [Metadata][Hash] %s\n", args=0x7fffffffdba0) at src/ctx.c:203
#3  0x0000555555564319 in ocf_log_raw (logger=0x5555557c3b40, lvl=log_info, fmt=0x5555555a6b07 "%s: [Metadata][Hash] %s\n") at src/ocf/ocf_logger.c:25
#4  0x0000555555587f26 in ocf_metadata_hash_init (cache=0x7ffff2c4f010, cache_line_size=ocf_cache_line_size_4) at src/ocf/metadata/metadata_hash.c:533
#5  0x000055555558351a in ocf_metadata_init (cache=0x7ffff2c4f010, cache_line_size=ocf_cache_line_size_4) at src/ocf/metadata/metadata.c:36
#6  0x000055555559afca in _ocf_mngt_cache_start (ctx=0x5555557c3b30, cache=0x7fffffffdf10, cfg=0x7fffffffde70) at src/ocf/mngt/ocf_mngt_cache.c:1216
#7  0x000055555559c599 in ocf_mngt_cache_start (ctx=0x5555557c3b30, cache=0x7fffffffdf10, cfg=0x7fffffffde70) at src/ocf/mngt/ocf_mngt_cache.c:1886
#8  0x0000555555558178 in initialize_cache (ctx=0x5555557c3b30, cache=0x7fffffffdf10) at src/main.c:127
#9  0x0000555555558721 in main (argc=1, argv=0x7fffffffe018) at src/main.c:353

gwnet commented 4 years ago

Also, I cannot understand the metadata layout design in detail. What is seq mode, and what is stripping mode? Are there any debug messages that would help me understand these better?

robertbaldyga commented 4 years ago

@gwnet The difference between seq and stripping mode is that in the seq mode the metadata for consecutive cache lines is written under consecutive addresses, while in stripping mode metadata for consecutive cache lines is written on consecutive pages.

For example, in seq mode, if the metadata record size is 32 bytes, then metadata for cache lines 0, 1 and 2 will be written at offsets 0, 32 and 64 respectively. In stripping mode, on the other hand, assuming a metadata record size of 32 bytes and a page size of 4096 bytes, metadata for cache lines 0, 1 and 2 will be written at offsets 0, 4096 and 8192 respectively. After reaching the end of the metadata section, it starts again from the beginning, shifted by one record, e.g. 32, 4128, 8224..., effectively forming "stripes" across all the pages in the metadata section.

Seq:

+---------+---------+---------+---------+
| Page 1  | Page 2  | Page 3  | Page 4  |
+---------+---------+---------+---------+
| line 1  | line 5  | line 9  | line 13 |
| line 2  | line 6  | line 10 | line 14 |
| line 3  | line 7  | line 11 | line 15 |
| line 4  | line 8  | line 12 | line 16 |
+---------+---------+---------+---------+

Stripping:

+---------+---------+---------+---------+
| Page 1  | Page 2  | Page 3  | Page 4  |
+---------+---------+---------+---------+
| line 1  | line 2  | line 3  | line 4  | <- stripe 1
| line 5  | line 6  | line 7  | line 8  | <- stripe 2
| line 9  | line 10 | line 11 | line 12 | <- stripe 3
| line 13 | line 14 | line 15 | line 16 | <- stripe 4
+---------+---------+---------+---------+
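
The offset arithmetic described above can be sketched in C as follows. This is an illustration under stated assumptions, not OCF's actual implementation: the function names `seq_offset` and `stripping_offset` are hypothetical, cache lines are numbered from 0 (the tables above count from 1), records are assumed not to straddle page boundaries, and the metadata section is assumed to span `num_pages` whole pages.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Seq layout (hypothetical helper): records fill one page completely
 * before moving on to the next page. */
static uint64_t seq_offset(uint64_t line, size_t record_size,
		size_t page_size)
{
	uint64_t per_page;

	assert(record_size > 0 && page_size >= record_size);
	per_page = page_size / record_size;

	return (line / per_page) * page_size + (line % per_page) * record_size;
}

/* Stripping layout (hypothetical helper): consecutive lines land on
 * consecutive pages; after the last page, wrap to the first page and
 * move one record further into each page. */
static uint64_t stripping_offset(uint64_t line, size_t record_size,
		size_t page_size, uint64_t num_pages)
{
	assert(record_size > 0 && page_size >= record_size && num_pages > 0);
	/* The line must fit in the section. */
	assert(line < num_pages * (page_size / record_size));

	return (line % num_pages) * page_size + (line / num_pages) * record_size;
}
```

With 32-byte records, 4096-byte pages and a four-page section, `stripping_offset` places lines 0, 1, 2 at offsets 0, 4096, 8192 and lines 4, 5, 6 at 32, 4128, 8224, matching the numbers in the explanation above.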

gwnet commented 4 years ago

Thank you so much! I understand your example now. But what are the benefits of seq and stripping, and how do we pick between them for a real scenario?

robertbaldyga commented 4 years ago

@gwnet Based on empirical observations, using stripping can improve performance on some types of NAND drives.

gwnet commented 4 years ago

@robertbaldyga Thank you so much. Is the reason that the NAND erase-before-write policy differs between vendors, so that the seq layout may be better on some drives and stripping on others?

robertbaldyga commented 4 years ago

@gwnet Regarding erase-before-write policy differences: that might be the case. In general, I think that for most drives stripping will be better than or as good as the seq layout. However, the seq layout has an advantage on in-memory drives (like ramdisks) because it promotes TLB locality.

gwnet commented 4 years ago

Thank you so much. I got it. :) I have another perf issue; I will open a new issue for it. :)

karolinavelkaja commented 2 years ago

Initial issue: debug macros, to be resolved with P3.