jaegeuk / f2fs-stable

1GB memory usage per HM-SMR zoned disk mounted #9

Open wallentx opened 2 years ago

wallentx commented 2 years ago

I've spent a few days trying to figure out where my memory was going, and I finally found that each f2fs-formatted zoned disk I mount causes 1GB of RAM to become "used", though I can't find any evidence of what exactly this is attributed to. I've even mounted the drives read-only to make sure it wasn't due to some write cache. Is this expected? I'm running 5.18.12-arch1-1, and have 42 zoned disks mounted that are formatted with f2fs.

wallentx commented 2 years ago

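I've since had to switch to btrfs, but I'd love to switch back to f2fs, since these drives are quite buggy with btrfs. I've also had a coworker verify that on his system, with these particular drives (ST14000NM0007-2G), mounting them when they are formatted as f2fs does consume ~1GB of RAM per drive.

With none mounted: [image: 1714a881e301fc4f] https://user-images.githubusercontent.com/8990544/190085945-ad347e3d-6e8a-44c0-950e-1329849fa904.png

With some of them mounted: [image: 1714a88523bb4308] https://user-images.githubusercontent.com/8990544/190086906-8699b20c-c466-4dd1-8d9b-0cb0043ef3c2.png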

jaegeuk commented 2 years ago

Hi,

Could you capture the output of this?

cat /sys/kernel/debug/f2fs/status

Thanks,

wallentx commented 2 years ago

Sorry for the late reply. I had to wait for a drive to become available. [image: Screenshot_20221009-034113]

=====[ partition info(sdl). #0, RW, CP: Good]=====
[SB: 1] [CP: 2] [SIT: 476] [NAT: 120] [SSA: 13098] [MAIN: 6662144(OverProv:83685 Resv:42644)]

Current Time Sec: 97552 / Mounted Time Sec: 97454

Utilization: 0% (2 valid blocks, 0 discard blocks)
  - Node: 1 (Inode: 1, Other: 0)
  - Data: 1
  - Inline_xattr Inode: 0
  - Inline_data Inode: 0
  - Inline_dentry Inode: 0
  - Compressed Inode: 0, Blocks: 0
  - Orphan/Append/Update Inode: 0, 0, 0

Main area: 6662144 segs, 52048 secs 52048 zones
    TYPE            segno    secno   zoneno  dirty_seg   full_seg  valid_blk
  - COLD   data:      640        5        5          0          0          0
  - WARM   data:      512        4        4          0          0          0
  - HOT    data:      384        3        3          1          0          1
  - Dir   dnode:        0        0        0          1          0          1
  - File  dnode:      128        1        1          0          0          0
  - Indir nodes:      256        2        2          0          0          0
  - Pinned file:       -1       -1       -1
  - ATGC   data:       -1       -1       -1

  - Valid: 6
  - Dirty: 0
  - Prefree: 0
  - Free: 6662138 (52042)

CP calls: 0 (BG: 0)
  - cp blocks : 0
  - sit blocks : 0
  - nat blocks : 0
  - ssa blocks : 0
CP merge (Queued:    0, Issued:    0, Total:    0, Cur time:    0(ms), Peak time:    0(ms))
GC calls: 0 (BG: 1)
  - data segments : 0 (0)
  - node segments : 0 (0)
  - Reclaimed segs : Normal (0), Idle CB (0), Idle Greedy (0), Idle AT (0), Urgent High (0), Urgent Mid (0), Urgent Low (0)
Try to move 0 blocks (BG: 0)
  - data blocks : 0 (0)
  - node blocks : 0 (0)
BG skip : IO: 0, Other: 0

Extent Cache:
  - Hit Count: L1-1:0 L1-2:0 L2:0
  - Hit Ratio: 0% (0 / 0)
  - Inner Struct Count: tree: 0(0), node: 0

Balancing F2FS Async:
  - DIO (R:    0, W:    0)
  - IO_R (Data:    0, Node:    0, Meta:    0
  - IO_W (CP:    0, Data:    0, Flush: (   0    0    1), Discard: (   0    0)) cmd:    0 undiscard:   0
  - atomic IO:    0 (Max.    0)
  - compress:    0, hit:       0
  - nodes:    0 in    1
  - dents:    0 in dirs:   0 (   0)
  - datas:    0 in files:   0
  - quota datas:    0 in quota files:   0
  - meta:    0 in 121152
  - imeta:    0
  - fsync mark:    0
  - NATs:         0/        0
  - SITs:         0/  6662144
  - free_nids:      3640/ 13977596
  - alloc_nids:         0

Distribution of User Blocks: [ valid | invalid | free ]
  [|-|-------------------------------------------------]

IPU: 0 blocks
SSR: 0 blocks in 0 segments
LFS: 0 blocks in 0 segments

BDF: 99, avg. vblocks: 0

Memory: 2056548 KB
  - static: 1571843 KB
  - cached: 93 KB
  - paged : 484612 KB
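
To put the "static" figure in this dump in proportion to the disk, here is a minimal sketch that divides it by the segment count reported on the "Main area" line. The 2 MiB segment size is an assumption (the F2FS default), and the function names are made up for illustration; this says nothing about which kernel structures the memory actually belongs to.

```python
# Figures copied from the status dump above.
STATIC_KB = 1_571_843       # "static" line of the Memory section
MAIN_SEGMENTS = 6_662_144   # "Main area: 6662144 segs"
SEGMENT_MIB = 2             # assumption: default F2FS segment size


def static_bytes_per_segment(static_kb: int, segments: int) -> float:
    """Average static memory attributed to one segment, in bytes."""
    return static_kb * 1024 / segments


def main_area_gib(segments: int, segment_mib: int = SEGMENT_MIB) -> float:
    """Size of the main area in GiB, given the assumed segment size."""
    return segments * segment_mib / 1024


per_seg = static_bytes_per_segment(STATIC_KB, MAIN_SEGMENTS)
area_gib = main_area_gib(MAIN_SEGMENTS)
print(f"{per_seg:.1f} bytes of static memory per segment")
print(f"{area_gib:.1f} GiB main area")
```

If the ratio stays roughly constant across disks of different sizes, that would suggest the static part scales with the number of segments rather than with how much data is stored.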

wallentx commented 2 years ago

I just had a btrfs formatted drive with some data corruption, so I formatted it as f2fs and here is the output of /sys/kernel/debug/f2fs/status after mounting it:

=====[ partition info(sdbn). #0, RW, CP: Good]=====
[SB: 1] [CP: 2] [SIT: 476] [NAT: 120] [SSA: 13098] [MAIN: 6662144(OverProv:83685 Resv:42644)]

Current Time Sec: 5908 / Mounted Time Sec: 105

Utilization: 18% (638268078 valid blocks, 0 discard blocks)
  - Node: 627031 (Inode: 26, Other: 627005)
  - Data: 637641047
  - Inline_xattr Inode: 0
  - Inline_data Inode: 0
  - Inline_dentry Inode: 0
  - Compressed Inode: 0, Blocks: 0
  - Orphan/Append/Update Inode: 0, 0, 0

Main area: 6662144 segs, 52048 secs 52048 zones
    TYPE            segno    secno   zoneno  dirty_seg   full_seg  valid_blk
  - COLD   data:      767        5        5          1        127      65533
  - WARM   data:  1250958     9773     9773          1    1193486  611065120
  - HOT    data:   779714     6091     6091          8      51771   26510394
  - Dir   dnode:        0        0        0          1          0          2
  - File  dnode:  1252423     9784     9784          2       1222     626381
  - Indir nodes:      258        2        2          3          0        648
  - Pinned file:       -1       -1       -1
  - ATGC   data:       -1       -1       -1

  - Valid: 1246612
  - Dirty: 10
  - Prefree: 0
  - Free: 5415522 (42305)

CP calls: 2 (BG: 24)
  - cp blocks : 16
  - sit blocks : 14
  - nat blocks : 298
  - ssa blocks : 256
CP merge (Queued:    0, Issued:   24, Total:   24, Cur time:    0(ms), Peak time: 1741(ms))
GC calls: 4 (BG: 23)
  - data segments : 0 (0)
  - node segments : 256 (256)
  - Reclaimed segs : Normal (256), Idle CB (0), Idle Greedy (0), Idle AT (0), Urgent High (0), Urgent Mid (0), Urgent Low (0)
Try to move 131059 blocks (BG: 131059)
  - data blocks : 0 (0)
  - node blocks : 131059 (131059)
BG skip : IO: 0, Other: 0

Extent Cache:
  - Hit Count: L1-1:0 L1-2:0 L2:0
  - Hit Ratio: 0% (0 / 0)
  - Inner Struct Count: tree: 0(0), node: 0

Balancing F2FS Async:
  - DIO (R:    0, W:    0)
  - IO_R (Data:    0, Node:    0, Meta:    0
  - IO_W (CP:    0, Data:    0, Flush: (   0    0    1), Discard: (   0    0)) cmd:    0 undiscard:   0
  - atomic IO:    0 (Max.    0)
  - compress:    0, hit:       0
  - nodes:    0 in 131059
  - dents:    0 in dirs:   0 (   0)
  - datas:    0 in files:   0
  - quota datas:    0 in quota files:   0
  - meta:    0 in 121626
  - imeta:    0
  - fsync mark:    0
  - NATs:         0/    32539
  - SITs:         0/  6662144
  - free_nids:      3640/ 13350566
  - alloc_nids:         0

Distribution of User Blocks: [ valid | invalid | free ]
  [---------|-|----------------------------------------]

IPU: 0 blocks
SSR: 0 blocks in 0 segments
LFS: 131059 blocks in 256 segments

BDF: 99, avg. vblocks: 29940

Memory: 2583693 KB
  - static: 1571843 KB
  - cached: 1110 KB
  - paged : 1010740 KB

=====[ partition info(sdck). #1, RW, CP: Good]=====
[SB: 1] [CP: 2] [SIT: 476] [NAT: 120] [SSA: 13098] [MAIN: 6662144(OverProv:83685 Resv:42644)]

Current Time Sec: 5908 / Mounted Time Sec: 5870

Utilization: 0% (2 valid blocks, 0 discard blocks)
  - Node: 1 (Inode: 1, Other: 0)
  - Data: 1
  - Inline_xattr Inode: 0
  - Inline_data Inode: 0
  - Inline_dentry Inode: 0
  - Compressed Inode: 0, Blocks: 0
  - Orphan/Append/Update Inode: 0, 0, 0

Main area: 6662144 segs, 52048 secs 52048 zones
    TYPE            segno    secno   zoneno  dirty_seg   full_seg  valid_blk
  - COLD   data:      640        5        5          0          0          0
  - WARM   data:      512        4        4          0          0          0
  - HOT    data:      384        3        3          1          0          1
  - Dir   dnode:        0        0        0          1          0          1
  - File  dnode:      128        1        1          0          0          0
  - Indir nodes:      256        2        2          0          0          0
  - Pinned file:       -1       -1       -1
  - ATGC   data:       -1       -1       -1

  - Valid: 6
  - Dirty: 0
  - Prefree: 0
  - Free: 6662138 (52042)

CP calls: 0 (BG: 0)
  - cp blocks : 0
  - sit blocks : 0
  - nat blocks : 0
  - ssa blocks : 0
CP merge (Queued:    0, Issued:    0, Total:    0, Cur time:    0(ms), Peak time:    0(ms))
GC calls: 0 (BG: 0)
  - data segments : 0 (0)
  - node segments : 0 (0)
  - Reclaimed segs : Normal (0), Idle CB (0), Idle Greedy (0), Idle AT (0), Urgent High (0), Urgent Mid (0), Urgent Low (0)
Try to move 0 blocks (BG: 0)
  - data blocks : 0 (0)
  - node blocks : 0 (0)
BG skip : IO: 0, Other: 0

Extent Cache:
  - Hit Count: L1-1:0 L1-2:0 L2:0
  - Hit Ratio: 0% (0 / 0)
  - Inner Struct Count: tree: 0(0), node: 0

Balancing F2FS Async:
  - DIO (R:    0, W:    0)
  - IO_R (Data:    0, Node:    0, Meta:    0
  - IO_W (CP:    0, Data:    0, Flush: (   0    0    1), Discard: (   0    0)) cmd:    0 undiscard:   0
  - atomic IO:    0 (Max.    0)
  - compress:    0, hit:       0
  - nodes:    0 in    1
  - dents:    0 in dirs:   0 (   0)
  - datas:    0 in files:   0
  - quota datas:    0 in quota files:   0
  - meta:    0 in 121152
  - imeta:    0
  - fsync mark:    0
  - NATs:         0/        1
  - SITs:         0/  6662144
  - free_nids:       451/ 13977596
  - alloc_nids:         0

Distribution of User Blocks: [ valid | invalid | free ]
  [|-|-------------------------------------------------]

IPU: 0 blocks
SSR: 0 blocks in 0 segments
LFS: 0 blocks in 0 segments

BDF: 99, avg. vblocks: 0

Memory: 2056474 KB
  - static: 1571843 KB
  - cached: 18 KB
  - paged : 484612 KB
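
With several partitions in one status file, comparing the Memory sections by eye gets tedious. The following is a rough sketch of a parser that collects the Memory figures per partition; the function name and the embedded sample (trimmed from the output above) are illustrative only, and the line patterns are assumed from this dump rather than taken from any format specification.

```python
import re

# Matches the partition header, e.g. "=====[ partition info(sdbn). #0, ..."
HEADER = re.compile(r"partition info\((\w+)\)")
# Matches "Memory: N KB" and the "- static/cached/paged : N KB" lines,
# tolerating either "-" or "•" as the list marker.
MEMORY = re.compile(r"\s*(?:[-•]\s*)?(Memory|static|cached|paged)\s*:\s*(\d+)\s*KB")


def memory_by_partition(status_text: str) -> dict:
    """Return {device: {"memory": KB, "static": KB, ...}} from a status dump."""
    result = {}
    current = None
    for line in status_text.splitlines():
        header = HEADER.search(line)
        if header:
            current = header.group(1)
            result[current] = {}
            continue
        match = MEMORY.match(line)
        if match and current is not None:
            result[current][match.group(1).lower()] = int(match.group(2))
    return result


# Sample trimmed from the dump above; on a live system the text would come
# from /sys/kernel/debug/f2fs/status (root required).
SAMPLE = """\
=====[ partition info(sdbn). #0, RW, CP: Good]=====
Memory: 2583693 KB
  - static: 1571843 KB
  - cached: 1110 KB
  - paged : 1010740 KB
=====[ partition info(sdck). #1, RW, CP: Good]=====
Memory: 2056474 KB
  - static: 1571843 KB
  - cached: 18 KB
  - paged : 484612 KB
"""

stats = memory_by_partition(SAMPLE)
for dev, mem in stats.items():
    print(dev, mem)
total_kb = sum(mem["memory"] for mem in stats.values())
print(f"total: {total_kb / 1024 / 1024:.2f} GiB")
```

In this sample the static figure is identical for both disks, while the paged figure is roughly twice as large on the disk that holds data.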

wallentx commented 1 year ago

@jaegeuk I've been trying to troubleshoot this further, and the only thing I can determine is that the increased memory usage appears to be going to cache:

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 3  0      0   3287    531   8520    0    0   367   234    0    1  1  0 98  1  0

               total        used        free      shared  buff/cache   available
Mem:            31Gi        19Gi       3.2Gi       438Mi       8.8Gi        11Gi
Swap:             0B          0B          0B

sudo mount /dev/sdcq

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 4  1      0   1439    531   8846    0    0   367   234    0    1  1  0 98  1  0

               total        used        free      shared  buff/cache   available
Mem:            31Gi        20Gi       1.4Gi       438Mi       9.2Gi       9.7Gi
Swap:             0B          0B          0B
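
To make the before/after comparison explicit, here is a small sketch that parses the two vmstat rows above and prints the change in the free, buff and cache columns. It assumes the columns are in MiB (consistent with the `free` output, but the vmstat invocation is not shown), and the helper name is hypothetical.

```python
# Column names as printed in the vmstat header above.
FIELDS = "r b swpd free buff cache si so bi bo in cs us sy id wa st".split()


def parse_vmstat_row(row: str) -> dict:
    """Map one vmstat data row onto its column names."""
    return dict(zip(FIELDS, map(int, row.split())))


# Rows copied from the output before and after `sudo mount /dev/sdcq`.
before = parse_vmstat_row(
    " 3  0      0   3287    531   8520    0    0   367   234    0    1  1  0 98  1  0"
)
after = parse_vmstat_row(
    " 4  1      0   1439    531   8846    0    0   367   234    0    1  1  0 98  1  0"
)

# Change in each memory column (assumed MiB).
delta = {key: after[key] - before[key] for key in ("free", "buff", "cache")}

# Drop in free memory that does not show up as buff or cache growth.
unaccounted = -(delta["free"] + delta["buff"] + delta["cache"])

print(delta)
print(f"not visible in buff/cache: {unaccounted}")
```

In these two snapshots the cache column grows by less than the free column shrinks, so at least part of the difference would have to be accounted for somewhere other than buff/cache.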

These aren't the default mount options, but this is one of many things I've been trying, in order to see if I can narrow down a particular culprit:

grep sdcq /proc/mounts

  /dev/sdcq /mnt/jbd1/plots93
  f2fs
    rw
    lazytime
    noatime
    background_gc=off
    discard
    no_heap
    user_xattr
    inline_xattr
    acl
    inline_data
    inline_dentry
    flush_merge
    extent_cache
    mode=lfs
    active_logs=2
    alloc_mode=default
    checkpoint_merge
    fsync_mode=posix
    discard_unit=section
    memory=low
  0 0

With the above, only these additional processes are spawned when I mount the HM-SMR volume:

f2fs_ckpt
f2fs_flush
f2fs_discard

I also have a slabtop diff from before and after mounting: https://gist.github.com/wallentx/a5e98371f8cc2c42084b86bbd99253fc#file-slabtop-diff

vinibali commented 1 year ago

I'm seeing much the same with much smaller partitions, on a relatively small NAS with 512MB of RAM. In my case memory=low didn't really help; the 900GiB partition still needed about 100MB of memory. Is there a way to dig further and find out whether the memory requirement can be reduced?

sudo tail -4 /sys/kernel/debug/f2fs/status
Memory: 99637 KB
- static: 98421 KB
- cached: 79 KB
- paged : 1136 KB
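
For a rough comparison with the HM-SMR report earlier in this thread, the static figures can be normalized by volume size. This is only a back-of-the-envelope sketch: it assumes 2 MiB segments for the zoned disk (the F2FS default) and uses the full 900 GiB as the size of this partition, although its main area is somewhat smaller.

```python
# (static KB, approximate size in GiB) for the two reports in this thread.
reports = {
    # 6662144 segments * 2 MiB (assumed segment size), converted to GiB
    "HM-SMR disk": (1_571_843, 6_662_144 * 2 / 1024),
    # partition size used as-is; an approximation of the main area
    "900 GiB partition": (98_421, 900.0),
}

kb_per_gib = {
    name: static_kb / size_gib for name, (static_kb, size_gib) in reports.items()
}

for name, ratio in kb_per_gib.items():
    print(f"{name}: {ratio:.1f} KB static per GiB")
```

Both ratios land in the same ballpark, which would be consistent with the static memory growing roughly linearly with volume size, though two data points are not enough to be sure.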

side story: https://archlinuxarm.org/forum/viewtopic.php?f=57&t=16534&p=71416#p71416