maharmstone / btrfs

WinBtrfs - an open-source btrfs driver for Windows
GNU Lesser General Public License v3.0

Btrfs (1.8.1) disks are write-protected in Windows 11 22621.963 #546

Open Kenst1092 opened 1 year ago

Kenst1092 commented 1 year ago

The issue happened really suddenly while playing video games on Steam with my friends. I downloaded 150 GB worth of games over the course of two days, in two separate Windows sessions. Those downloads went through flawlessly. I ran and played those games, all without problems.

Today, upon booting into a new Windows session, games ran fine for the first few hours. Then, when my friends and I wanted to play a game that happens to be on the drive I had downloaded those large games onto, I got an error saying that the game could not write to disk because the disk is write-protected. Steam, however, seemed to launch fine when not started as admin.

Currently, I'm installing a live image of Linux to my USB drive so I can run btrfs check.

Mitigations I've tried (will update as I go along):
- Started Steam as admin: Steam refuses to start, saying the disk is write-protected.
- Windows SID was inconsistent with the registry mappings; I re-added the correct mappings after getting my Windows user SID and restarted: the issue still persists.

My Windows:
- Secure Boot ON (using registry hack to let WinBtrfs run)
- TPM 2.0 ON

Both are on because I also play Valorant, and its anticheat requires them to be enabled to be compliant on Windows 11.

EDIT: Details of what I was doing before the issue occurred:

Kenst1092 commented 1 year ago

Added some more diagnostics with the aid of someone fairly well-versed in Btrfs. I have two M.2 NVMe SSDs formatted as Btrfs, and I downloaded the games onto both of them.

Here are the btrfs checks from both of them. I'm troubleshooting on a live image of Ubuntu 22.10.

SSD 1:

sudo btrfs check /dev/nvme0n1p1
Opening filesystem to check...
Checking filesystem on /dev/nvme0n1p1
UUID: 17adbaca-03ee-47e4-962b-792fe66539fb
[1/7] checking root items
[2/7] checking extents
[3/7] checking free space tree
there is no free space entry for 95593422848-95593431040
cache appears valid but isn't 94519689216
there is no free space entry for 97243848704-97243852800
cache appears valid but isn't 96667172864
there is no free space entry for 108478304256-108478308352
cache appears valid but isn't 107404591104
there is no free space entry for 121357385728-121357389824
cache appears valid but isn't 120289492992
there is no free space entry for 169556287488-169556291584
cache appears valid but isn't 168607875072
free space info recorded 3 extents, counted 2
there is no free space entry for 183640219648-183640223744
cache appears valid but isn't 182566518784
free space info recorded 3 extents, counted 2
there is no free space entry for 187909091328-187909095424
cache appears valid but isn't 186861486080
free space info recorded 6 extents, counted 5
there is no free space entry for 191125733376-191125737472
cache appears valid but isn't 190082711552
free space info recorded 1347 extents, counted 1346
there is no free space entry for 237306470400-237306474496
cache appears valid but isn't 236253609984
free space info recorded 1474 extents, counted 1473
there is no free space entry for 241565446144-241565450240
cache appears valid but isn't 240548577280
[4/7] checking fs roots
[5/7] checking only csums items (without verifying data)
[6/7] checking root refs
[7/7] checking quota groups skipped (not enabled on this FS)
found 831689973760 bytes used, error(s) found
total csum bytes: 811463572
total tree bytes: 1990656000
total fs tree bytes: 717783040
total extent tree bytes: 307281920
btree space waste bytes: 329100677
file data blocks allocated: 2872304590848
 referenced 921941880832

SSD 2:

sudo btrfs check /dev/nvme1n1p1
Opening filesystem to check...
Checking filesystem on /dev/nvme1n1p1
UUID: 9e9ac476-e2a5-4455-b4ae-4e1bdc984242
[1/7] checking root items
[2/7] checking extents
ref mismatch on [262806605824 12288] extent item 4294967295, found 0
incorrect local backref count on 262806605824 root 5 owner 260896 offset 0 found 0 wanted 4294967295 back 0x558c0e64a2d0
backref disk bytenr does not match extent record, bytenr=262806605824, ref bytenr=0
backpointer mismatch on [262806605824 12288]
owner ref check failed [262806605824 12288]
ERROR: errors found in extent allocation tree or chunk allocation
[3/7] checking free space tree
wanted bytes 8192, found 4096 for off 5750611968
cache appears valid but isn't 5399117824
there is no free space entry for 9669193728-9669197824
cache appears valid but isn't 8620343296
there is no free space entry for 32242655232-32242663424
cache appears valid but isn't 31168921600
there is no free space entry for 35463847936-35463852032
cache appears valid but isn't 34390147072
there is no free space entry for 50494169088-50494173184
cache appears valid but isn't 49422532608
wanted bytes 12288, found 4096 for off 119170002944
cache appears valid but isn't 118142009344
free space info recorded 128 extents, counted 127
there is no free space entry for 259875926016-259875930112
cache appears valid but isn't 258802188288
There are still entries left in the space cache
cache appears valid but isn't 262023413760
free space info recorded 9 extents, counted 8
there is no free space entry for 271134842880-271134846976
cache appears valid but isn't 270613348352
there is no free space entry for 280277016576-280277024768
cache appears valid but isn't 279203282944
free space info recorded 1 extents, counted 0
there is no free space entry for 324298514432-324298530816
cache appears valid but isn't 323226697728
wanted bytes 8192, found 4096 for off 327521656832
cache appears valid but isn't 326447923200
free space info recorded 622 extents, counted 621
there is no free space entry for 343611129856-343611133952
cache appears valid but isn't 342554050560
free space info recorded 627 extents, counted 626
there is no free space entry for 344694910976-344694915072
cache appears valid but isn't 343627792384
free space info recorded 924 extents, counted 923
there is no free space entry for 362954711040-362954719232
cache appears valid but isn't 361881403392
there is no free space entry for 367250096128-367250112512
cache appears valid but isn't 366176370688
there is no free space entry for 370470895616-370470899712
cache appears valid but isn't 369397596160
there is no free space entry for 375840014336-375840047104
cache appears valid but isn't 374766305280
there is no free space entry for 377987223552-377987227648
cache appears valid but isn't 376913788928
there is no free space entry for 382282493952-382282498048
cache appears valid but isn't 381208756224
there is no free space entry for 384429957120-384429981696
cache appears valid but isn't 383356239872
there is no free space entry for 387625549824-387625553920
cache appears valid but isn't 386577465344
there is no free space entry for 398388604928-398388625408
cache appears valid but isn't 397314883584
there is no free space entry for 406978519040-406978560000
cache appears valid but isn't 405904818176
there is no free space entry for 410199756800-410199785472
cache appears valid but isn't 409126043648
there is no free space entry for 414494720000-414494752768
cache appears valid but isn't 413421010944
there is no free space entry for 417680494592-417680506880
cache appears valid but isn't 416642236416
there is no free space entry for 423084662784-423084687360
cache appears valid but isn't 422010945536
there is no free space entry for 585218703360-585218711552
cache appears valid but isn't 584145960960
[4/7] checking fs roots
root 5 inode 260896 errors 400, nbytes wrong
ERROR: errors found in fs roots
found 717732524032 bytes used, error(s) found
total csum bytes: 699188632
total tree bytes: 1763147776
total fs tree bytes: 639090688
total extent tree bytes: 287096832
btree space waste bytes: 294812568
file data blocks allocated: 806604296192
 referenced 815071899648

Kenst1092 commented 1 year ago

I seem to have fixed the issue, with the help of the same person I mentioned in my previous post. However, I will leave the issue open nonetheless.

I ran btrfs check --repair on both drives, followed by scrubs.

SSD 1:

ubuntu@ubuntu:~$ sudo btrfs check --repair --progress /dev/nvme0n1p1
enabling repair mode
WARNING:

    Do not use --repair unless you are advised to do so by a developer
    or an experienced user, and then only after having accepted that no
    fsck can successfully repair all types of filesystem corruption. Eg.
    some software or hardware bugs can fatally damage a volume.
    The operation will start in 10 seconds.
    Use Ctrl-C to stop it.
10 9 8 7 6 5 4 3 2 1
Starting repair.
Opening filesystem to check...
Checking filesystem on /dev/nvme0n1p1
UUID: 17adbaca-03ee-47e4-962b-792fe66539fb
[1/7] checking root items                      (0:00:02 elapsed, 3295443 items checked)
Fixed 0 roots.
super bytes used 832930447360 mismatches actual used 832930283520
No device size related problem found           (0:00:13 elapsed, 243031 items checked)
[2/7] checking extents                         (0:00:13 elapsed, 243031 items checked)
there is no free space entry for 95593422848-95593431040
cache appears valid but isn't 94519689216
there is no free space entry for 97243848704-97243852800
cache appears valid but isn't 96667172864
there is no free space entry for 108478304256-108478308352
cache appears valid but isn't 107404591104
there is no free space entry for 121357385728-121357389824
cache appears valid but isn't 120289492992
there is no free space entry for 169556287488-169556291584
cache appears valid but isn't 168607875072
free space info recorded 3 extents, counted 2
there is no free space entry for 183640219648-183640223744
cache appears valid but isn't 182566518784
free space info recorded 3 extents, counted 2
there is no free space entry for 187909091328-187909095424
cache appears valid but isn't 186861486080
free space info recorded 6 extents, counted 5
there is no free space entry for 191125733376-191125737472
cache appears valid but isn't 190082711552
free space info recorded 1347 extents, counted 1346
there is no free space entry for 237306470400-237306474496
cache appears valid but isn't 236253609984
free space info recorded 1474 extents, counted 1473
there is no free space entry for 241565446144-241565450240
cache appears valid but isn't 240548577280
Clear free space cache v2
free space cache v2 cleared
[3/7] checking free space tree                 (0:00:10 elapsed, 781 items checked)
[4/7] checking fs roots                        (0:00:34 elapsed, 43687 items checked)
[5/7] checking csums (without verifying data)  (0:00:03 elapsed, 414237 items checked)
[6/7] checking root refs                       (0:00:00 elapsed, 3 items checked)
[7/7] checking quota groups skipped (not enabled on this FS)
found 1663285059584 bytes used, no error found
total csum bytes: 1622927144
total tree bytes: 3981524992
total fs tree bytes: 1435566080
total extent tree bytes: 614662144
btree space waste bytes: 658342718
file data blocks allocated: 5744609181696
 referenced 1843883761664

Then I mounted the drive at /mnt and ran a scrub:

ubuntu@ubuntu:~$ sudo mount -o defaults,noatime,clear_cache,space_cache=v2 /dev/nvme0n1p1 /mnt
ubuntu@ubuntu:~$ sudo btrfs scrub start -BdR /mnt

Scrub device /dev/nvme0n1p1 (id 1) done
Scrub started:    Wed Dec 28 20:31:31 2022
Status:           finished
Duration:         0:04:26
    data_extents_scrubbed: 13767857
    tree_extents_scrubbed: 242610
    data_bytes_scrubbed: 830939627520
    tree_bytes_scrubbed: 3974922240
    read_errors: 0
    csum_errors: 0
    verify_errors: 0
    no_csum: 227
    csum_discards: 202865893
    super_errors: 0
    malloc_errors: 0
    uncorrectable_errors: 0
    unverified_errors: 0
    corrected_errors: 0
    last_physical: 910571864064

Did the same with SSD 2:

ubuntu@ubuntu:~$ sudo btrfs check --repair --progress /dev/nvme1n1p1
enabling repair mode
WARNING:

    Do not use --repair unless you are advised to do so by a developer
    or an experienced user, and then only after having accepted that no
    fsck can successfully repair all types of filesystem corruption. Eg.
    some software or hardware bugs can fatally damage a volume.
    The operation will start in 10 seconds.
    Use Ctrl-C to stop it.
10 9 8 7 6 5 4 3 2 1
Starting repair.
Opening filesystem to check...
Checking filesystem on /dev/nvme1n1p1
UUID: 9e9ac476-e2a5-4455-b4ae-4e1bdc984242
[1/7] checking root items                      (0:00:02 elapsed, 3153531 items checked)
Fixed 0 roots.
ref mismatch on [262806605824 12288] extent item 4294967295, found 0
incorrect local backref count on 262806605824 root 5 owner 260896 offset 0 found 0 wanted 4294967295 back 0x55a1821d3f10
backref disk bytenr does not match extent record, bytenr=262806605824, ref bytenr=0
backpointer mismatch on [262806605824 12288]
owner ref check failed [262806605824 12288]
repair deleting extent record: key [262806605824,168,12288]
Repaired extent references for 262806605824
block group [262023413760 1073741824] used 1035018240 but extent items used 1035030528
super bytes used 717732560896 mismatches actual used 717732524032
super bytes used 717732610048 mismatches actual used 717732593664
super bytes used 717732626432 mismatches actual used 717732642816
No device size related problem found           (0:00:25 elapsed, 430513 items checked)
[2/7] checking extents                         (0:00:26 elapsed, 430513 items checked)
wanted bytes 8192, found 4096 for off 5750611968
cache appears valid but isn't 5399117824
there is no free space entry for 9669193728-9669197824
cache appears valid but isn't 8620343296
there is no free space entry for 32242655232-32242663424
cache appears valid but isn't 31168921600
there is no free space entry for 35463847936-35463852032
cache appears valid but isn't 34390147072
there is no free space entry for 50494169088-50494173184
cache appears valid but isn't 49422532608
wanted bytes 12288, found 4096 for off 119170002944
cache appears valid but isn't 118142009344
free space info recorded 128 extents, counted 127
there is no free space entry for 259875926016-259875930112
cache appears valid but isn't 258802188288
free space info recorded 9 extents, counted 8
there is no free space entry for 271134842880-271134846976
cache appears valid but isn't 270613348352
there is no free space entry for 280277016576-280277024768
cache appears valid but isn't 279203282944
free space info recorded 1 extents, counted 0
there is no free space entry for 324298514432-324298530816
cache appears valid but isn't 323226697728
wanted bytes 8192, found 4096 for off 327521656832
cache appears valid but isn't 326447923200
free space info recorded 622 extents, counted 621
there is no free space entry for 343611129856-343611133952
cache appears valid but isn't 342554050560
free space info recorded 627 extents, counted 626
there is no free space entry for 344694910976-344694915072
cache appears valid but isn't 343627792384
free space info recorded 924 extents, counted 923
there is no free space entry for 362954711040-362954719232
cache appears valid but isn't 361881403392
there is no free space entry for 367250096128-367250112512
cache appears valid but isn't 366176370688
there is no free space entry for 370470895616-370470899712
cache appears valid but isn't 369397596160
there is no free space entry for 375840014336-375840047104
cache appears valid but isn't 374766305280
there is no free space entry for 377987223552-377987227648
cache appears valid but isn't 376913788928
there is no free space entry for 382282493952-382282498048
cache appears valid but isn't 381208756224
there is no free space entry for 384429957120-384429981696
cache appears valid but isn't 383356239872
there is no free space entry for 387625549824-387625553920
cache appears valid but isn't 386577465344
there is no free space entry for 398388604928-398388625408
cache appears valid but isn't 397314883584
there is no free space entry for 406978519040-406978560000
cache appears valid but isn't 405904818176
there is no free space entry for 410199756800-410199785472
cache appears valid but isn't 409126043648
there is no free space entry for 414494720000-414494752768
cache appears valid but isn't 413421010944
there is no free space entry for 417680494592-417680506880
cache appears valid but isn't 416642236416
there is no free space entry for 423084662784-423084687360
cache appears valid but isn't 422010945536
there is no free space entry for 585218703360-585218711552
cache appears valid but isn't 584145960960
Clear free space cache v2
free space cache v2 cleared
[3/7] checking free space tree                 (0:00:14 elapsed, 674 items checked)
reset nbytes for ino 260896 root 5             (0:00:34 elapsed, 38466 items checked)
[4/7] checking fs roots                        (0:01:09 elapsed, 77783 items checked)
[5/7] checking csums (without verifying data)  (0:00:03 elapsed, 425734 items checked)
[6/7] checking root refs                       (0:00:00 elapsed, 3 items checked)
[7/7] checking quota groups skipped (not enabled on this FS)
found 2870930403328 bytes used, no error found
total csum bytes: 2796754528
total tree bytes: 7052935168
total fs tree bytes: 2556362752
total extent tree bytes: 1148583936
btree space waste bytes: 1179452713
file data blocks allocated: 3226417184768
 referenced 3260287598592

Scrub

ubuntu@ubuntu:~$ sudo mount -o defaults,noatime,clear_cache,space_cache=v2 /dev/nvme1n1p1 /mnt
ubuntu@ubuntu:~$ sudo btrfs scrub start -BdR /mnt

Scrub device /dev/nvme1n1p1 (id 1) done
Scrub started:    Wed Dec 28 20:17:54 2022
Status:           finished
Duration:         0:06:25
    data_extents_scrubbed: 11930203
    tree_extents_scrubbed: 214776
    data_bytes_scrubbed: 715969363968
    tree_bytes_scrubbed: 3518889984
    read_errors: 0
    csum_errors: 0
    verify_errors: 0
    no_csum: 50
    csum_discards: 174797158
    super_errors: 0
    malloc_errors: 0
    uncorrectable_errors: 0
    unverified_errors: 0
    corrected_errors: 0
    last_physical: 746289364992

I rebooted into Windows and tried to run a game that had complained of its medium being read-only. The game ran perfectly fine. I then verified the integrity of the game files, and all files validated without issue.

I have edited my original post to detail what I did before the issue started appearing, to help anyone interested in reproducing the problem.

lesderid commented 1 year ago

I've been able to reproduce this on 1.8.1 on a clean virtual disk (64GiB) in a WinPE (1809, 10.0.17763.1) VM, formatted with mkbtrfs on Windows (and with Compress=1 and CompressType=3) by copying large files (24GiB, compressible down to 16GiB) back and forth.

Specifically I created such a file on NTFS, copied it to the btrfs disk (copy C:\24G D:\1), and then looped the following:

copy D:\1 D:\2
del D:\1
copy D:\2 D:\1
del D:\2
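
For anyone who wants to script the loop above rather than type it by hand, here is a minimal Python sketch of the same four steps. The function name `churn` and the iteration count are illustrative assumptions, not something from the original reproduction, which does not say how many rounds were needed.

```python
import shutil
from pathlib import Path


def churn(first: Path, second: Path, iterations: int) -> None:
    """Copy a file back and forth between two paths, deleting the source each time."""
    for _ in range(iterations):
        shutil.copyfile(first, second)  # copy D:\1 D:\2
        first.unlink()                  # del D:\1
        shutil.copyfile(second, first)  # copy D:\2 D:\1
        second.unlink()                 # del D:\2
```

Calling `churn(Path(r"D:\1"), Path(r"D:\2"), iterations=100)` would mirror the commands above; the value 100 is an arbitrary placeholder.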

lesderid commented 1 year ago

Last log entries (OCRed, didn't have something set up to get logs out of the VM):

FFFF8687BDAEC080:fsctl_request: unknown control code 94264 (DeviceType = 9, Access = 1, Function = 99, Method = 0)
FFFF8687BDAEC080:fsctl_request: unknown control code 94264 (DeviceType = 9, Access = 1, Function = 99, Method = 0)
FFFF8687BDAEC080:fsctl_request: unknown control code 94264 (DeviceType = 9, Access = 1, Function = 99, Method = 0)
FFFF8687BDAEC080:fsctl_request: unknown control code 94264 (DeviceType = 9, Access = 1, Function = 99, Method = 0)
FFFF8687BDAEC080:fsctl_request: unknown control code 94264 (DeviceType = 9, Access = 1, Function = 99, Method = 0)
FFFF8687C095C740:add_checksum_entry: find_item returned c000009a
FFFF8687BDAEC080:fsctl_request: unknown control code 94264 (DeviceType = 9, Access = 1, Function = 99, Method = 0)
FFFF8687C095C740:find_item:find_item_in_tree returned c000009a
FFFF8687C095C740:add_checksum_entry: find_item returned c000009a
FFFF8687C095C740:do_flush:do_write returned c000009a
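
As a sanity check on the `fsctl_request` lines, the control code can be split into its fields with the standard Windows `CTL_CODE` bit layout (device type in bits 16-31, access in bits 14-15, function in bits 2-13, method in bits 0-1). This sketch assumes the driver logs all of these values in hexadecimal, which the numbers are consistent with; `decode_ctl_code` is a hypothetical helper name.

```python
def decode_ctl_code(code: int) -> dict:
    """Split a Windows I/O control code into its CTL_CODE fields."""
    return {
        "device_type": (code >> 16) & 0xFFFF,  # bits 16-31
        "access": (code >> 14) & 0x3,          # bits 14-15
        "function": (code >> 2) & 0xFFF,       # bits 2-13
        "method": code & 0x3,                  # bits 0-1
    }


# Assumption: "94264" in the log is a hexadecimal value.
fields = decode_ctl_code(0x94264)
print({name: f"{value:x}" for name, value in fields.items()})
# {'device_type': '9', 'access': '1', 'function': '99', 'method': '0'}
```

The decoded fields match the `DeviceType = 9, Access = 1, Function = 99, Method = 0` values printed in the log. Device type 9 is `FILE_DEVICE_FILE_SYSTEM`, and the full code appears to match `FSCTL_OFFLOAD_READ`, which would fit a file copy, though that identification is not confirmed anywhere in this thread.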

Edit: c000009a seems to suggest it ran out of memory? I'm pretty sure this was caused by WinBtrfs as the copy was the only thing running in the VM.
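
For reference, an NTSTATUS value such as c000009a can be broken down using the documented layout: two severity bits, a customer bit, a reserved bit, a 12-bit facility, and a 16-bit code. The helper name `decode_ntstatus` and the one-entry lookup table are assumptions made for this sketch.

```python
# One-entry lookup table for this sketch; the name for this value is the one given in this thread.
KNOWN_STATUS = {0xC000009A: "STATUS_INSUFFICIENT_RESOURCES"}


def decode_ntstatus(status: int) -> dict:
    """Split a 32-bit NTSTATUS value into its bit fields."""
    return {
        "severity": (status >> 30) & 0x3,   # 3 means error
        "customer": (status >> 29) & 0x1,   # 0 means Microsoft-defined
        "facility": (status >> 16) & 0xFFF,
        "code": status & 0xFFFF,
        "name": KNOWN_STATUS.get(status, "unknown"),
    }


decoded = decode_ntstatus(0xC000009A)
print(decoded)
```

The severity of 3 marks this as an error status rather than a warning or informational code.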

owovin commented 1 year ago

I think this is similar to #468 and #553, and it seems like it's caused by running out of disk space while a transaction is happening (such as compressing, transferring, etc.). To me it looks like the size reported by Windows is not the same as btrfs's "working size" (which seems to be smaller, thus causing read-only at ~700-900 MB left instead of 0 MB), which causes the conflict. The best approach for now is to make sure to have at least 5 GB of headroom at all times, because this is a complex problem.

maharmstone commented 1 year ago

I think this is similar to #468 and #553, and it seems like it's caused by running out of disk space while a transaction is happening (such as compressing, transferring, etc.).

As @lesderid points out, the error message is STATUS_INSUFFICIENT_RESOURCES - "resources" here being RAM rather than disk space. My hunch is that something isn't being rolled back cleanly when a function unexpectedly fails due to lack of RAM.

The best approach for now is to make sure to have at least 5 GB of headroom at all times, because this is a complex problem.

I'd advise this in any case, on both Windows and Linux.
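
A rough way to keep an eye on that headroom is a small check against the free space the operating system reports. This is only a sketch: `has_headroom` and the 5 GB constant are illustrative, and, as noted earlier in the thread, the reported free space may be larger than what btrfs can actually allocate, so treat it as a coarse guard.

```python
import shutil

# Assumption: "5gb" is read as 5 * 1000**3 bytes.
HEADROOM_BYTES = 5 * 1000**3


def has_headroom(path: str, minimum_free: int = HEADROOM_BYTES) -> bool:
    """Return True if the volume containing `path` reports at least `minimum_free` bytes free."""
    return shutil.disk_usage(path).free >= minimum_free
```

For example, `has_headroom("D:\\")` could be checked before starting a large download or copy.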

vukitoso commented 9 months ago

Is the problem solved now?