dorimanx / exfat-nofuse

Android ARM Linux non-fuse read/write kernel driver for exFat and VFat Android file systems
GNU General Public License v2.0

readdir Memory issues, missing files/directories, kernel trace (4.2.0-42 ubuntu 14.04.1) #87

Open mandree opened 8 years ago

mandree commented 8 years ago

Greetings,

I have a 64 GB SDXC card (SanDisk) here that was written to with a Sony camera, Linux, and Windows. exfatfsck complains about 0-clusters in a certain directory, but Windows 7 SP1's "chkdsk I: /F" does not find anything worthy of repair or complaint.

Diffing the output of Cygwin find under Windows 7 against a find run with the exfat-nofuse kernel module yields several differences; the relevant ones are in the diff further down.

I am getting the kernel log shown below from "find". Unmounting the file system is not possible: umount hangs and cannot be killed with SIGKILL. Trying to "rmmod -f exfat" is also refused.

exfat-fuse (a different project) behaves in a similar way, but it does not show a broken file.

Diff between Cygwin find under Windows 7 and Linux find, only relevant parts shown:

 ./DCIM/12760612/DSC02480.ARW
 ./DCIM/12760612/DSC02480.JPG
-./DCIM/12760612/DSC02481.ARW
+./DCIM/12760612/DSC02481.A
 ./DCIM/12760612/DSC02481.JPG
-./DCIM/12760612/DSC02482.ARW
-./DCIM/12760612/DSC02482.JPG
-./DCIM/12760612/DSC02483.ARW
-./DCIM/12760612/DSC02483.JPG
-./DCIM/12760612/DSC02484.ARW
...
-./DCIM/12760612/DSC02566.ARW
-./DCIM/12760612/DSC02566.JPG
-./DCIM/12960617
-./DCIM/12960617/DSC02578.ARW
-./DCIM/12960617/DSC02578.JPG
...
 ./DCIM/13360622/DSC02721.JPG
-./DCIM/13460623
 ./DCIM/13560709
...
-./DCIM/13960716/DSC03078.ARW
-./DCIM/13960716/DSC03078.JPG
-./MP_ROOT
-./MP_ROOT/100ANV01
-./MP_ROOT/101ANV01
...
 ./PRIVATE
 ./PRIVATE/AVCHD
 ./PRIVATE/AVCHD/BDMV

kern.log:

divide error: 0000 [#1] SMP 
Modules linked in: [...elided...]
CPU: 1 PID: 8638 Comm: find Tainted: G           OE   4.2.0-42-generic #49~14.04.1-Ubuntu
Hardware name: [...elided...]
task: ffff8800a463cb00 ti: ffff8802c7f94000 task.ti: ffff8802c7f94000
RIP: 0010:[<ffffffffc0a8ad37>]  [<ffffffffc0a8ad37>] ffsReadDir+0x117/0x5c0 [exfat]
RSP: 0018:ffff8802c7f97808  EFLAGS: 00010246
RAX: 0000000000000001 RBX: ffff8802f1738000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff8802c7f97b00 RDI: ffff880275240380
RBP: ffff8802c7f97a88 R08: 0000000000000000 R09: 0000000000000000
R10: 00007efe2ad167b8 R11: 0000000000000000 R12: ffff8802c7f97b00
R13: ffff8802c9050000 R14: 0000000000000000 R15: ffff88029d385400
FS:  00007efe2b1e2740(0000) GS:ffff88031fc40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f69c6ba100f CR3: 000000029d5e0000 CR4: 00000000000006e0
Stack:
 ffffffffc9054200 ffff880275240380 ffff880200000000 0000000100000000
 ffffffffc0a89e6b 0000000000000000 00000000c7f97801 0000000100000000
 00000000000001ff ffff8802c7f97914 ffff8802b97508a0 ffff8802c9050000
Call Trace:
 [<ffffffffc0a89e6b>] ? get_entry_set_in_dir+0xfb/0x2f0 [exfat]
 [<ffffffffc0a89eb8>] ? get_entry_set_in_dir+0x148/0x2f0 [exfat]
 [<ffffffffc0a8c054>] ? ffsLookupFile+0x1c4/0x290 [exfat]
 [<ffffffff81200000>] ? SyS_fcntl+0x1a0/0x5e0
 [<ffffffff81207834>] ? inode_init_once+0xc4/0x120
 [<ffffffffc0a90f35>] ? init_once+0x25/0x30 [exfat]
 [<ffffffff811d0512>] ? new_slab+0x382/0x450
 [<ffffffffc0a90a4f>] ? exfat_alloc_inode+0x1f/0x50 [exfat]
 [<ffffffff810c08ce>] ? down+0x2e/0x50
 [<ffffffffc0a93a37>] FsReadDir+0x47/0x60 [exfat]
 [<ffffffffc0a915aa>] exfat_readdir+0x13a/0x3e0 [exfat]
 [<ffffffff812015d0>] ? fillonedir+0xd0/0xd0
 [<ffffffff81204ca5>] ? __d_instantiate+0x95/0xf0
 [<ffffffff8120408c>] ? d_rehash+0x4c/0x60
 [<ffffffff812059ac>] ? d_splice_alias+0xcc/0x2a0
 [<ffffffffc0a91cf2>] ? exfat_lookup+0x72/0x1e0 [exfat]
 [<ffffffff81311ed3>] ? security_file_open+0x93/0xa0
 [<ffffffff811ec0bb>] ? do_dentry_open+0x28b/0x320
 [<ffffffff8120da44>] ? mntput+0x24/0x40
 [<ffffffff811f833e>] ? terminate_walk+0x6e/0xe0
 [<ffffffff811fc2c5>] ? path_openat+0x645/0x1330
 [<ffffffff811fd12f>] ? putname+0x5f/0x70
 [<ffffffff811fe08e>] ? do_filp_open+0x8e/0xd0
 [<ffffffff8120147a>] iterate_dir+0x9a/0x120
 [<ffffffff812018e1>] SyS_getdents+0x81/0xe0
 [<ffffffff812015d0>] ? fillonedir+0xd0/0xd0
 [<ffffffff817c36f2>] entry_SYSCALL_64_fastpath+0x16/0x75
Code: b5 8c 00 00 00 45 85 f6 0f 85 ae 04 00 00 44 8b 9d a8 fd ff ff 45 85 db 0f 85 d2 03 00 00 8b 85 9c fd ff ff 8b 8d 90 fd ff ff 99 <f7> f9 39 ca 41 89 d7 0f 8d 84 04 00 00 48 8d 8d a4 fd ff ff 48 
RIP  [<ffffffffc0a8ad37>] ffsReadDir+0x117/0x5c0 [exfat]
 RSP <ffff8802c7f97808>
---[ end trace 02838811ee369c88 ]---
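
A note on reading the trace: the faulting bytes in the Code line, `99 <f7> f9`, decode to `cdq; idiv %ecx`, and RCX is 0 in the register dump. So this looks like a signed 32-bit division (or modulo) by zero inside ffsReadDir. The trace alone does not say which variable was zero. Below is a minimal sketch of the kind of guard that would turn such a case into an I/O error instead of an oops. The function and parameter names are hypothetical and chosen for illustration only; they are not taken from the driver source.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the driver): compute the position of
 * directory entry number `dentry` inside its cluster.
 * Returns 0 on success and stores the position in *offset.
 * Returns -EIO, leaving *offset untouched, if the inputs are unusable.
 */
static int dentry_offset_in_cluster(int32_t dentry, int32_t dentries_per_clu,
                                    int32_t *offset)
{
    /* Refuse instead of executing a division by zero. */
    if (dentry < 0 || dentries_per_clu <= 0)
        return -EIO;

    *offset = dentry % dentries_per_clu;
    return 0;
}
```

With a guard like this, a directory whose geometry comes out as zero would produce a read error for that one directory rather than killing the calling process while it holds the file system lock.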

mandree commented 8 years ago

Note that exfat-fuse (see reference above) sees a few more directories than exfat-nofuse, diff -u between sorted "find" results:

--- filelist-exfat-nofuse.txt   2016-07-18 20:32:20.900120100 +0200
+++ filelist-exfat-fuse.txt 2016-07-18 22:53:32.343360700 +0200
@@ -2311,8 +2311,108 @@
 ./DCIM/12760612/DSC02479.JPG
 ./DCIM/12760612/DSC02480.ARW
 ./DCIM/12760612/DSC02480.JPG
-./DCIM/12760612/DSC02481.A
+./DCIM/12760612/DSC02481.ARW
 ./DCIM/12760612/DSC02481.JPG
+./DCIM/12760612/DSC02482.ARW
[...]
+./DCIM/12760612/DSC02566.ARW
+./DCIM/12760612/DSC02566.JPG
 ./DCIM/13160619
 ./DCIM/13160619/DSC02605.ARW
 ./DCIM/13160619/DSC02605.ARW.xmp
@@ -2410,6 +2510,7 @@
 ./DCIM/13360622/DSC02717.JPG
 ./DCIM/13360622/DSC02719.JPG
 ./DCIM/13360622/DSC02721.JPG
+./DCIM/13460623
 ./DCIM/13560709
 ./DCIM/13560709/DSC02725.ARW
 ./DCIM/13560709/DSC02725.JPG

dorimanx commented 8 years ago

This is possibly caused by code changes that are still missing for full Linux 4.x.x support. If any dev with extended knowledge of the filesystem changes between 3.x.x and 4.x.x would like to help with debugging and fixes, I will be happy to merge the changes. Otherwise, consider this driver not fully stable with the latest Linux kernel builds.

mandree commented 8 years ago

I think there are two issues here that I could have reported separately. One concerns the interpretation of the file system contents (which chkdsk and Windows 7 appear to find fair enough, but which may be corrupted enough to trip up Linux). The other concerns the module interface, which prevents the forced kill of umount and the forced module unload.

Please also see https://github.com/relan/exfat/issues/41 and https://github.com/relan/exfat/issues/42, which stem from the same file system and concern its interpretation.

mandree commented 7 years ago

So, exfat-fuse v1.2.5 has fixed these issues. How about the kernel driver?

dorimanx commented 7 years ago

This is the fix made in the exfat-fuse binary: https://github.com/relan/exfat/commit/575ba4bca69096d5c795f466f2fdd4600a36fd4f Can anyone help adapt this check to the kernel driver? The idea is to check that a directory has a valid start cluster, and that empty files don't have any clusters.
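
To make the idea concrete, here is a rough user-space sketch of the two checks described above. It is not the code from the exfat-fuse commit and not driver code. The struct, the function names, and the `cluster_count` parameter are hypothetical. The only outside fact it relies on is that exFAT reserves cluster indices 0 and 1, so the first usable cluster is 2.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of one parsed directory entry set. */
struct entry_info {
    bool     is_dir;         /* entry describes a directory */
    uint32_t start_cluster;  /* first cluster recorded in the entry */
    uint64_t size;           /* data length in bytes */
};

/* exFAT cluster indices 0 and 1 are reserved; data clusters start at 2. */
#define FIRST_DATA_CLUSTER 2u

/* True if `clu` lies in the range 2 .. cluster_count + 1. */
static bool cluster_in_heap(uint32_t clu, uint32_t cluster_count)
{
    return clu >= FIRST_DATA_CLUSTER &&
           clu - FIRST_DATA_CLUSTER < cluster_count;
}

/*
 * Sanity check along the lines described in the comment above:
 *  - a directory must have a valid start cluster;
 *  - an empty file must not have any cluster;
 *  - a non-empty file must have a valid start cluster (an extra check
 *    added here for symmetry, an assumption of this sketch).
 */
static bool entry_is_sane(const struct entry_info *e, uint32_t cluster_count)
{
    if (e->is_dir)
        return cluster_in_heap(e->start_cluster, cluster_count);
    if (e->size == 0)
        return e->start_cluster == 0;
    return cluster_in_heap(e->start_cluster, cluster_count);
}
```

In the driver, a check like this would presumably run when an entry is looked up or before a directory is read, and a failing entry would be reported as an I/O error instead of being walked. Where exactly to hook it in is for someone who knows the code paths to decide.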