openzfsonosx / zfs

OpenZFS on OS X
https://openzfsonosx.org/

Copying files with long filenames from HFS+ results in errors #300

Open CharlesJS opened 9 years ago

CharlesJS commented 9 years ago

Currently, ZFS has an issue dealing with certain filenames from HFS+. Basically, HFS+ supports filenames of up to 255 UTF-16 code units, whereas ZFS supports filenames of up to 255 bytes in the UTF-8 encoding. This makes it rather easy to create filenames that are valid in HFS+ but too long to fit in ZFS. If any such file is on an HFS+ disk that one is trying to back up via rsync to a ZFS volume, that file will fail to copy, and the backup will thus be incomplete.
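To make the mismatch concrete, here is a minimal Python sketch that measures the same name both ways. The two limits are assumptions based on the commonly documented values (255 UTF-16 code units for HFS+, 255 UTF-8 bytes for ZFS), and the helper names are made up for illustration; none of this is taken from the ZFS or HFS+ sources.

```python
# Assumed limits (illustrative, not read from any filesystem source):
HFS_MAX_UTF16_UNITS = 255
ZFS_MAX_UTF8_BYTES = 255


def utf16_units(name: str) -> int:
    """Number of UTF-16 code units needed to store the name."""
    return len(name.encode("utf-16-le")) // 2


def utf8_bytes(name: str) -> int:
    """Number of bytes needed to store the name in UTF-8."""
    return len(name.encode("utf-8"))


def fits_hfs(name: str) -> bool:
    return utf16_units(name) <= HFS_MAX_UTF16_UNITS


def fits_zfs(name: str) -> bool:
    return utf8_bytes(name) <= ZFS_MAX_UTF8_BYTES


ascii_name = "a" * 200   # 200 UTF-16 units, 200 UTF-8 bytes
cjk_name = "漢" * 200    # 200 UTF-16 units, 600 UTF-8 bytes

for name in (ascii_name, cjk_name):
    print(utf16_units(name), utf8_bytes(name), fits_hfs(name), fits_zfs(name))
```

Both names are the same length as far as HFS+ is concerned, but only the ASCII one fits under the byte limit.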

I propose that, instead of failing to copy the file altogether, ZFS could simply shorten the filename when copying it to the ZFS volume, storing the original filename in an extended attribute. If the filename is later changed on the ZFS volume, the extended attribute could be cleared; otherwise, the original filename could be restored from the extended attribute when copying the file back to an HFS+ volume.

ilovezfs commented 8 years ago

@lundman do we know if this is an (accidentally and artificially) self-imposed restriction, or part of ZFS design itself on illumos, etc.?

lundman commented 8 years ago

It's just how it works: they both handle 255-char names, but the size of a char is different, so HFS potentially has more room because of how UTF-8 does multibyte encoding. We can truncate as suggested (with a unique suffix?), stuff the remainder of the name someplace else (SA? xattr?), and potentially, with some more work, handle it transparently in vnop_lookup and vnop_readdir. But that last step changes quite a bit.

Truncating a UTF-8 string also needs a bunch of logic to pick the right place.
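As a rough illustration of the "pick the right place" logic, the Python sketch below cuts a name at a code point boundary and appends a short hash so that two long names sharing a prefix do not collide. The function names, the `~` separator, and the 8-character SHA-1 tag are all hypothetical choices for this example, not anything proposed in the thread or present in ZFS. It also ignores combining sequences and file extensions, which a real implementation would probably want to preserve.

```python
import hashlib


def truncate_utf8(name: str, max_bytes: int) -> str:
    """Longest prefix of name whose UTF-8 encoding fits in max_bytes,
    without cutting a multibyte sequence in half."""
    raw = name.encode("utf-8")
    if len(raw) <= max_bytes:
        return name
    cut = max_bytes
    # Continuation bytes have the form 0b10xxxxxx. If the first excluded
    # byte is one, the cut lands inside a sequence, so back up to its start.
    while cut > 0 and (raw[cut] & 0xC0) == 0x80:
        cut -= 1
    return raw[:cut].decode("utf-8")


def shorten_unique(name: str, max_bytes: int = 255) -> str:
    """Hypothetical shortening scheme: truncated prefix plus a hash tag."""
    raw = name.encode("utf-8")
    if len(raw) <= max_bytes:
        return name
    tag = "~" + hashlib.sha1(raw).hexdigest()[:8]  # 9 ASCII bytes
    return truncate_utf8(name, max_bytes - len(tag)) + tag


print(truncate_utf8("ééé", 5))
print(len(shorten_unique("漢" * 200).encode("utf-8")))
```

The original name would still have to be stored somewhere (an SA or xattr, as mentioned above) for the mapping to be reversible.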

ilovezfs commented 8 years ago

Is there no upstream plan to support UTF-16?

lundman commented 8 years ago

UTF16 was the first, inefficient, encoding system before the cool new UTF8 came out.. that'd be a step backwards :)

ilovezfs commented 8 years ago

So then we can expect Apple to switch over to UTF-8 soon, right?

Truncation sounds like a bad idea. It will break rsync, etc. It seems that either erroring out or handling transparently are the only non-hacky answers, but even handling transparently would still break rsync verification if the other system isn't OS X.

Perhaps users who absolutely need file names in the range ZFS max < names <= HFS max should just be directed to use HFS+ ZVOLs.

Or we could offer up a super-long-name-handling feature to upstream.

ilovezfs commented 8 years ago

I did more research, and it looks like we'll have to accommodate this because Apple does when you copy files from HFS+->FAT. So that is the expected behavior. Luckily, their FAT file system code is open source, so we can see how this has been done before in practice and possibly re-use some of the code. http://opensource.apple.com/source/msdosfs/msdosfs-209.20.1/ https://opensource.apple.com/tarballs/msdosfs/msdosfs-209.20.1.tar.gz

lundman commented 8 years ago

Leaving this as a feature that we might get around to eventually. It seems MSDOS handles long names transparently so that is an option.

ballo commented 8 years ago

(at risk of talking out of my butt...) UTF16 isn't inefficient. In fact, UTF32 could be more efficient given how memory buses work. It uses more memory to store 1960s ASCII text, but so do 8kb pages. I doubt the average ZFS user cares about an extra 256 bytes being consumed when each file is already ginormous compared to decades past.

The human-interface advantage with FAT16 is that Asian-language users know the filename size limits.

All Windows FAT formats (exFAT, FAT12/16/32) as well as NTFS support 255 UTF-16 characters. https://en.wikipedia.org/wiki/Comparison_of_file_systems https://en.wikipedia.org/wiki/Long_filename https://en.wikipedia.org/wiki/NTFS#Internals

UTF8 is basically an elegant hack which allows byte-aligned namespaces to easily be adapted. ZFS is the outlier, here.

Edit: UNIX users often have issues using files from Windows: https://forum.transmissionbt.com/viewtopic.php?t=10948 http://docs.freebsd.org/cgi/getmsg.cgi?fetch=124408+0+/usr/local/www/mailindex/archive/2011/freebsd-fs/20110410.freebsd-fs

JMoVS commented 5 years ago

stale for 4 years

CharlesJS commented 3 years ago

Requesting that this issue be reopened; with O3X 2.0.1 on macOS 11.4 with zstd enabled, this now results in a kernel panic rather than a simple file copy error.

lundman commented 3 years ago

Could we get some concrete examples? Copy-paste filenames that fail, the dataset property values of normalization and casesensitivity, and the panic stack.

CharlesJS commented 3 years ago

This (rather contrived, but it does the job) filename reproduces the panic on my machine:

ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSÜ

Normalization and casesensitivity:

$ zfs get normalization ZFS  
NAME  PROPERTY       VALUE          SOURCE
ZFS   normalization  formD          -

$ zfs get casesensitivity ZFS
NAME  PROPERTY         VALUE        SOURCE
ZFS   casesensitivity  insensitive  -

Here's the portion of the panic.ips file containing the macOSPanicString; if you need the whole file, I can send this to you privately, but it is too large to post here.

  "build" : "Bridge OS 5.4 (18P4663)",
  "crashReporterKey" : "c0dec0dec0dec0dec0dec0dec0dec0dec0de0001",
  "date" : "2021-06-06 03:07:20.57 +0000",
  "incident" : "C56AC03C-12EF-459D-8816-ED39BDB00975",
  "kernel" : "Darwin Kernel Version 20.5.0: Fri May  7 22:04:14 PDT 2021; root:xnu-7195.121.3~8\/RELEASE_ARM64_T8010",
  "macOSOtherString" : "\n** In Memory Panic Stackshot Succeeded ** Bytes Traced 422736 (Uncompressed 1140752) **\n",
  "macOSPanicFlags" : "0xc",
  "macOSPanicString" : "panic(cpu 6 caller 0xffffff800e04f642): \"Kernel stack memory corruption detected\"@\/System\/Volumes\/Data\/SWE\/macOS\/BuildRoots\/e90674e518\/Library\/Caches\/com.apple.xbs\/Sources\/xnu\/xnu-7195.121.3\/libkern\/stack_protector.c:37\nBacktrace (CPU 6), Frame : Return Address\n0xffffffc139443350 : 0xffffff800e08e0dd \n0xffffffc1394433a0 : 0xffffff800e1d4f33 \n0xffffffc1394433e0 : 0xffffff800e1c552a \n0xffffffc139443430 : 0xffffff800e032a2f \n0xffffffc139443450 : 0xffffff800e08d8fd \n0xffffffc139443570 : 0xffffff800e08dbf3 \n0xffffffc1394435e0 : 0xffffff800e89d81a \n0xffffffc139443650 : 0xffffff800e04f642 \n0xffffffc139443660 : 0xffffff7faac8cea2 \n0xffffffc139443800 : 0xffffff800e31172e \n0xffffffc139443960 : 0xffffff800e310a14 \n0xffffffc139443b80 : 0xffffff800e32c599 \n0xffffffc139443be0 : 0xffffff800e3326de \n0xffffffc139443f20 : 0xffffff800e332fe4 \n0xffffffc139443f40 : 0xffffff800e73fc9e \n0xffffffc139443fa0 : 0xffffff800e0331f6 \n      Kernel Extensions in backtrace:\n         org.openzfsonosx.zfs(2.0.1)[3DF3A2C7-1696-3C3A-B1E3-9AF89C04EEB8]@0xffffff7faab75000->0xffffff7faaea7fff\n            dependency: com.apple.iokit.IOStorageFamily(2.1)[A0D72FE9-649B-316A-8B5C-934E295FF0B5]@0xffffff8010c6d000->0xffffff8010c7efff\n\nProcess name corresponding to current thread: cp\nBoot args: chunklist-security-epoch=0 -chunklist-no-rev2-dev chunklist-security-epoch=0 -chunklist-no-rev2-dev\n\nMac OS version:\n20F71\n\nKernel version:\nDarwin Kernel Version 20.5.0: Sat May  8 05:10:33 PDT 2021; root:xnu-7195.121.3~9\/RELEASE_X86_64\nKernel UUID: 52A1E876-863E-38E3-AC80-09BBAB13B752\nKernelCache slide: 0x000000000de00000\nKernelCache base:  0xffffff800e000000\nKernel slide:      0x000000000de10000\nKernel text base:  0xffffff800e010000\n__HIB  text base: 0xffffff800df00000\nSystem model name: MacBookPro16,1 (Mac-E1008331FDC96864)\nSystem shutdown begun: NO\nHibernation exit count: 0\n\nSystem uptime in nanoseconds: 982978425855\nLast Sleep:        
   absolute           base_tsc          base_nano\n  Uptime  : 0x000000e4de14774c\n  Sleep   : 0x0000000000000000 0x0000000000000000 0x0000000000000000\n  Wake    : 0x0000000000000000 0x0000001c3ec92d3a 0x0000000000000000\nlast started kext at 915051321589: @filesystems.smbfs\t3.6 (addr 0xffffff7fa993a000, size 487424)\nlast stopped kext at 250425925602: >!AThunderboltEDMSink\t5.0.3 (addr 0xffffff7fa8b4f000, size 32768)\nloaded kexts:\norg.openzfsonosx.zfs\t2.0.1\n@filesystems.smbfs\t3.6\n>AGPM\t122.1\n>!APlatformEnabler\t2.7.0d0\n>X86PlatformShim\t1.0.0\n@filesystems.autofs\t3.0\n@fileutil\t20.036.15\n>!ATopCaseHIDEventDriver\t4050.1\n>!AHIDALSService\t1\n>AudioAUUC\t1.70\n@kext.AMDRadeonServiceManager\t4.0.5\n@kext.AMDRadeonX6000\t4.0.5\n>!AUpstreamUserClient\t3.6.8\n>!AGraphicsDevicePolicy\t6.3.3\n>!A!IKBLGraphics\t16.0.4\n@AGDCPluginDisplayMetrics\t6.3.3\n>pmtelemetry\t1\n|IOUserEthernet\t1.0.1\n>usb.!UUserHCI\t1\n>!A!ICFLGraphicsFramebuffer\t16.0.4\n>AGDCBacklightControl\t6.3.3\n>BridgeAudioCommunication\t140.4\n>!AAVEBridge\t6.1\n>!ABridgeAudio!C\t140.4\n>!AGFXHDA\t100.1.433\n>!AMuxControl2\t6.3.3\n>!AMCCSControl\t1.14\n>!A!IPCHPMC\t2.0.1\n>!AThunderboltIP\t4.0.3\n|IO!BSerialManager\t8.0.5d7\n@Dont_Steal_Mac_OS_X\t7.0.0\n>!AHV\t1\n>!ADiskImages2\t1\n>!A!ISlowAdaptiveClocking\t4.0.0\n|SCSITaskUserClient\t436.121.1\n>BCMWLANFirmware4378.Hashstore\t1\n>BCMWLANFirmware4377.Hashstore\t1\n>BCMWLANFirmware4364.Hashstore\t1\n>BCMWLANFirmware4355.Hashstore\t1\n>!AFileSystemDriver\t3.0.1\n@filesystems.tmpfs\t1\n@filesystems.hfs.kext\t556.100.11\n@BootCache\t40\n@!AFSCompression.!AFSCompressionTypeZlib\t1.0.0\n@!AFSCompression.!AFSCompressionTypeDataless\t1.0.0d1\n>!ABCMWLANBusInterfacePCIeMac\t1\n@filesystems.apfs\t1677.120.9\n>!A!II210Ethernet\t2.3.1\n@private.KextAudit\t1.0\n>!ASmartBatteryManager\t161.0.0\n>!AACPIButtons\t6.1\n>!ASMBIOS\t2.1\n>!AACPIEC\t6.1\n>!AAPIC\t1.7\n@!ASystemPolicy\t2.0.0\n@nke.applicationfirewall\t311\n|IOKitRegistryCompatibility\t1\n|Endpoint
Security\t1\n@kext.triggers\t1.0\n>!AActuatorDriver\t4440.3\n>!AHIDKeyboard\t224\n>!AMultitouchDriver\t4440.3\n>!AInputDeviceSupport\t4400.35\n>!AHS!BDriver\t4050.1\n>IO!BHIDDriver\t8.0.5d7\n@kext.AMDRadeonX6100HWLibs\t1.0\n@kext.AMDRadeonX6000HWServices\t4.0.5\n|IO!BHost!CUARTTransport\t8.0.5d7\n|IO!BHost!CTransport\t8.0.5d7\n|IOAccelerator!F2\t442.9\n>!ABacklightExpert\t1.1.0\n>!A!ILpssUARTv1\t3.0.60\n>!A!ILpssUARTCommon\t3.0.60\n>!AOnboardSerial\t1.0\n>!AGraphicsControl\t6.3.3\n>X86PlatformPlugin\t1.0.0\n>!ASMBus!C\t1.0.18d1\n>IOPlatformPlugin!F\t6.0.0d8\n|IOAVB!F\t940.4\n|IONDRVSupport\t585.1\n>!UAudio\t405.39\n|IOAudio!F\t300.6.1\n@vecLib.kext\t1.2.0\n@kext.AMDRadeonX6000Framebuffer\t4.0.5\n@kext.AMDSupport\t4.0.5\n@!AGPUWrangler\t6.3.3\n@!AGraphicsDeviceControl\t6.3.3\n|IOGraphics!F\t585.1\n|IOSlowAdaptiveClocking!F\t1.0.0\n@plugin.IOgPTPPlugin\t985.2\n>usb.IOUSBHostHIDDevice\t1.2\n>usb.cdc.ecm\t5.0.0\n>usb.cdc.ncm\t5.0.0\n>usb.!UHub\t1.2\n>!AThunderboltPCIUpAdapter\t4.1.1\n>!AThunderboltDPOutAdapter\t8.1.4\n>usb.cdc\t5.0.0\n>usb.networking\t5.0.0\n>usb.!UHostCompositeDevice\t1.2\n>!AThunderboltPCIDownAdapter\t4.1.1\n>!AThunderboltDPInAdapter\t8.1.4\n>!AThunderboltDPAdapter!F\t8.1.4\n>!AHPM\t3.4.4\n>!A!ILpssI2C!C\t3.0.60\n>!A!ILpssI2C\t3.0.60\n>!A!ILpssDmac\t3.0.60\n>!ABSDKextStarter\t3\n|IOSurface\t290.8.1\n@filesystems.hfs.encodings.kext\t1\n>!ABCMWLANCoreMac\t1.0.0\n|IOSerial!F\t11\n|IO80211!FV2\t1200.12.2b1\n|IOSkywalk!F\t1\n>IOImageLoader\t1.0.0\n>corecapture\t1.0.4\n>!AXsanScheme\t3\n>usb.!UVHCIBCE\t1.2\n>usb.!UVHCICommonBCE\t1.0\n>usb.!UVHCI\t1.2\n>usb.!UVHCICommon\t1.0\n>!AEffaceableNOR\t1.0\n|IOBufferCopy!C\t1.1.0\n|IOBufferCopyEngine!F\t1\n|IONVMe!F\t2.1.0\n>!AThunderboltNHI\t7.2.8\n|IOThunderbolt!F\t9.3.2\n|IOEthernetAVB!C\t1.1.0\n>mDNSOffloadUserClient\t1.0.1b8\n>usb.!UXHCIPCI\t1.2\n>usb.!UXHCI\t1.2\n>usb.!UHostPacketFilter\t1.0\n|IOUSB!F\t900.4.2\n>!AEFINVRAM\t2.1\n>!AEFIRuntime\t2.1\n>!ASMCRTC\t1.0\n|IOSMBus!F\t1.1\n|IOHID!F\t2.0.0\n$!AImage4\t3.
0.0\n|IOTimeSync!F\t985.2\n|IONetworking!F\t3.4\n>DiskImages\t493.0.0\n|IO!B!F\t8.0.5d7\n|IOReport!F\t47\n|IO!BPacketLogger\t8.0.5d7\n$quarantine\t4\n$sandbox\t300.0\n@kext.!AMatch\t1.0.0d1\n|CoreAnalytics!F\t1\n>!ASSE\t1.0\n>!AKeyStore\t2\n>!UTDM\t511.120.2\n|IOUSBMass!SDriver\t184.121.1\n|IOSCSIBlockCommandsDevice\t436.121.1\n|IO!S!F\t2.1\n|IOSCSIArchitectureModel!F\t436.121.1\n>!AMobileFileIntegrity\t1.0.5\n@kext.CoreTrust\t1\n>!AFDEKeyStore\t28.30\n>!AEffaceable!S\t1.0\n>!ACredentialManager\t1.0\n>KernelRelayHost\t1\n|IOUSBHost!F\t1.2\n>!UHostMergeProperties\t1.2\n>usb.!UCommon\t1.0\n>!ABusPower!C\t1.0\n>!ASEPManager\t1.0.1\n>IOSlaveProcessor\t1\n>!AACPIPlatform\t6.1\n>!ASMC\t3.1.9\n|IOPCI!F\t2.9\n|IOACPI!F\t1.4\n>watchdog\t1\n@kec.pthread\t1\n@kec.corecrypto\t11.1\n@kec.Libm\t1\n",

lundman commented 3 years ago

OK yes, that very easily caused a panic: it would write past the space given when it decomposed the character at the end.
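The growth under decomposition is easy to see outside the kernel. The Python sketch below builds a name along the lines of the reported one (253 ASCII letters followed by Ü) and compares its UTF-8 size before and after formD normalization. The 255-byte limit and the `check_after_normalization` helper are assumptions for illustration only; this is not the actual fix, just the general idea of checking the length after normalizing rather than before.

```python
import errno
import unicodedata

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
MAX_NAME_BYTES = 255  # assumed limit, not read from the ZFS source

# 253 ASCII letters plus a precomposed U+00DC (2 bytes in UTF-8).
name_nfc = ALPHABET * 9 + ALPHABET[:19] + "\u00dc"
# formD splits U+00DC into "U" plus U+0308 (1 + 2 bytes in UTF-8).
name_nfd = unicodedata.normalize("NFD", name_nfc)

print(len(name_nfc.encode("utf-8")))  # 255: fits as typed
print(len(name_nfd.encode("utf-8")))  # 256: one byte over after decomposition


def check_after_normalization(name: str, form: str = "NFD",
                              limit: int = MAX_NAME_BYTES) -> bytes:
    """Hypothetical guard: normalize first, then enforce the byte limit."""
    encoded = unicodedata.normalize(form, name).encode("utf-8")
    if len(encoded) > limit:
        raise OSError(errno.ENAMETOOLONG, "File name too long")
    return encoded
```

With a guard like this, the full name is rejected with ENAMETOOLONG, while the same name with one letter removed is accepted at exactly 255 bytes.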

Hopefully these two will help: https://git.io/JnsCZ https://git.io/JnsiT

# zfs create -o normalization=formD -o casesensitivity=insensitive BOOM/formD

# touch ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSÜ

touch: ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSÜ: File name too long

# delete one character:
# touch ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPRSÜ 

# ls -la
total 49
drwxr-xr-x   6 root  wheel   6 Jun 16 13:01 .
drwxr-xr-x  21 root  wheel  21 Jun 16 12:48 ..
drwx------   4 root  wheel   4 Jun 16 12:48 .Spotlight-V100
-rw-r--r--   1 root  wheel   0 Jun 16 13:01 ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPRSÜ