maharmstone / ntfs2btrfs

GNU General Public License v2.0

Non-UTF-8 filenames give vague error #12

Open BlueAmulet opened 3 years ago

BlueAmulet commented 3 years ago

Downloaded version 20210105

When I run ntfs2btrfs.exe D:\, all I get back, instantly, is the message wstring_convert::to_bytes, and it does nothing. The drive I want to convert is in fact D:\ and it is formatted as NTFS. It has no compressed files, although I see the newest version added compressed file support.

maharmstone commented 3 years ago

It looks like it's complaining because the parameter is invalid UTF-8, for whatever reason. Did you copy and paste the command line from somewhere, and it's got invisible crud on the end? Or possibly using a non-English keyboard, and typing a non-Latin character which is identical to "D"? (I'm not sure there are any.)

What version of Windows are you on, and which language?

TheMadHau5 commented 3 years ago

Besides Eth, or D with a pre-composed accent, there are a few mathematical symbols but I'd assume they aren't easy to type. (Full list here: http://www.unicode.org/Public/security/latest/confusables.txt; I searched for LATIN CAPITAL LETTER D)

BlueAmulet commented 3 years ago

I'm using Windows 10 2004 in English, with a US-QWERTY keyboard. After testing a bit, I can see that it actually says "Processing inode" and gets to around ... 60? before it just says "wstring_convert::to_bytes" and gives up.

BlueAmulet commented 3 years ago

The problem was caused by some non-UTF-16 filenames in a Windows.old\$RECYCLE.BIN. After removing them, the conversion went smoothly.

maharmstone commented 3 years ago

Thanks. Are you able to tell me what they were, so I can reproduce it? NTFS stores all filenames as UTF-16, and I wasn't aware there was any UTF-16 string that wstring_convert would refuse to convert to UTF-8...

BlueAmulet commented 3 years ago

Unfortunately I don't have them anymore. To be honest, I think that drive might have been corrupted at some point. $Recycle.Bin is not supposed to be all uppercase, and its contents are supposed to be 8.3-sized filenames starting with dollar signs, not long filenames with garbage characters at the beginning.

It was pretty easy to get ntfs2btrfs to fail again on a test image though. NTFS names are usually UTF-16, but they're allowed to contain any 16-bit value aside from 0, and CreateFileW will happily create a file with bad surrogates. That's not something that is normally done, though, so feel free to close the issue.
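
For anyone wanting to reproduce the failure mode without an NTFS image, here is a minimal sketch of how std::wstring_convert reacts to an unpaired surrogate. It is not taken from ntfs2btrfs; the helper name converts_to_utf8 is hypothetical, and char16_t is used in place of wchar_t so the behaviour is the same on platforms where wchar_t is 32 bits. Note that wstring_convert and the <codecvt> facets are deprecated since C++17, so a compiler may warn about them.

```cpp
#include <cassert>
#include <codecvt>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from ntfs2btrfs): returns true if the UTF-16
// string converts to UTF-8, false if wstring_convert rejects it.
bool converts_to_utf8(const std::u16string& name) {
    std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> conv;
    try {
        conv.to_bytes(name);  // throws std::range_error when conversion fails
        return true;
    } catch (const std::range_error&) {
        return false;
    }
}
```

Calling converts_to_utf8 on an ordinary name, or on one containing a correctly paired surrogate, should return true, while a name with a low surrogate (0xDC00 to 0xDFFF) that has no preceding high surrogate should return false. If the exception is left uncaught, as appears to have happened here, the program stops and all the user sees is the exception's message.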

maharmstone commented 3 years ago

Bad surrogates were the only thing I could think of... I'll change it so it gives a warning and skips the inode, rather than stopping the whole thing entirely.
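
The "warn and skip" approach described above could look roughly like the following. This is only a sketch under assumptions: utf16_to_utf8 and convert_names are hypothetical names, the hand-written converter stands in for whatever conversion ntfs2btrfs actually uses, and a real tool would report the inode number rather than a list index.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical strict UTF-16 -> UTF-8 converter (not ntfs2btrfs's code).
// Returns false on an unpaired surrogate; `out` is then incomplete.
bool utf16_to_utf8(const std::u16string& in, std::string& out) {
    out.clear();
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {         // high surrogate
            if (i + 1 >= in.size()) return false;   // nothing follows it
            char32_t lo = in[i + 1];
            if (lo < 0xDC00 || lo > 0xDFFF) return false;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
            ++i;                                    // consumed the pair
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {  // stray low surrogate
            return false;
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return true;
}

// Converts every name it can; warns on stderr and skips the rest
// instead of aborting the whole run.
std::vector<std::string> convert_names(const std::vector<std::u16string>& names) {
    std::vector<std::string> converted;
    for (std::size_t i = 0; i < names.size(); ++i) {
        std::string utf8;
        if (utf16_to_utf8(names[i], utf8))
            converted.push_back(utf8);
        else
            std::cerr << "warning: skipping entry " << i
                      << ": name is not valid UTF-16\n";
    }
    return converted;
}
```

Returning a bool rather than throwing keeps the per-name failure local, so one bad filename costs a single warning instead of the whole conversion. Whether skipping or substituting U+FFFD is the better policy is a separate design question that this sketch does not settle.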