openzfsonwindows / openzfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

Cannot use special characters/Umlauts when datasets are mounted #348

Closed derritter88 closed 4 months ago

derritter88 commented 5 months ago

System information

Type               Version/Name
Distribution Name  Win 11
OpenZFS Version    zfs-2.2.99-5-ga6951e43bf

When a pool is mounted with multiple datasets, like

NAME                USED  AVAIL  REFER  MOUNTPOINT
Backup-Data3       3.38T  3.76T   132K  /Backup-Data3
Backup-Data3/Data  2.08T  3.76T  2.08T  /Backup-Data3/Data
Backup-Data3/VM    1.29T  3.76T  1.29T  /Backup-Data3/VM

in a layout like:

F:\
F:\Data
F:\VM

then I cannot create files with special characters and I cannot access any existing files.

When using driveletter=on for the datasets, there are no problems with special characters.

lundman commented 4 months ago

OK, so playing with: E:\lower\Segeln_FB2-Prüfungstörn_2019-05

where E:\lower is a sub-dataset.

What happens is we get a request to open \lower\Segeln_FB2-Prüfungstörn_2019-05 and we detect \lower is a REPARSE point, and correctly return STATUS_REPARSE.

However, we are also supposed to work out how much is remaining of the filename:

            rpb->Reserved = strlen(finalname) * sizeof (WCHAR);
            if (rpb->Reserved != 0)
                rpb->Reserved += sizeof (WCHAR); // the slash

So finalname here is the UTF-8 form of Segeln_FB2-Prüfungstörn_2019-05, which comes to 0x25 bytes, and at 2-byte chars that is 0x4a. Plus a slash, that makes a total of 0x4c. But we only used \lower, some 0x6 bytes (0xa). The full 2-byte char string we were given is 0x52, so we should actually return 0x48, not 0x4c. If I hardcode 0x4c -> 0x48, I can successfully use the name Segeln_FB2-Prüfungstörn_2019-05.

The reason is that strlen() returns bytes, not characters. Windows always uses 2-byte chars, so doing the math without UTF-8 would be easier, but it would not fit with ZFS. Doing the math as "given minus used" would work, but you can also have umlauts in the directory name (\löwer), so it would just shift the issue. Having a u8_strlen() to count chars might work.
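To illustrate the idea of counting characters instead of bytes, here is a minimal sketch in plain C. It is not the code from the actual fix: the helper names utf8_utf16_units() and remaining_wide_bytes() are hypothetical, uint16_t stands in for WCHAR so it builds outside Windows, and the input is assumed to be valid, NUL-terminated UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not an OpenZFS function): number of UTF-16 code
 * units needed to represent a valid, NUL-terminated UTF-8 string.
 */
static size_t
utf8_utf16_units(const char *utf8)
{
	size_t units = 0;

	assert(utf8 != NULL);
	for (const unsigned char *p = (const unsigned char *)utf8;
	    *p != 0; p++) {
		if ((*p & 0xC0) == 0x80)
			continue;	/* continuation byte, already counted */
		/* A 4-byte sequence becomes a surrogate pair in UTF-16. */
		units += (*p >= 0xF0) ? 2 : 1;
	}
	return (units);
}

/*
 * Bytes of the wide-character path left over after the reparse point,
 * including the separating slash when anything remains.
 */
static size_t
remaining_wide_bytes(const char *finalname)
{
	size_t bytes = utf8_utf16_units(finalname) * sizeof (uint16_t);

	if (bytes != 0)
		bytes += sizeof (uint16_t);	/* the slash */
	return (bytes);
}
```

For a precomposed "Prüfung", strlen() reports 8 bytes while the helper reports 7 code units, so the strlen()-based math overshoots by one 2-byte char per umlaut.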

Pondering.

lundman commented 4 months ago

f886377

derritter88 commented 4 months ago

Thanks for that - will you release a new version for me to test it?

lundman commented 4 months ago

New build is up

derritter88 commented 4 months ago

This works now - thanks!

derritter88 commented 4 months ago

Found a different thing (let me know if you need a new issue ticket): when copying stuff with driveletter=off, FreeFileSync has issues creating new folders. It reports ERROR_PATH_NOT_FOUND - [FindFirstFileEx], with the message (translated) "cannot read file attributes of folder F:\Data\Mathias\Apple Backup".

I then unmounted the sub-dataset and remounted it with driveletter=on, and the backup/sync worked without any issues.

lundman commented 4 months ago

Yeah, make a new ticket, with details on how I can reproduce it.