Closed andylemin closed 3 weeks ago
Tried with both the default and MySQL CNID backends. No change in the fault.
Filtered log attached, produced with cat /var/log/afp.log | grep test2 > /var/log/afp.log.test2.log
The macOS Finder client tried to open test2だ゙.txt just once. The log seems to repeat the same messages over and over, with no obvious error pointing to a cause.
Thanks
More examples of Unicode characters which do not work over Netatalk AFP including multiple languages; ï, ѓ, じ, ど, パ, ブ, プ
@andylemin Were you able to get around to building netatalk4 with v14 of UnicodeData.txt as we discussed, to see if that was the breaking point?
It should just be a matter of downloading https://www.unicode.org/Public/14.0.0/ucd/UnicodeData.txt and then pointing -Dwith-unicode-data-path to the directory where you put it...
That's the next step for this weekend. Wanted to validate scope of issue first
I suspect this is a problem with decomposed Unicode characters somewhere.
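For anyone wanting to check the decomposition theory, here is a minimal Python sketch (not part of Netatalk) that prints the code points of a name in NFC and NFD. It assumes the test name contains HIRAGANA LETTER DA (U+3060) followed by a combining dakuten (U+3099), and that the other reported characters are taken in their precomposed form. Neither assumption is confirmed by the report.

```python
import unicodedata

# Assumption: the problematic name is HIRAGANA DA (U+3060) followed by a
# combining dakuten (U+3099). The exact code points were not confirmed.
name = "test2\u3060\u3099.txt"


def codepoints(s):
    """Render a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in s)


nfc = unicodedata.normalize("NFC", name)  # composed form
nfd = unicodedata.normalize("NFD", name)  # decomposed form

print("NFC:", codepoints(nfc))
print("NFD:", codepoints(nfd))

# Assumption: precomposed forms of the other reported characters
# (ï, ѓ, じ, ど, パ, ブ, プ).
reported = "\u00EF\u0453\u3058\u3069\u30D1\u30D6\u30D7"
for ch in reported:
    print(ch, "->", codepoints(unicodedata.normalize("NFD", ch)))
```

Under those assumptions the NFD form of the name is one code point longer than the NFC form, because the extra dakuten has no precomposed partner, and each of the other reported characters splits into a base character plus one combining mark.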
I have not been able to replicate the problem with my testing server running 4.0.2. Client is macOS Catalina. Since I don't have any sort of IME setup, I was cutting and pasting the "bad" characters from this page into file names. Tried creating on server first then reading from client. Copied files to/from the server, etc. All seem to work with no issues reading or writing the file.
One side note: Firefox seems to have issues rendering だ゙. A diacritic appears to float to the far right of the character, usually over the "x" in .txt!
@andylemin Can you please confirm the contents of, for example, the generated libatalk/unicode/precompose.c source file when you get this issue? I found one potential fail state where the path to UnicodeData.txt doesn't resolve properly and the Perl script generates empty tables like this:
static const struct {
unsigned int replacement;
unsigned int base;
unsigned int comb;
} precompositions[] = {
};
static const struct {
unsigned int replacement;
unsigned int base;
unsigned int comb;
} decompositions[] = {
};
static const struct {
unsigned int replacement_sp;
unsigned int base_sp;
unsigned int comb_sp;
} precompositions_sp[] = {
};
static const struct {
unsigned int replacement_sp;
unsigned int base_sp;
unsigned int comb_sp;
} decompositions_sp[] = {
};
I doubt this is the exact problem you're having, because without the precompose tables, interaction with any Unicode character causes errors. But maybe we can get a hint to what's going on by looking at the generated sources.
Apart from this scenario, I haven't been able to reproduce exactly the bug you're seeing...
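To illustrate why empty tables would break every composed character rather than a select few, here is a toy Python model of a (base, combining) pair lookup. The table contents and the names `PRECOMPOSITIONS` and `precompose` are hypothetical; Netatalk's real tables live in generated C code and its lookup logic is more involved.

```python
# Hypothetical miniature of a (base, combining) -> precomposed lookup table.
PRECOMPOSITIONS = {
    (0x305F, 0x3099): 0x3060,  # HIRAGANA TA + dakuten -> HIRAGANA DA
    (0x0069, 0x0308): 0x00EF,  # i + combining diaeresis -> i with diaeresis
}


def precompose(text, table):
    """Greedily merge each character into the previous one when the table
    has an entry for the pair; otherwise keep the character as-is."""
    out = []
    for ch in text:
        cp = ord(ch)
        if out:
            composed = table.get((out[-1], cp))
            if composed is not None:
                out[-1] = composed
                continue
        out.append(cp)
    return "".join(chr(cp) for cp in out)


decomposed = "\u305F\u3099"

# With a populated table the pair collapses into one code point.
print(precompose(decomposed, PRECOMPOSITIONS) == "\u3060")

# With an empty table nothing is ever composed, for any character.
print(precompose(decomposed, {}) == decomposed)
```

In this model an empty table leaves every decomposed name untouched, which matches the expectation that such a build would fail on any composed character and not only on a handful.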
@andylemin I've made some improvements in https://github.com/Netatalk/netatalk/pull/1692 that should at least prevent the issue where the code generation fails silently. Again, it is unlikely that this solves your issue, but please try the latest main code just in case the added error handling catches another corner case.
Hi. Ok so some interesting findings to share;
4.0.2 - Unicode 16 - NO -Dwith-unicode-data-path
- Characters Fail 🚫 - NO libatalk/unicode/precompose.c
4.0.2 - Unicode 14 - NO -Dwith-unicode-data-path
- Characters Fail 🚫 - NO libatalk/unicode/precompose.c
NB; In both of the above tests, the configure step says "Using Unicode Character Database: UnicodeData.txt", indicating it is finding the downloaded UnicodeData.txt (in source base path) even though -Dwith-unicode-data-path is not being set.
4.0.2 - Unicode 16 - WITH -Dwith-unicode-data-path
- Characters Succeed 👍 - But still NO libatalk/unicode/precompose.c
4.0.2 - Unicode 14 - WITH -Dwith-unicode-data-path
- Characters Succeed 👍 - But still NO libatalk/unicode/precompose.c
git clone (last commit log; "Shore up Unicode char table script error handling and detection").
main - Unicode 16 - NO -Dwith-unicode-data-path
- Characters Succeed 👍 - NO libatalk/unicode/precompose.c after meson compile -C build
main - Unicode 16 - WITH -Dwith-unicode-data-path
- Characters Succeed 👍 - NO libatalk/unicode/precompose.c after meson compile -C build
Observations;
Unicode version is not related.
Setting -Dwith-unicode-data-path seems to fix it in last Release, even though configure output says it finds UnicodeData.txt.
In the current git HEAD, the issue seems fixed with or without -Dwith-unicode-data-path.
I have never seen libatalk/unicode/precompose.c generated across all tests.
So seems the issue was (4.0.2) related to -Dwith-unicode-data-path being required in spite of positive configure message.
You have fixed it in HEAD such that -Dwith-unicode-data-path is no longer required.
I wonder why libatalk/unicode/precompose.c has never once been successfully generated.
PS; Just to confirm, when rebuilding for each test I am using meson setup --reconfigure build each time before building. I am not using a clean source tree as --reconfigure seems to be enough
There should only be a file called precompose.h generated by make-precompose.h.pl. Additionally, utf16_case.c and utf16_casetable.h are generated by make-casetable.pl.
So seems the issue was (4.0.2) related to -Dwith-unicode-data-path being required in spite of positive configure message. You have fixed it in HEAD such that -Dwith-unicode-data-path is no longer required. I wonder why libatalk/unicode/precompose.c has never once been successfully generated.
Thanks for the thorough testing. Yes we had a bug in 4.0.2 where Meson itself could find UnicodeData.txt but the Perl script couldn't because of relative path shenanigans... What I do now is to always prepend the absolute path to the source dir whenever you give it a relative dir. I also added the Netatalk source dir to the list of dirs to look for UnicodeData.txt.
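The path handling described above can be sketched roughly as follows, in Python rather than the Meson and Perl used by the actual build. `resolve_unicode_data` and its parameters are hypothetical names chosen for illustration, not Netatalk's.

```python
import os
import tempfile


def resolve_unicode_data(path_option, source_root, filename="UnicodeData.txt"):
    """Return an absolute path to the data file, or None if it is not found.

    A relative path_option is anchored at source_root, and source_root
    itself is always searched as a fallback.
    """
    candidates = []
    if path_option:
        if os.path.isabs(path_option):
            candidates.append(path_option)
        else:
            candidates.append(os.path.join(source_root, path_option))
    candidates.append(source_root)

    for directory in candidates:
        candidate = os.path.abspath(os.path.join(directory, filename))
        if os.path.isfile(candidate):
            return candidate
    return None


# Small demonstration using a throwaway directory as the "source tree".
with tempfile.TemporaryDirectory() as root:
    open(os.path.join(root, "UnicodeData.txt"), "w").close()
    found = resolve_unicode_data(".", root)
    missing = resolve_unicode_data("no-such-dir", os.path.join(root, "empty"))

print("found:", found)
print("missing:", missing)
```

Returning None (or raising an error) when nothing is found gives the caller a chance to stop the build, instead of generating empty tables.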
PS; Just to confirm, when rebuilding for each test I am using meson setup --reconfigure build each time before building. I am not using a clean source tree as --reconfigure seems to be enough
I think this should be safe. I personally always do git clean -dfx && git reset --hard between tries to have a completely clean slate.
Describe the bug
Since Netatalk 4.x some JP chars fail to be handled correctly. Documents with characters such as だ゙ in the filename result in users being unable to open these files.
Initially discovered with Netatalk 4.0.2 on FreeBSD 14.1 with Ventura clients. After the upgrade to 4.0.2, existing files and folders on the AFP share produce permissions errors. After removing the problematic chars (on the server console via ssh), the files become accessible again to clients via the share. Clients cannot rename problematic filenames over the AFP share.
When trying to reproduce with Netatalk 4.0.2 on Ubuntu and Ventura 13.7 client, trying to open a file with one of the problematic JP chars results in "The document “test だ゙.txt” could not be opened."
To Reproduce
Build Netatalk 4.0.2 on Linux or FreeBSD using the latest instructions at https://netatalk.io/4.0/htmldocs/compile
Use the default afp.conf and add a test share.
1) When trying to save files with characters like だ゙ in the name to the AFP share from macOS clients, get the error:
The document “Untitled” could not be saved as “testだ゙.txt”. The file doesn’t exist.
2) Create a file with this problematic char in the filename in the shared folder on the server using SSH. The file is created fine, but when clients try to open the file it errors.
user@ubuntu:/test$ touch test2だ゙.txt
File created fine on server. When trying to open the file on the client, get the error:
The document “test2だ゙.txt” could not be opened. The file doesn’t exist.
The FreeBSD install requires downloading UnicodeData.txt from https://www.unicode.org/Public/UNIDATA/UnicodeData.txt, while the Ubuntu install requires installing the unicode-data package via apt. Both have the same problem. After downgrading to Netatalk 3.x, all inaccessible files and folders become accessible again.
NB; This does not affect all Unicode chars; only specific chars are impacted. It is unknown how many have been impacted since 4.x.
The example for testing is だ゙. Can provide more problematic chars if required.
Expected behavior
In Netatalk 2.x and 3.x all JP chars can be used in filenames and folders without issues, and all files and folders can be accessed and opened without issue. Netatalk 4.x should also support all Unicode chars.
Environment
Logs
Syslog from the malfunctioning process attached: afp.log at maxdebug log level. The log shows the server start, then one test client connecting to the 'test' share and trying to open the test2だ゙.txt file, then the client failing with "The document “test2だ゙.txt” could not be opened". Server process stopped.
Additional context
Netatalk does not crash