osxfuse / osxfuse

FUSE extends macOS by adding support for user space file systems
https://osxfuse.github.io/

Broken diacritic or accented chars in macOS Sonoma (äöшђчћ) #975

Open mixtly87 opened 11 months ago

mixtly87 commented 11 months ago

Let's say that in the - (NSArray*) contentsOfDirectoryAtPath:(NSString*)path error:(NSError**)error callback I return a list of NSString* objects like this:

Regular string
Däm
Ћирилица
c1

However, the problem is that when macFUSE calls back attributesOfItemAtPath, it passes question marks (?) in place of the non-ASCII characters I reported, so I get D?m instead of Däm and ???????? instead of Ћирилица.

This issue started happening on macOS Sonoma. Has anyone else run into it, and is there a fix?

I was already applying decomposedStringWithCompatibilityMapping to all strings coming in through my callbacks, and this has served me well so far, but on macOS Sonoma nothing I tried helped.

- (NSDictionary*) attributesOfItemAtPath:(NSString*)path
                                userData:(id)userData
                                   error:(NSError**)error
{
    path = path.decomposedStringWithCompatibilityMapping;
    NSDictionary* ret = process(path);
    return ret;
}

Is anyone else facing this issue and is there a fix?

bfleischer commented 11 months ago

Please see https://github.com/macfuse/macfuse/wiki/File-Names-(Unicode-Normalization-Forms). macOS expects all file names to be returned in Normalization Form D (NFD). Using any other form can cause unexpected issues.

path.decomposedStringWithCompatibilityMapping returns the string in the Unicode Normalization Form KD. Maybe using the KD form is causing the issues you are running into.
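
To make the difference between the two forms concrete, here is a minimal Foundation-only sketch. The sample strings are assumptions chosen for illustration, not names taken from the file system discussed in this issue. It compares decomposedStringWithCanonicalMapping (NFD) with decomposedStringWithCompatibilityMapping (NFKD):

```objc
#import <Foundation/Foundation.h>

int main(void) {
    @autoreleasepool {
        // Illustrative inputs (assumptions): a precomposed "ä" (U+00E4)
        // and the "fi" ligature (U+FB01).
        NSString *umlaut = @"D\u00E4m";
        NSString *ligature = @"\uFB01le";

        // NFD: canonical decomposition, the form macOS expects for file names.
        NSString *umlautD = umlaut.decomposedStringWithCanonicalMapping;
        NSString *ligatureD = ligature.decomposedStringWithCanonicalMapping;

        // NFKD: compatibility decomposition, which additionally rewrites
        // compatibility characters such as ligatures.
        NSString *umlautKD = umlaut.decomposedStringWithCompatibilityMapping;
        NSString *ligatureKD = ligature.decomposedStringWithCompatibilityMapping;

        // "ä" becomes "a" + U+0308 in both forms, so the results match.
        NSLog(@"umlaut:   length %lu -> NFD %lu, NFKD %lu, identical: %d",
              (unsigned long)umlaut.length,
              (unsigned long)umlautD.length,
              (unsigned long)umlautKD.length,
              [umlautD isEqualToString:umlautKD]);

        // The ligature is untouched by NFD but expanded to "f" + "i" by NFKD,
        // so the two results differ.
        NSLog(@"ligature: length %lu -> NFD %lu, NFKD %lu, identical: %d",
              (unsigned long)ligature.length,
              (unsigned long)ligatureD.length,
              (unsigned long)ligatureKD.length,
              [ligatureD isEqualToString:ligatureKD]);
    }
    return 0;
}
```

For a name like Däm the two forms produce the same code units; they only diverge for names that contain compatibility characters such as ligatures or full-width letters.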

bfleischer commented 11 months ago

I ran a quick test on Sonoma with the LoopbackFS demo file system. I'm not seeing any question marks in the attributesOfItemAtPath:userData:error: callback for paths containing Däm or Ћирилица. I think this is most likely a string encoding issue in your file system code.

mixtly87 commented 11 months ago

Thank you @bfleischer for the quick answer. I tried various things:

  1. using decomposedStringWithCompatibilityMapping
  2. using decomposedStringWithCanonicalMapping
  3. converting the path to Data with UTF-8 encoding and then building the string with UTF-16 encoding
  4. converting the string to an ASCII escape sequence (Däm becomes D\xe4m), but then Finder shows the literal D\xe4m.

But nothing I have tried so far has worked. Here is an example of my FileSystem delegate class.

- (BOOL) createDirectoryAtPath:(NSString*)path
                    attributes:(NSDictionary*)attributes
                         error:(NSError**)error
{
    path = path.decomposedStringWithCanonicalMapping;
    BOOL ret = [ASTFind(self.pathHandlers, ^BOOL(PathHandler* hf) {
        return [hf shouldHandlePath:path];
    }) createDirectoryAtPath:path attributes:attributes error:error];
    return ret;
}

- (BOOL) createFileAtPath:(NSString*)path
               attributes:(NSDictionary*)attributes
                    flags:(int)flags
                 userData:(id*)userData
                    error:(NSError**)error
{
    path = path.decomposedStringWithCanonicalMapping;
    BOOL ret = [ASTFind(self.pathHandlers, ^BOOL(PathHandler* hf) {
        return [hf shouldHandlePath:path];
    }) createFileAtPath:path attributes:attributes flags:flags userData:userData error:error];
    return ret;
}

#pragma mark File Contents

- (BOOL) openFileAtPath:(NSString*)path
                   mode:(int)mode
               userData:(id*)userData
                  error:(NSError**)error
{
    path = path.decomposedStringWithCanonicalMapping;
    PHFoldersAndDocuments* handler = (PHFoldersAndDocuments*)
            self.pathHandlers.lastObject;
    BOOL ret = [handler openFileAtPath:path mode:mode userData:userData error:error];
    return ret;
}

- (void) releaseFileAtPath:(NSString*)path userData:(id)userData
{
    path = path.decomposedStringWithCanonicalMapping;
    PHFoldersAndDocuments* handler = (PHFoldersAndDocuments*)
            self.pathHandlers.lastObject;
    [handler releaseFileAtPath:path userData:userData];
}

- (int) readFileAtPath:(NSString*)path
              userData:(id)userData
                buffer:(char*)buffer
                  size:(size_t)size
                offset:(off_t)offset
                 error:(NSError**)error
{
    path = path.decomposedStringWithCanonicalMapping;
    PHFoldersAndDocuments* handler = (PHFoldersAndDocuments*) self.pathHandlers.lastObject;
    return [handler readFileAtPath:path userData:userData buffer:buffer size:size offset:offset error:error];
}

- (int) writeFileAtPath:(NSString*)path
               userData:(id)userData
                 buffer:(const char*)buffer
                   size:(size_t)size
                 offset:(off_t)offset
                  error:(NSError**)error
{
    path = path.decomposedStringWithCanonicalMapping;
    PHFoldersAndDocuments* handler = (PHFoldersAndDocuments*) self.pathHandlers.lastObject;
    return [handler writeFileAtPath:path userData:userData buffer:buffer size:size offset:offset error:error];
}

#pragma mark Directory Contents

- (NSArray*) contentsOfDirectoryAtPath:(NSString*)path error:(NSError**)error
{
    path = path.decomposedStringWithCanonicalMapping;
    NSArray<NSString*>* ret = [ASTFind(self.pathHandlers, ^BOOL(PathHandler* hf) {
        return [hf shouldHandleContentForPath:path];
    }) contentsOfDirectoryAtPath:path error:error];
    return ret;
}

#pragma mark Getting and Setting Attributes

- (NSDictionary*) attributesOfFileSystemForPath:(NSString*)path error:(NSError**)error
{
    NSNumber* tenGibibytes = @(10737418240);
    return @{
            NSFileSystemSize: tenGibibytes,
            NSFileSystemFreeSize: tenGibibytes
    };
}

- (NSDictionary*) attributesOfItemAtPath:(NSString*)path
                                userData:(id)userData
                                   error:(NSError**)error
{
    path = path.decomposedStringWithCanonicalMapping;
    NSDictionary* ret = [ASTFind(self.pathHandlers, ^BOOL(PathHandler* hf) {
        return [hf shouldHandlePath:path];
    }) attributesOfItemAtPath:path userData:userData error:error];
    return ret;
}

- (BOOL) setAttributes:(NSDictionary*)attributes
          ofItemAtPath:(NSString*)path
              userData:(id)userData error:(NSError**)error
{
    path = path.decomposedStringWithCanonicalMapping;
    BOOL ret = [ASTFind(self.pathHandlers, ^BOOL(PathHandler* hf) {
        return [hf shouldHandlePath:path];
    }) setAttributes:attributes ofItemAtPath:path userData:userData error:error];
    return ret;
}

Interesting that you didn't manage to reproduce the issue. I'll look further into my code or make a simplified example.

I store the data in a Core Data SQLite database and then read it back from there. But even when contentsOfDirectoryAtPath returns a simple array of NSString* objects, the path passed to readFileAtPath contains question marks.

I wonder if I am missing something with mount options. This is how I set them:

NSArray* options = @[ volName, @"modules=iconv", volIcon, @"auto_cache", @"daemon_timeout=30" ];
[self.fuse mountAtPath:mountPath withOptions:options];

mixtly87 commented 11 months ago

I tried the HelloFS example, returning a single Däm.txt file, and the file is shown properly in Finder. I will continue investigating...

mixtly87 commented 11 months ago

@bfleischer setting mount options as:

NSArray* options = @[ volName, @"modules=iconv", volIcon, @"from_code=UTF-8", @"to_code=UTF-8" ];

gives a hint. Now UTF-8 chars are appearing in Finder.

Another hint: adding modules=iconv to the mount options in HelloFS also breaks the encoding.
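
The two observations above suggest two ways to build the option list: leave the iconv module out entirely, or keep it and pin both code sets to UTF-8. A minimal sketch, assuming nothing else in the file system depends on iconv; MountOptions is a hypothetical helper (not part of macFUSE), and the volname/volicon values in main are placeholders:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper, not a macFUSE API: builds the mount option list
// either without the iconv module or with it pinned to UTF-8 on both sides.
static NSArray<NSString *> *MountOptions(NSString *volName,
                                         NSString *volIcon,
                                         BOOL useIconv) {
    NSMutableArray<NSString *> *options =
        [NSMutableArray arrayWithObjects:volName, volIcon, nil];
    if (useIconv) {
        // Variant reported in this thread to show UTF-8 names in Finder.
        [options addObjectsFromArray:@[ @"modules=iconv",
                                        @"from_code=UTF-8",
                                        @"to_code=UTF-8" ]];
    }
    // Without iconv, nothing is added: HelloFS displayed Däm.txt correctly
    // when modules=iconv was absent.
    return [options copy];
}

int main(void) {
    @autoreleasepool {
        // Placeholder option strings (assumptions for illustration only).
        NSString *volName = @"volname=Demo";
        NSString *volIcon = @"volicon=/path/to/icon.icns";

        NSLog(@"without iconv: %@", MountOptions(volName, volIcon, NO));
        NSLog(@"with iconv:    %@", MountOptions(volName, volIcon, YES));
    }
    return 0;
}
```

Either result could then be passed to [self.fuse mountAtPath:mountPath withOptions:options] as in the snippets above.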

bfleischer commented 11 months ago

If I remember correctly, the version of iconv that ships with macOS has been buggy in the past. I'm not sure whether this changed with Sonoma. I know that one or two developers addressed this issue (macOS iconv being buggy) by shipping their own version of iconv.

bfleischer commented 6 months ago

https://github.com/osxfuse/osxfuse/issues/706