Open doubleaxe opened 1 week ago
This is probably part of the NFC/NFD saga on macOS.
rclone and most backends do not care whether a character is normalised as NFC or NFD, but mailru possibly does. As a workaround, try normalising all names to either NFC or NFD first and see which one works.
convmv -r -f utf8 -t utf8 --nfc --notest /path/to/files
convmv -r -f utf8 -t utf8 --nfd --notest /path/to/files
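Before renaming anything, it may help to see which names are actually affected. The following is only a rough sketch, not part of rclone or convmv: `findNonNfcNames` and `sampleNames` are hypothetical names made up for illustration, and the check relies on the standard `String.prototype.normalize` method.

```javascript
// Hypothetical helper: returns the names that are NOT already in NFC form,
// i.e. the ones that an NFC conversion would rename.
function findNonNfcNames(names) {
  return names.filter((name) => name !== name.normalize("NFC"));
}

// Sample data, written with escapes so the two forms stay distinguishable.
const sampleNames = [
  "photo.jpg",          // plain ASCII, unaffected by normalisation
  "\u0439.txt",         // precomposed U+0439 (NFC)
  "\u0438\u0306.txt",   // U+0438 + combining breve U+0306 (NFD)
];

// Only the decomposed name is reported.
console.log(findNonNfcNames(sampleNames).length);
```

In Node.js the list of names could come from something like `fs.readdirSync(dir)`; that part is left out here so the sketch does not touch the disk.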
Also try these rclone flags:
--no-unicode-normalization Don't normalize unicode characters in filenames
--local-unicode-normalization Apply unicode NFC normalization to paths and filenames
--no-unicode-normalization and --local-unicode-normalization don't help.
convmv -r -f utf8 -t utf8 --nfc --preserve-mtimes --notest /data/media/Photo
This works: once the file names are normalised to NFC, everything works fine. I guess this is because the files were initially on an HFS+ filesystem, which was later converted to APFS during a macOS upgrade.
A workaround has been found, so this bug report can be closed now. Thank you for the suggestions.
This is something rclone does not take into account today: some remotes only support a specific normalisation. It is very rare, but it happens. Here it is mailru, and I remember exactly the same issue from the forum with one S3 provider.
I wonder if we should at least have a flag forcing a specific normalisation on the remote? Or is it too small an issue to bother with, and more trouble than it is worth?
@ncw @nielash what do you think?
If we wanted to do this we would need to make a feature flag for the backend, then an integration test to make sure it was set correctly. This would then tell us which backends it would need to be set on and we could then use the feature flag in the core of rclone to force normalisation on.
My first concern would be: what is the feature flag testing? That UTF-8 normalisation is required by the backend? Or maybe that the backend forces UTF-8 normalisation, because I know some backends do that too.
I suppose some backends might do something more complicated like only require normalisation for Cyrillic.
It is probably quite a big project for perhaps little gain since we don't get many issues about it. @albertony I know you've worked on the normalisation code in the past - any thoughts?
What is the problem you are having with rclone?
As the title states, I get the error "invalid characters in object name" for paths which contain Unicode U+0439 (the Cyrillic letter й). I believe this issue is macOS-specific, but I cannot test it on Windows yet. I came across this post, which describes something similar: https://www.alfredforum.com/topic/2015-encoding-issue/. This issue probably also affects other cloud providers.
I am a JS developer, not a Go developer, but in my opinion this is a weird macOS UTF-8 normalization issue. For example, encodeURIComponent('й') will produce %D0%B9, but encodeURIComponent('ый') will produce %D1%8B%D0%B8%CC%86 - the same character is encoded differently, as %D0%B8%CC%86.
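To make the difference reproducible regardless of how the characters were copied, the two forms can be written with explicit escapes. This is just a small illustration of the behaviour described above, assuming a JavaScript runtime with `String.prototype.normalize`; the variable names are made up for the example.

```javascript
// Precomposed (NFC) form of й: a single code point, U+0439.
const precomposed = "\u0439";
// Decomposed (NFD) form of й: U+0438 (и) followed by combining breve U+0306.
const decomposed = "\u0438\u0306";

// The two strings look identical on screen but encode to different bytes.
console.log(encodeURIComponent(precomposed)); // %D0%B9
console.log(encodeURIComponent(decomposed));  // %D0%B8%CC%86

// They are different strings until they are brought to the same form.
console.log(precomposed === decomposed);                  // false
console.log(precomposed === decomposed.normalize("NFC")); // true
console.log(precomposed.normalize("NFD") === decomposed); // true
```

So the й in the second example above is most likely the decomposed form, which is how HFS+ stored file names.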
What is your rclone version (output from rclone version)
Which OS you are using and how many bits (e.g. Windows 7, 64 bit)
MacOS Ventura 13.6.7 Intel 64 bit
Which cloud storage system are you using? (e.g. Google Drive)
Cloud mail.ru
The command you were trying to run (e.g. rclone copy /tmp remote:tmp)
A log from the command with the -vv flag (e.g. output from rclone -vv copy /tmp remote:tmp)