Open pureair opened 5 months ago
Encountered this error myself and decided to take a whack at it. Thanks for the test data, it makes things a million times easier!
I think with byte strings you need to use the unicode_escape codec when decoding strings containing escaped multi-byte Unicode characters. When I tried this, it returned the right filenames, but ADB couldn't find the files. Never mind: it looks like this fixes the non-breaking space that's causing the sync issues, but it breaks on actual multi-byte characters.
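To illustrate why unicode_escape helps with the stray 0xa0 byte but mangles real multi-byte names, here is a small sketch. The filenames are made up for the example and are not taken from the test archive. The codec decodes bytes above 0x7f as Latin-1, so a lone 0xa0 becomes U+00A0, while each byte of a UTF-8 sequence turns into a separate character.

```python
# Hypothetical sample names, not from the reporter's archive.
raw_nbsp = b"My\xa0File.txt"                 # lone 0xA0 byte: invalid as UTF-8
raw_multibyte = "文件.txt".encode("utf-8")    # genuine multi-byte UTF-8 name

# unicode_escape treats every byte >= 0x80 as a Latin-1 character.
decoded_nbsp = raw_nbsp.decode("unicode_escape")
decoded_multibyte = raw_multibyte.decode("unicode_escape")

print(repr(decoded_nbsp))        # contains U+00A0 where the 0xA0 byte was
print(repr(decoded_multibyte))   # one character per byte, not the original name
```

This would explain why the decoded names look right for the non-breaking-space case but no longer match what is on the device once real multi-byte characters are involved.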
I'm guessing it only has to be decoded this way on one side (wherever it's actually broken), but I'm unfamiliar with the codebase. I'll tinker with the code a bit and see if I can get things working.
I have also tried to tackle this problem, and there are indeed nbsp characters present in the stdout. To fix the nbsp problem, I get the output as raw bytes, iterate over each byte, and fix any nbsp char codes that are not part of a multi-byte UTF-8 character. This is partly Google's issue as well, for not parsing wildcard (*) characters to match these problematic filenames during pull, though I could be wrong.
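The byte-level approach described above could look roughly like the sketch below. This is not the commenter's actual code, and the function name `fix_stray_nbsp` is hypothetical. It walks the raw bytes, copies every well-formed UTF-8 sequence untouched, and replaces a 0xa0 byte only when it is not part of such a sequence.

```python
def fix_stray_nbsp(raw: bytes, replacement: bytes = b" ") -> bytes:
    """Replace 0xA0 bytes that are not part of a valid UTF-8 sequence."""
    out = bytearray()
    i = 0
    while i < len(raw):
        lead = raw[i]
        # Expected sequence length, judged from the lead byte.
        if lead < 0x80:
            length = 1
        elif 0xC2 <= lead <= 0xDF:
            length = 2
        elif 0xE0 <= lead <= 0xEF:
            length = 3
        elif 0xF0 <= lead <= 0xF4:
            length = 4
        else:
            length = 0  # continuation byte or invalid lead byte

        seq = raw[i:i + length]
        valid = False
        if length and len(seq) == length:
            try:
                seq.decode("utf-8")  # strict check, rejects overlong forms
                valid = True
            except UnicodeDecodeError:
                valid = False

        if valid:
            out += seq              # keep well-formed characters as they are
            i += length
        else:
            # Stray byte: swap 0xA0 for the replacement, keep anything else.
            out += replacement if lead == 0xA0 else bytes([lead])
            i += 1
    return bytes(out)


print(fix_stray_nbsp(b"a\xa0b"))
print(fix_stray_nbsp("\u00e0".encode("utf-8")))
```

Note that a properly encoded non-breaking space (0xc2 0xa0) and characters such as "à" (0xc3 0xa0) pass through unchanged. Only the lone byte is rewritten.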
OK, it seems replacing the non-breaking space (0xa0) with an ordinary character solves the problem. I wrote a PowerShell script to batch-rename files (it substitutes an underscore):
# Get all files in the current directory and its subdirectories
$files = Get-ChildItem -File -Recurse
foreach ($file in $files) {
    # Build the new file name by replacing each non-breaking space (U+00A0) with "_"
    $newFileName = $file.Name -replace [char]0x00A0, '_'
    # Rename the file only if the name actually changed
    if ($file.Name -ne $newFileName) {
        Write-Host "Renaming $($file.FullName)"
        # -NewName expects a name, not a path; -LiteralPath avoids wildcard expansion of [ ] in names
        Rename-Item -LiteralPath $file.FullName -NewName $newFileName
    }
}
pause
I'm not sure how to run a bash terminal on Android with file access to /sdcard/, so I can't write a bash equivalent and test it on Android.
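For anyone not on Windows, a rough Python equivalent of the rename script might look like the sketch below. It is untested against an actual device, and the function name `rename_nbsp_files` is made up for this example. It assumes the names contain a real U+00A0 character as seen by the filesystem, and it skips a file rather than overwrite an existing one.

```python
from pathlib import Path

NBSP = "\u00a0"  # non-breaking space


def rename_nbsp_files(root: Path, replacement: str = "_") -> list[tuple[Path, Path]]:
    """Rename files under root whose names contain U+00A0; return (old, new) pairs."""
    renamed = []
    # Collect the file list first so renaming does not disturb the directory walk.
    files = sorted(p for p in root.rglob("*") if p.is_file())
    for path in files:
        if NBSP not in path.name:
            continue
        target = path.with_name(path.name.replace(NBSP, replacement))
        if target.exists():
            continue  # never overwrite an existing file
        path.rename(target)
        renamed.append((path, target))
    return renamed


if __name__ == "__main__":
    for old, new in rename_nbsp_files(Path(".")):
        print(f"Renamed {old} -> {new}")
```

Like the PowerShell version, this only touches file names, not directory names.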
Environment:
Python version: 3.12.2
adb version: Android Debug Bridge version 1.0.41 / Version 35.0.1-11580240
Operating System: Windows 10
Error Description:
The adbsync command fails with a UnicodeDecodeError when encountering specific folder/file names on the Android device. The error message indicates:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 62: invalid start byte
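The same class of error can be reproduced in isolation with a lone 0xa0 byte, as in the sketch below. The sample bytes are invented for illustration. It also shows Python's standard `surrogateescape` error handler, which decodes undecodable bytes reversibly. Whether adbsync could use that handler is an assumption here, not something checked against its code.

```python
raw = b"foo\xa0bar"  # made-up name containing a lone 0xA0 byte

# Strict UTF-8 decoding fails: 0xA0 cannot start a UTF-8 sequence.
try:
    raw.decode("utf-8")
    error = None
except UnicodeDecodeError as exc:
    error = exc
    print(exc)

# surrogateescape maps the bad byte to a lone surrogate (U+DCA0) instead of raising...
lossless = raw.decode("utf-8", errors="surrogateescape")
# ...and encoding with the same handler restores the original bytes exactly.
restored = lossless.encode("utf-8", errors="surrogateescape")
```

The round trip matters because the decoded name has to be turned back into the exact bytes the device uses, otherwise the file cannot be found again.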
The full error log is at the end of this report.
Investigation:
I have narrowed down the problematic folders/files and included an archive (tester.tar.gz) containing their names and folder structure for further analysis. The contents of the files have been emptied; only their filenames are kept.
Each of the three folders would be:
They will not be pulled with "--adb-encoding gb2312", or with gbk, gb18030, utf-8, utf-16, etc., because of
'utf-8' codec can't decode byte 0x0b in position xx
(0xe9, 0xb8, etc.).
I actually looked at the hex of the folder and file names in the folder with only English characters, and there is no 0xa0 in either the folder name or the file names.
Full error log: