Open phu54321 opened 2 years ago
Can you please attach the problematic report file with duplicates?
Here is some additional info for this issue (still present in 0.29.3)
Report file:
# Report by fclones 0.29.3
# Timestamp: 2023-02-09 14:04:44.822 +1100
# Command: 'C:\Users\c22\.cargo\bin\fclones.exe' group .
# Base dir: C:\\Users\\c22\\Desktop\\DupeTest
# Total: 12 B (12 B) in 3 files in 1 groups
# Redundant: 8 B (8 B) in 2 files
# Missing: 0 B (0 B) in 0 files
718ac45146ab06cd8f7d7c20c1ea6d66, 4 B (4 B) * 3:
C:\\Users\\c22\\Desktop\\DupeTest\\🍔🍔🍔.txt
C:\\Users\\c22\\Desktop\\DupeTest\\😊😊😊.txt
C:\\Users\\c22\\Desktop\\DupeTest\\🤗🤗🤗.txt
Attempt to dedupe:
PS C:\Users\c22\Desktop\DupeTest> fclones.exe group . | fclones link
[2023-02-09 06:05:03.785] fclones.exe: info: Started grouping
[2023-02-09 06:05:03.788] fclones.exe: info: Scanned 4 file entries
[2023-02-09 06:05:03.789] fclones.exe: info: Found 3 (12 B) files matching selection criteria
[2023-02-09 06:05:03.789] fclones.exe: info: Found 2 (8 B) candidates after grouping by size
[2023-02-09 06:05:03.789] fclones.exe: info: Found 2 (8 B) candidates after grouping by paths
[2023-02-09 06:05:03.790] fclones.exe: info: Found 2 (8 B) candidates after grouping by prefix
[2023-02-09 06:05:03.791] fclones.exe: info: Found 2 (8 B) candidates after grouping by suffix
[2023-02-09 06:05:03.791] fclones.exe: info: Found 2 (8 B) redundant files
[2023-02-09 06:05:03.810] fclones.exe: info: Started deduplicating
[2023-02-09 06:05:03.813] fclones.exe: warn: Failed to read metadata of 'C:\Users\c22\Desktop\DupeTest\????????????.txt': Failed to read metadata of 'C:\Users\c22\Desktop\DupeTest\????????????.txt': The filename, directory name, or volume label syntax is incorrect. (os error 123)
[2023-02-09 06:05:03.813] fclones.exe: warn: Failed to read metadata of 'C:\Users\c22\Desktop\DupeTest\????????????.txt': Failed to read metadata of 'C:\Users\c22\Desktop\DupeTest\????????????.txt': The filename, directory name, or volume label syntax is incorrect. (os error 123)
[2023-02-09 06:05:03.813] fclones.exe: warn: Failed to read metadata of 'C:\Users\c22\Desktop\DupeTest\????????????.txt': Failed to read metadata of 'C:\Users\c22\Desktop\DupeTest\????????????.txt': The filename, directory name, or volume label syntax is incorrect. (os error 123)
[2023-02-09 06:05:03.813] fclones.exe: warn: Could not determine files to drop in group with hash 718ac45146ab06cd8f7d7c20c1ea6d66 and len 4: Metadata of some files could not be obtained
[2023-02-09 06:05:03.813] fclones.exe: info: Processed 0 files and reclaimed 0 B space
Result is that no files are de-duplicated.
I can possibly take a stab at a fix for this if I get some time.
I tested both on Windows in CMD as well as in Wine and it handles the "hamburger" emojis just fine. However, one thing in common in the problems reported above is PowerShell.
https://github.com/PowerShell/PowerShell/issues/15871
Looks like PowerShell additionally reinterprets the encoding when the content is piped between two programs. So fclones link doesn't get the same content that was output by fclones group.
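To make the suspected mechanism concrete, here is a minimal sketch of what such a re-encoding could look like. It assumes the shell decodes the UTF-8 bytes with a legacy OEM codepage (cp437 is used here purely as an example) and then re-encodes the text as ASCII for the next program. Neither encoding is confirmed in this thread, and `simulate_pipe` is a hypothetical helper, not part of fclones or PowerShell.

```python
def simulate_pipe(text, shell_codepage="cp437", forward_encoding="ascii"):
    """Model a shell that decodes a program's UTF-8 output and re-encodes it.

    Both encodings are assumptions chosen for illustration.
    """
    utf8_bytes = text.encode("utf-8")            # bytes written by the first program
    misread = utf8_bytes.decode(shell_codepage)  # shell decodes them with a legacy codepage
    forwarded = misread.encode(forward_encoding, errors="replace")  # unmappable chars become "?"
    return misread, forwarded


if __name__ == "__main__":
    misread, forwarded = simulate_pipe("🍔🍔🍔.txt")
    print(len(misread) - len(".txt"))   # 12: one character per UTF-8 byte of the three emojis
    print(forwarded.decode("ascii"))    # ????????????.txt
```

Under these assumptions each 4-byte emoji turns into four unrelated characters, and each of those becomes a question mark, which has the same shape as the `????????????.txt` in the warnings above. A `?` is not allowed in a Windows filename, which would explain os error 123.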
Weird issue. I'm okay with using cmd, but fclones group and fclones link would be the most-used combination of this program, so preferably link should be usable as a flag to the group command, so that no piping is necessary.
I'm not saying there is nothing to do here. I'm thinking about a workaround. There are a few things I need to try. Maybe adding a BOM on Windows would help. Or I could just escape all non-ASCII characters on Windows (or as an option).
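As a rough illustration of the escaping idea, the sketch below turns every non-ASCII character in a path into a `\u{...}` sequence and doubles backslashes so the result can be reversed. The escape syntax and the `escape_path` / `unescape_path` helpers are made up for this example and are not the format fclones uses.

```python
def escape_path(path: str) -> str:
    """Return an ASCII-only, reversible form of `path` (hypothetical format)."""
    out = []
    for ch in path:
        if ch == "\\":
            out.append("\\\\")                  # double backslashes so escapes stay unambiguous
        elif ord(ch) < 128:
            out.append(ch)                      # plain ASCII passes through unchanged
        else:
            out.append("\\u{%x}" % ord(ch))     # e.g. U+1F354 becomes \u{1f354}
    return "".join(out)


def unescape_path(escaped: str) -> str:
    """Invert escape_path."""
    out = []
    i = 0
    while i < len(escaped):
        if escaped[i] != "\\":
            out.append(escaped[i])
            i += 1
        elif escaped.startswith("\\\\", i):
            out.append("\\")
            i += 2
        elif escaped.startswith("\\u{", i):
            end = escaped.index("}", i)
            out.append(chr(int(escaped[i + 3:end], 16)))
            i = end + 1
        else:
            raise ValueError(f"invalid escape at offset {i}")
    return "".join(out)


if __name__ == "__main__":
    original = "C:\\Users\\c22\\Desktop\\DupeTest\\🍔🍔🍔.txt"
    escaped = escape_path(original)
    print(escaped)
    print(unescape_path(escaped) == original)
```

Because the escaped text is pure ASCII, it should come out unchanged from any re-encoding between ASCII-compatible codepages, at the cost of a report that is harder to read by eye.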
Good catch @pkolaczk. Turns out the issue was not what I first thought it would be, but your digging has helped me find a workaround that still allows a user to use PowerShell.
Run [Console]::OutputEncoding = [Text.UTF8Encoding]::UTF8 first.
I.e.:
Without:
PS C:\Users\c22\Desktop\DupeTest> fclones group . | Out-Default
[2023-02-15 11:32:26.973] fclones.exe: info: Started grouping
[2023-02-15 11:32:26.977] fclones.exe: info: Scanned 4 file entries
[2023-02-15 11:32:26.977] fclones.exe: info: Found 3 (12 B) files matching selection criteria
[2023-02-15 11:32:26.978] fclones.exe: info: Found 2 (8 B) candidates after grouping by size
[2023-02-15 11:32:26.978] fclones.exe: info: Found 2 (8 B) candidates after grouping by paths
[2023-02-15 11:32:26.988] fclones.exe: info: Found 2 (8 B) candidates after grouping by prefix
[2023-02-15 11:32:26.989] fclones.exe: info: Found 2 (8 B) candidates after grouping by suffix
[2023-02-15 11:32:26.990] fclones.exe: info: Found 2 (8 B) redundant files
# Report by fclones 0.29.3
# Timestamp: 2023-02-15 11:32:26.991 +1100
# Command: 'C:\Users\c22\.cargo\bin\fclones.exe' group .
# Base dir: C:\\Users\\c22\\Desktop\\DupeTest
# Total: 12 B (12 B) in 3 files in 1 groups
# Redundant: 8 B (8 B) in 2 files
# Missing: 0 B (0 B) in 0 files
718ac45146ab06cd8f7d7c20c1ea6d66, 4 B (4 B) * 3:
C:\\Users\\c22\\Desktop\\DupeTest\\🍔🍔🍔.txt
C:\\Users\\c22\\Desktop\\DupeTest\\😊😊😊.txt
C:\\Users\\c22\\Desktop\\DupeTest\\🤗🤗🤗.txt
With:
PS C:\Users\c22\Desktop\DupeTest> [Console]::OutputEncoding = [Text.UTF8Encoding]::UTF8
PS C:\Users\c22\Desktop\DupeTest> fclones group . | Out-Default
[2023-02-15 11:32:37.765] fclones.exe: info: Started grouping
[2023-02-15 11:32:37.770] fclones.exe: info: Scanned 4 file entries
[2023-02-15 11:32:37.770] fclones.exe: info: Found 3 (12 B) files matching selection criteria
[2023-02-15 11:32:37.771] fclones.exe: info: Found 2 (8 B) candidates after grouping by size
[2023-02-15 11:32:37.771] fclones.exe: info: Found 2 (8 B) candidates after grouping by paths
[2023-02-15 11:32:37.781] fclones.exe: info: Found 2 (8 B) candidates after grouping by prefix
[2023-02-15 11:32:37.781] fclones.exe: info: Found 2 (8 B) candidates after grouping by suffix
[2023-02-15 11:32:37.782] fclones.exe: info: Found 2 (8 B) redundant files
# Report by fclones 0.29.3
# Timestamp: 2023-02-15 11:32:37.783 +1100
# Command: 'C:\Users\c22\.cargo\bin\fclones.exe' group .
# Base dir: C:\\Users\\c22\\Desktop\\DupeTest
# Total: 12 B (12 B) in 3 files in 1 groups
# Redundant: 8 B (8 B) in 2 files
# Missing: 0 B (0 B) in 0 files
718ac45146ab06cd8f7d7c20c1ea6d66, 4 B (4 B) * 3:
C:\\Users\\c22\\Desktop\\DupeTest\\🍔🍔🍔.txt
C:\\Users\\c22\\Desktop\\DupeTest\\😊😊😊.txt
C:\\Users\\c22\\Desktop\\DupeTest\\🤗🤗🤗.txt
There seems to be a documented way to set your system to always use UTF-8 but it sounds like it could have potential compatibility issues.
I wonder if there is a way that fclones could a) detect it's running in PowerShell and b) set that property temporarily.
That might be asking too much, as this really seems more like a PowerShell issue.
Can this theoretically be solved by introducing the -i|--input parameter for link and other commands to specify the file explicitly instead of piping it into stdin?
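A minimal sketch of how such a parameter could behave, assuming the report was written to disk as UTF-8 by the producing program itself rather than by a shell redirect. The `-i|--input` flag is the one proposed above; `build_parser` and `read_report_lines` are hypothetical names, and none of this reflects the actual fclones command line.

```python
import argparse
import sys


def build_parser():
    """Parser for a hypothetical consumer of the report."""
    parser = argparse.ArgumentParser(prog="link-sketch")
    parser.add_argument("-i", "--input", default=None,
                        help="read the report from this file instead of stdin")
    return parser


def read_report_lines(input_path=None):
    """Read the report as raw bytes and decode it as UTF-8 ourselves."""
    if input_path is None:
        data = sys.stdin.buffer.read()      # bytes as delivered by the shell pipeline
    else:
        with open(input_path, "rb") as f:   # bytes exactly as stored on disk
            data = f.read()
    return data.decode("utf-8").splitlines()


if __name__ == "__main__":
    args = build_parser().parse_args()
    for line in read_report_lines(args.input):
        print(line)
```

Reading the file directly keeps the shell out of the data path, so the bytes that reach the consumer are the bytes that were written.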
Meanwhile, I found that the Use-RawPipeline module helps. For anyone with a similar issue, here's a temporary workaround: https://github.com/GeeLaw/PowerShellThingies/tree/master/modules/Use-RawPipeline
Setting [Console]::OutputEncoding, [Console]::InputEncoding, and $OutputEncoding, as well as changing the codepage, didn't help me for some reason.
The actual directory name is [π/3] DistorteD MoonlighT and [縺輔°縺阪??縺帙▽縲?縺セ縺医□] 豁サ縺ォ縺溘縺ェ縺. (Yeah that's really a filename) It seems like fclones couldn't recognize Unicode names here.
cargo install fclones, where cargo is installed with rustup.
Thanks