cjnaz / rclonesync-V2

A Bidirectional Cloud Sync Utility using rclone
MIT License

Python 2.7 Filters file problem #9

Closed erakli closed 6 years ago

erakli commented 6 years ago

When run with Python 2.7.12 and a Filters file enabled, rclonesync-V2 fails with this error:

2018-08-08 16:33:33,707:  Using filters-file  </home/egor-home/git/rclonesync-V2/Filters>
Traceback (most recent call last):
  File "/home/egor-home/git/rclonesync-V2/rclonesync.py", line 653, in <module>
    status = bidirSync()
  File "/home/egor-home/git/rclonesync-V2/rclonesync.py", line 72, in bidirSync
    current_file_hash = hashlib.md5(ifile.read().replace("\r", "").encode('utf-8')).hexdigest()
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 588: ordinal not in range(128)

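When run with Python 3.6.5, everything works.

The reason is that in 2.7 ifile.read() returns a byte string, and calling .encode() on a byte string first decodes it implicitly as ascii, which fails on any byte above 0x7F.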
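A minimal sketch of that failure mode, written so it runs on Python 3 as well. The helper `implicit_py2_encode` is a hypothetical name used only for illustration; it mimics the implicit ascii decode that Python 2 performs when `.encode()` is called on a byte string, and the sample line is taken from a typical filter rule rather than from the actual file.

```python
# -*- coding: utf-8 -*-
# Bytes as they would be stored on disk in a UTF-8 filters file.
raw = u"- /Поездки/**\n".encode("utf-8")


def implicit_py2_encode(data, encoding):
    """Mimic Python 2's str.encode(): ascii-decode first, then encode."""
    return data.decode("ascii").encode(encoding)


failed = False
message = ""
try:
    implicit_py2_encode(raw, "utf-8")
except UnicodeDecodeError as exc:
    # The decode step fails before the target encoding is ever consulted.
    failed = True
    message = str(exc)

print(message)
```

Because the error is raised in the decode step, the encoding name passed to `.encode()` never matters, so swapping 'utf-8' for another codec would not be expected to help.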

Commit used c8360ebcc160765e0d80aa70e0df73aa47ef75d5.

cjnaz commented 6 years ago

I have very little experience with Python 3.x, and I hacked around a bit to find a file read that would work on both 2.7 and 3.x, but I only tested it with ascii (< 0x7F) file contents. I assume you have extended ascii content in your filters file.

Do you have a suggestion on how this should be coded?

Hunting for proper solutions, I find https://stackoverflow.com/questions/42795042/how-to-cast-a-string-to-bytes-without-encoding. Please try changing the ('utf-8') to ('latin1') and checking on both Python versions. I'm looking for a consistent hash of the file on both versions.

Thanks! cjn

cjnaz commented 6 years ago

@erakli, did you try this with the latin1 edit? Can I close this issue?

erakli commented 6 years ago

@cjnaz, I'm awfully sorry. I forgot to test it. :)

The latin1 hack has no effect:

2018-09-18 00:26:32,495:  Using filters-file  </home/egor-home/git/rclonesync-V2/Filters>
Traceback (most recent call last):
  File "/home/egor-home/git/rclonesync-V2/rclonesync.py", line 653, in <module>
    status = bidirSync()
  File "/home/egor-home/git/rclonesync-V2/rclonesync.py", line 72, in bidirSync
    current_file_hash = hashlib.md5(ifile.read().replace("\r", "").encode('latin1')).hexdigest()
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 588: ordinal not in range(128)

Your assumption is correct. My Filters file is UTF-8 encoded and contains Russian characters. For example, this is part of my Filters file:

# files that satisfies this pattern will be excluded
- /Расшаренное/**
- /Lightroom/**
- /iPad 1/**
- /Поездки/**

I think I know how to fix it.

cjnaz commented 6 years ago

Interesting. Thanks for the filters file snip. I'll see if I can find a way to hash it successfully with the same hash result on both 2.7 and 3.x. Got any ideas?

erakli commented 6 years ago

Please see PR #11 and test it on Windows.

I think that if we need to compute something over bytes, we should simply open the file in binary mode! :)
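As a rough sketch of the binary-mode idea (not necessarily the exact change in the PR), the file can be opened with "rb" and hashed as raw bytes, so no codec is involved on either Python version. The function name `filters_file_md5` and the temporary files are assumptions made for this example.

```python
# -*- coding: utf-8 -*-
import hashlib
import os
import tempfile


def filters_file_md5(path):
    """MD5 of the file's raw bytes, with carriage returns stripped."""
    with open(path, "rb") as ifile:
        data = ifile.read()
    # Strip b"\r" so CRLF and LF copies of the same file hash identically.
    return hashlib.md5(data.replace(b"\r", b"")).hexdigest()


def _write_temp(data):
    """Write bytes to a temporary file and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return path


# The same non-ascii rule saved with Unix and Windows line endings.
lf_path = _write_temp(u"- /Поездки/**\n".encode("utf-8"))
crlf_path = _write_temp(u"- /Поездки/**\r\n".encode("utf-8"))

lf_hash = filters_file_md5(lf_path)
crlf_hash = filters_file_md5(crlf_path)
same = lf_hash == crlf_hash

os.remove(lf_path)
os.remove(crlf_path)
```

Since `b"..."` literals and `open(path, "rb")` behave the same on 2.7 and 3.x, this should give one consistent hash across versions and line-ending styles.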

cjnaz commented 6 years ago

Merged. Thank you.