GAM-team / got-your-back

Got Your Back (GYB) is a command line tool for backing up your Gmail messages to your computer using Gmail's API over HTTPS.
https://github.com/GAM-team/got-your-back/wiki
Apache License 2.0
2.64k stars, 209 forks

UnicodeEncodeError: 'ascii' codec can't encode character '\U0001d000' in position 28781: ordinal not in range(128) #325

Closed jsight closed 2 years ago

jsight commented 3 years ago

The issue tracker is for reporting product deficiencies. "How do I" questions should be posted to the discussion forum at https://groups.google.com/group/got-your-back. When in doubt, start at the discussion forum and return here only when instructed to do so.

Please confirm the following:

Full steps to reproduce the issue:

  1. /home/jsight/bin/gyb/gyb --email --local-folder /home/jsight/backup/gmail/gotyourback/jessesightler_gmailcom/

Expected outcome (what are you trying to do?): Backup completion

Actual outcome (what errors or bad behavior do you see instead?):

Traceback (most recent call last):
  File "gyb.py", line 2429, in <module>
  File "gyb.py", line 1814, in main
  File "gyb.py", line 712, in callGAPI
  File "googleapiclient/_helpers.py", line 134, in positional_wrapper
  File "googleapiclient/http.py", line 1593, in execute
  File "googleapiclient/model.py", line 220, in response
  File "googleapiclient/model.py", line 282, in deserialize
  File "json/__init__.py", line 346, in loads
  File "json/decoder.py", line 337, in decode
  File "json/decoder.py", line 353, in raw_decode
json.decoder.JSONDecodeError: Invalid control character at: line 7 column 112929 (char 113019)
[762267] Failed to execute script gyb

Hello1024 commented 2 years ago

Does this error happen repeatedly, or was it a one-off occurrence? If it was a one-off, the fault might be on Google's end.

It seems really unlikely for a JSON error to happen so far into the response when using Google's official client...

Do you have some kind of "web firewall" proxy software that might be meddling with the connection between gyb and Google?
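
For context on the error class in the traceback, here is a minimal sketch of how Python's `json` module reacts to a raw control character inside a string value. This is not gyb or googleapiclient code, the payload and the `parse_lenient` helper are made up for illustration, and it does not confirm what actually caused the failure reported here.

```python
import json

# Hypothetical payload: a JSON string value containing a raw (unescaped)
# tab character, which strict JSON parsing rejects.
payload = '{"snippet": "col1\tcol2"}'


def parse_lenient(text):
    """Try strict parsing first; retry with strict=False if it fails."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        print(f"strict parse failed: {err}")
        # strict=False allows control characters inside JSON strings.
        return json.loads(text, strict=False)


result = parse_lenient(payload)
print(result["snippet"].split("\t"))
```

Whether a lenient parse is appropriate depends on where the control character came from; if a proxy altered the response, the rest of the payload may not be trustworthy either.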

jay0lee commented 2 years ago

Closing due to lack of detail and reproducibility. Feel free to re-open with more details.

zegor-mjol commented 2 years ago

It happened to me just now. A (large) Gmail account was backed up by GYB and was then being restored under a different account. The error message is:

Traceback (most recent call last):
  File "gyb.py", line 2532, in <module>
  File "gyb.py", line 2007, in main
  File "gyb.py", line 1781, in message_hygiene
  File "email/message.py", line 178, in as_bytes
  File "email/generator.py", line 116, in flatten
  File "email/generator.py", line 181, in _write
  File "email/generator.py", line 218, in _dispatch
  File "email/generator.py", line 268, in _handle_multipart
  File "email/generator.py", line 410, in write
UnicodeEncodeError: 'ascii' codec can't encode character '\ufffd' in position 5688: ordinal not in range(128)
[9100] Failed to execute script 'gyb' due to unhandled exception!
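
The traceback above ends in `as_bytes()`, and `'\ufffd'` is the Unicode replacement character (U+FFFD). The following is a minimal sketch of one way this kind of error can arise with the standard `email` package, plus a possible fallback. The message content and the `safe_as_bytes` helper are hypothetical; this is not gyb's code, and the real trigger in the failing message may differ.

```python
import email

# Hypothetical message: ASCII headers, body text containing U+FFFD.
RAW = (
    "From: sender@example.com\n"
    "Subject: test\n"
    "Content-Type: text/plain\n"
    "\n"
    "hi \ufffd there\n"
)


def safe_as_bytes(msg):
    """Hypothetical helper: serialize msg, tolerating non-ASCII str payloads."""
    try:
        return msg.as_bytes()
    except UnicodeEncodeError:
        # as_bytes() encodes text as ASCII; fall back to the str form
        # and encode it as UTF-8 ourselves.
        return msg.as_string().encode("utf-8")


msg = email.message_from_string(RAW)
data = safe_as_bytes(msg)
print(len(data), "bytes")
```

The fallback changes the bytes that would be uploaded (UTF-8 text without a matching charset declaration), so it is a way past the crash rather than a faithful round trip.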

The platform is macOS 11.6.2 running on an M1 laptop.

% ./gyb --version
Got Your Back 1.55
https://git.io/gyb
Jay Lee - jay0lee@gmail.com
Python 3.10.2 64-bit final
google-api-client 2.36.0
macOS-11.6.2-x86_64-i386-64bit x86_64
Path: /Volumes/GYB backups approx. 20220201/gyb-domain.org
ConfigPath: /Volumes/GYB backups approx. 20220201/gyb-domain.org
OpenSSL 3.0.1 14 Dec 2021
gmail.googleapis.com connects using TLSv1.3 TLS_AES_256_GCM_SHA384

Work-around:

  1. Run gyb until it crashes.
  2. Remove the --cleanup options and run the restore again (briefly) to get past the error-causing message (there is presumably a way to identify the message and handle it manually).
  3. Kill gyb.
  4. Run gyb again with the --cleanup options turned back on.

Some messages may miss the cleanup, but the rest should be restored correctly.
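
As for identifying the offending messages ahead of time, one possible approach is to scan the backup folder and try the same serialization step offline. This is a rough sketch under several assumptions: that the backup contains `.eml` files, that decoding them as UTF-8 with replacement approximates what happens during a restore, and that `find_unserializable` (a name invented here) is good enough as a filter. It may flag messages that gyb itself would handle fine.

```python
import email
from pathlib import Path


def find_unserializable(folder):
    """Return (path, error) pairs for .eml files whose parsed form fails as_bytes()."""
    bad = []
    for path in sorted(Path(folder).rglob("*.eml")):
        # Assumption: decode leniently, turning undecodable bytes into U+FFFD.
        text = path.read_bytes().decode("utf-8", errors="replace")
        try:
            email.message_from_string(text).as_bytes()
        except UnicodeEncodeError as err:
            bad.append((path, str(err)))
    return bad


# Example use (the folder path is a placeholder):
# for path, err in find_unserializable("/path/to/local-folder"):
#     print(path, err)
```

Any file it reports can then be inspected or edited by hand before running the restore with the cleanup options.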

jay0lee commented 2 years ago

What command were you using?

zegor-mjol commented 2 years ago

The command was (with personal info redacted):

./gyb --email [snip] --action restore --service-account --local-folder [snip] --cleanup --cleanup-date 'Thu, 1 Jan 1970 00:00:00 -0800' --cleanup-from 'GYB Restore <gyb-restore@gyb-restore.local>'

Removing the --cleanup-related options allowed the restoration to proceed. After a few messages had been restored, killing gyb and then invoking it with the --cleanup options again got us past the problem (until it recurred a few thousand messages later). Leaving --cleanup in place without the --cleanup-date and --cleanup-from options did not resolve the issue.

Another --cleanup problem occurred after the previous report. For the same command:

Traceback (most recent call last):
  File "gyb.py", line 2532, in <module>
  File "gyb.py", line 2007, in main
  File "gyb.py", line 1750, in message_hygiene
  File "email/utils.py", line 200, in parsedate_to_datetime
ValueError: Invalid date value or format "Fri, 9 Aug 2002"
[12483] Failed to execute script 'gyb' due to unhandled exception!

In this case the error identified the message's date, so finding the message file and hand-editing the offending date (adding a time stamp to it) got us past the message.
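
The hand-edit described above could in principle be automated. Below is a small sketch, not gyb's code: `parse_date_lenient` is a hypothetical helper, and treating a date-only header as midnight UTC is an assumption made purely for illustration.

```python
from email.utils import parsedate_to_datetime


def parse_date_lenient(value):
    """Parse an email Date value, tolerating a missing time of day."""
    try:
        return parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Older Python versions raise TypeError here, newer ones ValueError.
        # Assumption: a date-only value means midnight UTC, mirroring the
        # manual fix of appending a time stamp to the header.
        return parsedate_to_datetime(value.strip() + " 00:00:00 +0000")


print(parse_date_lenient("Fri, 9 Aug 2002"))
```

If the value is malformed in some other way, the second call still raises, so a caller would need to decide whether to skip the cleanup for that message or stop.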

Cheers,

Z