bdoms / tumblr_backup

A Python script for saving your Tumblr blog to your hard drive as HTML or CSV.

Unicode error when backing up #2

Closed zacs closed 13 years ago

zacs commented 13 years ago

Hi, I'm trying to back up my blog and running into the following error. It looks like an ordinary Unicode encoding error. I'll try forking the script and calling .encode() to convert to UTF-8 before the f.write or writer.writerow call...


zac@hosaka ../tumblrBackup $ python tumblr_backup.py zaschell
Getting basic information.
Getting posts 0 to 49.
Traceback (most recent call last):
  File "tumblr_backup.py", line 275, in <module>
    backup(account, use_csv)
  File "tumblr_backup.py", line 255, in backup
    savePost(post, save_folder, header=header)
  File "tumblr_backup.py", line 129, in savePost
    f.write("<blockquote>" + quote + "</blockquote>")
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2018' in position 42: ordinal not in range(128)
zac@hosaka ../tumblrBackup $ python tumblr_backup.py --csv=true zaschell
CSV mode activated.
Data will be saved to zaschell/zaschell.csv
Getting basic information.
Getting posts 0 to 49.
Traceback (most recent call last):
  File "tumblr_backup.py", line 275, in <module>
    backup(account, use_csv)
  File "tumblr_backup.py", line 253, in backup
    savePost(post, save_folder, use_csv=use_csv, save_file=save_file)
  File "tumblr_backup.py", line 190, in savePost
    writer.writerow(row)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2018' in position 30: ordinal not in range(128)
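
A minimal sketch of the workaround described above, assuming the failing values are Unicode strings such as the quote containing u'\u2018'. The helper names `to_utf8` and `encode_row` and the sample data are hypothetical and are not taken from tumblr_backup.py. The idea is to encode text to UTF-8 bytes explicitly before it reaches f.write or writer.writerow, so Python 2 never falls back to the ascii codec. On Python 3 the usual approach is instead to open the file with an explicit encoding.

```python
# -*- coding: utf-8 -*-
import io

TEXT_TYPE = type(u"")  # unicode on Python 2, str on Python 3


def to_utf8(value):
    """Encode text to UTF-8 bytes; return any other value unchanged."""
    if isinstance(value, TEXT_TYPE):
        return value.encode("utf-8")
    return value


def encode_row(row):
    """Encode every text cell of a CSV row (hypothetical helper)."""
    return [to_utf8(cell) for cell in row]


# Sample data standing in for a Tumblr quote post (an assumption).
quote = u"\u2018quoted\u2019"

# A binary buffer stands in for the output file opened in "wb" mode.
buf = io.BytesIO()
buf.write(to_utf8(u"<blockquote>" + quote + u"</blockquote>"))

row = encode_row([u"quote", quote, 42])
```

With this in place, the HTML path writes bytes directly, and on Python 2 the encoded `row` can be passed to writer.writerow without triggering the implicit ascii encode.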

bdoms commented 13 years ago

Yeah, that's a problem. I swear I'm usually better about catching Unicode bugs, but I guess I just never encountered them with Tumblr until now.

Anyway, it should be fixed in eca52973247491b857ba6e2d71f22147ec134d9e. I also added a meta tag with the charset to the HTML so that everything continues to display correctly in the browser.
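
For illustration only, here is one way the two parts of that fix could fit together: write the file as UTF-8 and declare the same charset in the HTML head so the browser decodes it correctly. This is a sketch under assumptions, not the contents of the commit; `HEADER`, `write_post`, and the sample text are hypothetical names.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

# Hypothetical page header; the meta tag tells the browser the file is UTF-8.
HEADER = u'<html><head><meta charset="utf-8"></head><body>\n'


def write_post(path, body):
    """Write a quote post as UTF-8 HTML (hypothetical helper)."""
    # io.open takes an explicit encoding on both Python 2 and Python 3,
    # so non-ASCII characters never pass through the ascii codec.
    with io.open(path, "w", encoding="utf-8") as f:
        f.write(HEADER)
        f.write(u"<blockquote>" + body + u"</blockquote>\n")
        f.write(u"</body></html>\n")


path = os.path.join(tempfile.mkdtemp(), "post.html")
write_post(path, u"\u2018hello\u2019")
```

The encoding used to write the file and the charset named in the meta tag must match; if they differ, the curly quotes are saved correctly but display as garbage.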