t413 / SMS-Tools

Import / Export / Merge tool for your Android/iOS/GV text message history.
t413.com/SMS-Tools

Unicode encoding for sms #14

Open FilLupin opened 9 years ago

FilLupin commented 9 years ago

Hi, first of all thank you for your script; a script like this one should be very helpful to me.

Trying to convert my iOS 6 sms.db to Android, I get this output when I launch "./bin/smstools --type android sms.db mmssms.db":

Traceback (most recent call last):
  File "./bin/smstools", line 59, in
    print " " + term.blue(smstools.truncate(new_texts[-1].body))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 42: ordinal not in range(128)

Some SMS contain French characters with accents.

FilLupin commented 9 years ago

Because this is a Unicode issue, adding .encode("utf-8") should solve it. After looking into the code, it seems that, in addition to changing line 59 of ./bin/smstools into

print " " + term.blue(smstools.truncate(new_texts[-1].body.encode("utf-8")))
texts.extend(new_texts)

we should change line 25 of ./smstools/android.py into

txt = core.Text(num=row[0],date=long(row[1]),incoming=(row[2]==2),body=row[3].encode("utf-8"))

I made a few attempts, but I do not know the Android database format well, so I am not sure how to interpret my results...
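
The error and the proposed fix can be reproduced in isolation. The following is only a minimal sketch, written so that it also runs on Python 3 (the script in this thread is Python 2); the helper name to_terminal_bytes is hypothetical and not part of smstools.

```python
# -*- coding: utf-8 -*-

def to_terminal_bytes(text, encoding="utf-8"):
    """Encode text explicitly instead of relying on an ASCII default.

    Hypothetical helper for illustration; not part of smstools.
    """
    return text.encode(encoding)


# A message body containing U+00E9, the character named in the traceback.
body = u"caf\xe9"

# Encoding with the ASCII codec fails exactly like the reported error.
try:
    body.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding with UTF-8 succeeds and yields two bytes for the accented letter.
utf8_bytes = to_terminal_bytes(body)
```

Note that this only addresses what is sent to the terminal; whether the body stored in the database should also be encoded is a separate question.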

t413 commented 9 years ago

From your error, it looks like the problem is actually with the debug printing to the terminal. Try taking out line 59 of ./bin/smstools altogether and running it again.

FilLupin commented 9 years ago

Yes, that is another solution, but I think the first part of my fix preserves the information your script prints, which is useful because it lets me check that the script took the last SMS into account. I am not sure the whole script works with only this modification, because I did not succeed in pushing my SMS to my encrypted Replicant phone; I am now looking for the exact way to do this. Thanks anyway for your answer and your script.

FilLupin commented 9 years ago

Bumping this, because I noticed that my earlier messages had been hidden by GitHub.

FilLupin commented 9 years ago

Perhaps it is due to mmssms.db-journal, which is included in a stock Android install (without any SMS) but is not generated by the script...

FilLupin commented 9 years ago

I think I understand (one of) the issue(s): a date_sent field should exist in the sms and pdu tables (it exists in empty mmssms.db files but not in the files generated by the smstools script).

t413 commented 9 years ago

Can you test whether including that column in the database makes the phone happy? I don't have any working Android phones at the moment.

FilLupin commented 9 years ago

Yes, I tried to work out where the SQL statements on the pdu and sms tables have to be modified. I made some modifications (creating, inserting, and updating date_sent with the same content as date), but they do not seem to take effect: the generated pdu and sms tables do not include date_sent fields, probably because I am not experienced enough with Python.
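
For reference, adding such a column and copying date into it can be sketched with the standard sqlite3 module. This is an illustration only: the sms table below is a heavily simplified, hypothetical stand-in for the real Android schema, the column type and default are assumptions, and nothing here confirms that Android will accept the result.

```python
import sqlite3

# In-memory database with a simplified, hypothetical sms table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sms (_id INTEGER PRIMARY KEY, address TEXT, date INTEGER, body TEXT)"
)
conn.execute(
    "INSERT INTO sms (address, date, body) VALUES (?, ?, ?)",
    ("+11234567890", 12345, u"caf\xe9"),
)


def column_names(conn, table):
    """Return the column names of a table using PRAGMA table_info."""
    # Each row describes one column; index 1 holds the column name.
    return [row[1] for row in conn.execute("PRAGMA table_info(%s)" % table)]


# Add date_sent only if it is missing, then fill it with the value of date.
if "date_sent" not in column_names(conn, "sms"):
    # Assumption: an INTEGER column defaulting to 0.
    conn.execute("ALTER TABLE sms ADD COLUMN date_sent INTEGER DEFAULT 0")
    conn.execute("UPDATE sms SET date_sent = date")
conn.commit()
```

Checking column_names(conn, "sms") afterwards is a quick way to confirm that the change actually reached the generated file.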

FilLupin commented 9 years ago

mmssms.db does not seem to be accepted by Android, even after adding date_sent fields to the sms and pdu tables. The generated structure seems consistent with the initial db that came with my phone (it seems to depend on the phone: http://az4n6.blogspot.de/2013/02/finding-and-reverse-engineering-deleted_1865.html).

Perhaps it is because only some tables (android_metadata, sms, threads, and canonical_addresses) are populated... Do you have any documentation of the Android and iOS db structure?
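
One way to compare which tables are populated in two database files is to count the rows in every table. The sketch below uses the standard sqlite3 module; the function name table_row_counts and the two demo tables are hypothetical, and on a real file you would pass a connection opened on mmssms.db instead.

```python
import sqlite3


def table_row_counts(conn):
    """Map each table name in the database to its number of rows."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        )
    ]
    # Table names are quoted because they cannot be bound as parameters.
    return {
        name: conn.execute('SELECT COUNT(*) FROM "%s"' % name).fetchone()[0]
        for name in names
    }


# Tiny demo database: one populated table and one empty table.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE sms (body TEXT)")
demo.execute("CREATE TABLE pdu (date INTEGER)")
demo.execute("INSERT INTO sms VALUES ('hello')")
counts = table_row_counts(demo)
```

Running the same function on the phone's original file and on the generated one would show which tables differ.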

FilLupin commented 9 years ago

I can push my modifications if you want, but they do not make the generated mmssms.db recognized by the phone. Just let me know...

fyears commented 9 years ago

Maybe I have a similar issue. I have some international text, i.e. non-ASCII characters, in my messages.

After running smstools --type json ios-sms.db output.json, I got something like:

[
    {
        "body": "\u041b",
        "chatroom": null, 
        "date": 12345, 
        "incoming": true, 
        "members": null, 
        "num": "+11234567890"
    }
]

which is python-specific (and incorrect).

I suggest saving all the non-ASCII characters as UTF-8, like this:

[
    {
        "body": "Л",
        "chatroom": null, 
        "date": 12345, 
        "incoming": true, 
        "members": null, 
        "num": "+11234567890"
    }
]
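
With Python's standard json module, the difference between the two outputs comes down to the ensure_ascii argument of json.dumps. The sketch below assumes nothing about how smstools writes its JSON; the record is a cut-down version of the example above. As far as the JSON format goes, both spellings decode to the same string, so the choice mainly affects readability.

```python
import json

# Cut-down record using the Cyrillic letter from the example above.
record = [{"body": u"\u041b", "num": "+11234567890"}]

# Default behaviour: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(record)

# With ensure_ascii=False the character is kept as-is in the output string.
literal = json.dumps(record, ensure_ascii=False)
```

When writing the literal form to a file, the file also has to be opened with a UTF-8 encoding, for example via io.open(path, "w", encoding="utf-8").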

varunpalekar commented 7 years ago

Just add the lines below at the start of the main smstools file, typically at /usr/local/bin/smstools:

import sys
reload(sys)
sys.setdefaultencoding('utf-8')
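
Note that sys.setdefaultencoding changes the default for the whole process and does not exist in Python 3. A narrower alternative is to wrap only the output stream in a UTF-8 writer. The sketch below demonstrates the idea on an in-memory byte buffer standing in for stdout; applying it to the real sys.stdout in smstools is an untested assumption.

```python
import codecs
import io

# A byte buffer standing in for a byte-oriented stdout.
buffer = io.BytesIO()

# codecs.getwriter returns a StreamWriter class for the codec; wrapping the
# stream makes every write() encode its text as UTF-8 first.
writer = codecs.getwriter("utf-8")(buffer)
writer.write(u"caf\xe9")

written = buffer.getvalue()
```

In a Python 2 script the same wrapper would typically be applied as sys.stdout = codecs.getwriter("utf-8")(sys.stdout).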