filippz / telegram_import

Import back messages exported from Telegram into JSON format

getting error #2

Closed lorvent closed 9 months ago

lorvent commented 10 months ago

Hello, I am trying to use this but I always get this error.

Traceback (most recent call last):
  File "C:\laragon\bin\python\python-3.6.1\lib\site-packages\pandas\core\indexes\base.py", line 2898, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas\_libs\index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\index.pyx", line 101, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\hashtable_class_helper.pxi", line 1675, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas\_libs\hashtable_class_helper.pxi", line 1683, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'media_type'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "telegram_import.py", line 241, in <module>
    test_only=args.test_only
  File "telegram_import.py", line 177, in import_history
    messages, file_list = convert_to_whatsapp_format(data, only_first_n_messages)
  File "telegram_import.py", line 44, in convert_to_whatsapp_format
    media_type = row["media_type"]
  File "C:\laragon\bin\python\python-3.6.1\lib\site-packages\pandas\core\series.py", line 882, in __getitem__
    return self._get_value(key)
  File "C:\laragon\bin\python\python-3.6.1\lib\site-packages\pandas\core\series.py", line 990, in _get_value
    loc = self.index.get_loc(label)
  File "C:\laragon\bin\python\python-3.6.1\lib\site-packages\pandas\core\indexes\base.py", line 2900, in get_loc
    raise KeyError(key) from err
KeyError: 'media_type'

can you please help?

thanks.

filippz commented 10 months ago

The error could be due to the fact that the chat simply doesn't have any "non text" messages - then the resulting result.json wouldn't have the media_type key defined. Can you check whether you have actually exported media/files by selecting everything in the Media export settings, with the maximum size, as noted in README.md?
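
One way to make the lookup tolerant of exports without media is to read the key defensively. The following is a minimal sketch and not the project's actual code: the media_type_of helper and the sample messages (including the "video_file" value) are hypothetical, and it only assumes that rows come from df.iterrows() as shown in the traceback.

```python
import pandas as pd

# Hypothetical text-only export: no message carries a "media_type" key,
# so the resulting DataFrame has no such column at all.
text_only = pd.DataFrame([
    {"id": 1, "type": "message", "text": "hello"},
    {"id": 2, "type": "message", "text": "world"},
])

# Hypothetical mixed export: the column exists, but is NaN for text messages.
mixed = pd.DataFrame([
    {"id": 1, "type": "message", "text": "hello"},
    {"id": 2, "type": "message", "text": "", "media_type": "video_file"},
])


def media_type_of(row):
    """Return the row's media type, or None when it is absent or NaN."""
    value = row.get("media_type")  # Series.get returns None for a missing label
    return None if pd.isna(value) else value


text_only_types = [media_type_of(row) for _, row in text_only.iterrows()]
mixed_types = [media_type_of(row) for _, row in mixed.iterrows()]
print(text_only_types, mixed_types)
```

With row["media_type"] the first DataFrame would raise the KeyError from the traceback, while row.get(...) lets the caller treat the message as plain text.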

lorvent commented 9 months ago

Hello, sorry for delay in response.

my total backup size is more than 600MB :) which includes images, videos, audio files, gifs, stickers etc.

so all kinds of media types are definitely included.

can you please check if it is an issue with anything else?

filippz commented 9 months ago

I've exported one of my chats and tried importing it again - no problems were encountered. I guess you have one or more messages that are unusual in some way - I would suggest adding the following line just after for index, row in df.iterrows(): and before is_photo = False in the convert_to_whatsapp_format function:

print("Processing message id: {id}".format(id=row["id"]))

This should print each message id as it is processed and should point us to the message that breaks the import process. Please try to find the offending message by looking for that id in the result.json - if you do find it, then please remove any sensitive information from it (like text) and paste it here (starting with { "id": and ending with },) so I can try to figure out what's wrong with the code. Typically it looks something like this:

{
    "date": "SomeDate",
    "date_unixtime": "SomeUnixDate",
    "from": "Someone",
    "from_id": "SomeUserId",
    "id": SomeMessageId,
    "text": "SomeText",
    "text_entities": [
        {
            "text": "SomeText",
            "type": "plain"
        }
    ],
    "type": "message"
}
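
Looking up the message and blanking its text can also be scripted. This is a minimal sketch under assumptions: find_message and redact are hypothetical helpers, the inline sample stands in for the contents of result.json, and the only structure assumed is a top-level "messages" array with entries shaped like the example above.

```python
import json

# Hypothetical stand-in for the parsed contents of result.json.
export = json.loads("""
{
  "messages": [
    {"id": 1, "type": "message", "from": "Someone", "text": "first",
     "text_entities": [{"type": "plain", "text": "first"}]},
    {"id": 2, "type": "message", "from": "Someone", "text": "second",
     "text_entities": [{"type": "plain", "text": "second"}]}
  ]
}
""")


def find_message(export, message_id):
    """Return the message with the given id, or None if it is not present."""
    for message in export.get("messages", []):
        if message.get("id") == message_id:
            return message
    return None


def redact(message):
    """Return a copy of the message with its text replaced by a placeholder."""
    redacted = dict(message)
    if "text" in redacted:
        redacted["text"] = "SomeText"
    if "text_entities" in redacted:
        redacted["text_entities"] = [
            {**entity, "text": "SomeText"} for entity in redacted["text_entities"]
        ]
    return redacted


# 2 is a sample id; use the last id printed before the crash instead.
offending = redact(find_message(export, 2))
print(json.dumps(offending, indent=4))
```

Other fields such as "from" may also be sensitive and would need the same treatment before pasting.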

lorvent commented 9 months ago

Hello, that problem is actually gone now (i exported new data to test)

and now i am getting another error

Processing message id: 372626
Staring import
Upload files mentioned in messages
Uploading (File unavailable, please try again later):   2%|▊                                     | 1/44 [00:01<00:58,  1.35s/it][WinError 2] The system cannot find the file specified: 'C:\\laragon\\www\\telegram_import\\ChatExport_2023-11-30\\(File unavailable, please try again later)'
Uploading (File unavailable, please try again later):   2%|▊                                     | 1/44 [00:01<00:59,  1.38s/it]

also i have another doubt, if i export user1's data, can i export it to user2? like does telegram check in any way whether i am exporting to the same user or not?

thanks.

filippz commented 9 months ago

Uploading (File unavailable, please try again later): 2%|▊ | 1/44 [00:01<00:58, 1.35s/it][WinError 2] The system cannot find the file specified: 'C:\\laragon\\www\\telegram_import\\ChatExport_2023-11-30\\(File unavailable, please try again later)'

The new error seems odd - as if Telegram Client wasn't able to download the file from the Telegram servers without any specific reason. I guess you'll need to try exporting it again with the same settings and see if it behaves differently. Also, you could take a look at the result.json and see if the problem is just that one file or there are more of them. Depending on that you could decide to skip the problematic message/messages if you're OK with that.
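
Checking how many messages are affected does not have to be done by hand. The sketch below is an assumption-laden illustration: has_unavailable_file is a hypothetical helper, the sample messages are made up (the "file" key in particular is an assumption), and the only fact taken from the log above is the placeholder string that appears where a path was expected.

```python
# Placeholder text copied from the upload error above.
PLACEHOLDER = "(File unavailable, please try again later)"

# Hypothetical stand-in for the parsed contents of result.json.
export = {
    "messages": [
        {"id": 1, "type": "message", "text": "hi"},
        {"id": 2, "type": "message", "photo": "photos/photo_1.jpg", "text": ""},
        {"id": 3, "type": "message", "file": PLACEHOLDER, "text": ""},
    ]
}


def has_unavailable_file(message):
    """True if any top-level value of the message is the placeholder text."""
    return any(value == PLACEHOLDER for value in message.values())


# Ids of messages whose attachment was not downloaded during export.
affected_ids = [m["id"] for m in export["messages"] if has_unavailable_file(m)]

# Messages that remain if the affected ones are skipped.
kept = [m for m in export["messages"] if not has_unavailable_file(m)]

print("affected:", affected_ids, "kept:", len(kept))
```

Scanning every top-level value avoids having to know which key holds the path for each media type.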

also i have another doubt, if i export user1's data, can i export it to user2?

If you're asking about importing - then yes, you can. You have to explicitly specify the chat to which you're importing by using --peer, as there is no way to figure out the actual Telegram user (if any) that the chat being imported is tied to by looking at the WhatsApp format we're emulating.

lorvent commented 9 months ago

this export is new, i just did a new export and tried it, and then it threw the above error.

my doubt is...it is looking for images in the root folder, whereas the images are in the photos folder. ( I am saying this because i exported a whatsapp chat and it puts the txt file and all images in the same folder) is that a reason?

thanks.

filippz commented 9 months ago

If you look at the result.json you will notice that filenames are pointing to the right subfolders - for example: "photo": "photos/photo_Number@IMageDateAndTime.jpg". My guess is that some of the files mentioned in the messages were not downloaded as they should have been, and that instead of the actual subfolder/filename.extension entry in result.json you have (File unavailable, please try again later). The import is then trying to upload a file named (File unavailable, please try again later) and fails. At that point you can either:

lorvent commented 9 months ago

Man.....awesomeeee 🥳 that worked, i skipped a few videos while exporting and that's why i got the error.

thank you so much.

now i have a (kinda weird) request that is not really related to this repo.

i exported a huge dump of a 750MB chat but i selected HTML as the format and then deleted the chat, is there any way to convert that html to json format? OR is there any chance of you supporting html format?

thank you so much again, you can't imagine how excited i am now....because of that chat deletion, i had a big war (which you can imagine ;) )

filippz commented 9 months ago

I'm glad that you got it working.

It was never my intention to support the HTML format, as JSON is much simpler to parse. I understand that you're now stuck with HTML only, which is not something I would consider a usual situation (as one could usually simply repeat the export to JSON). You could try to convert the HTML to JSON by using something like https://github.com/KanegaeGabriel/telegram-export-converter, which generates CSV, and then converting the resulting CSV to a JSON file similar to result.json. You could even try to adapt the mentioned tool to generate JSON directly instead of CSV, or try searching for some other, more suitable script/tool instead.
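
The CSV-to-JSON step could look roughly like the sketch below. Everything specific in it is an assumption: the column names (msg_id, sender, timestamp, content) are hypothetical and would need to be replaced with whatever the converter actually emits, csv_to_export is a made-up helper, and the output only mirrors the keys visible in the sample message earlier in this thread.

```python
import csv
import io
import json

# Hypothetical CSV; the real column names depend on the converter used.
csv_text = """msg_id,sender,timestamp,content
1,Someone,2023-11-30T10:00:00,hello
2,Someone Else,2023-11-30T10:01:00,"hi, there"
"""


def csv_to_export(csv_file):
    """Build a result.json-like dict from rows of the hypothetical CSV."""
    messages = []
    for row in csv.DictReader(csv_file):
        messages.append({
            "id": int(row["msg_id"]),
            "type": "message",
            "date": row["timestamp"],
            "from": row["sender"],
            "text": row["content"],
        })
    return {"messages": messages}


export = csv_to_export(io.StringIO(csv_text))
print(json.dumps(export, indent=4))
```

Media references and any other keys the import relies on are not covered here and would have to be mapped separately.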

lorvent commented 9 months ago

Ok.... I understand.

when i exported n then deleted the chat....i was not aware of JSON format availability too.

thank you so much for this as always ❤️

https://github.com/realdeveloperongithub/chat-history-organizer found this long back which supports html but it is throwing errors and dev is not active it seems.

also i assume there will be python scripts which can convert html to json....but i doubt whether they follow desired format or not.

lorvent commented 9 months ago

btw, i have another doubt.... does your package support only result.json, or result1.json etc. files too?

filippz commented 9 months ago

I never saw an export with anything but a single result.json - nonetheless, I would expect that you can simply extend the messages array in result.json with the messages from result1.json, result2.json... While looking for an HTML-to-Telegram solution, you might look for something that can generate the WhatsApp format from HTML, as Telegram supports importing it directly, and skip using telegram_import altogether (which only converts the Telegram JSON export to WhatsApp format on the fly so it can be imported back to Telegram). You might be able to find some code that supports converting chats from Telegram HTML to WhatsApp format - you would then just import the resulting file back to Telegram instead.
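
Combining several export files along those lines might be sketched as follows. This is untested against real multi-file exports: merge_exports is a hypothetical helper, the two dicts stand in for the parsed result.json and result1.json, and the only structure assumed is a top-level "messages" array in each file.

```python
import json


def merge_exports(exports):
    """Merge parsed export dicts into one, concatenating their messages.

    Metadata is taken from the first export; messages keep their file order.
    """
    if not exports:
        raise ValueError("at least one export is required")
    merged = dict(exports[0])
    merged["messages"] = [
        message
        for export in exports
        for message in export.get("messages", [])
    ]
    return merged


# Hypothetical stand-ins for the parsed result.json and result1.json.
first = {"name": "SomeChat", "messages": [{"id": 1, "text": "a"}]}
second = {"name": "SomeChat", "messages": [{"id": 2, "text": "b"}]}

merged = merge_exports([first, second])
print(json.dumps(merged, indent=4))
```

The merged dict could then be written out with json.dump as a single result.json for the import.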

lorvent commented 9 months ago

ok thanks...i will try.

i have few plans and i will try those.

thanks for your time.

since yours (out of many other repos) is successfully importing json to telegram...i want to convert to json first so that i can rely on your program to import into tg.

lorvent commented 9 months ago

btw, i tried a new export with both formats so it created 13 html files but only one json file.

whereas my original export has 141 html files.

lorvent commented 9 months ago

Also, any chance i can connect with you over telegram? it would be easier to discuss....

lorvent commented 9 months ago

btw....i found a small bug kind of...

when your script is importing, it is not considering "reply_to_message_id"

i.e. if we respond to a message, it should include that message as well, but while importing, your script is ignoring that.

an example below

Original chat image

image

How it looks in imported chat

image

image

is it possible to add that as well? does the import script/api have that feature?

thanks.

filippz commented 9 months ago

Telegram's exported JSON is quite detailed, but the whole reason this code exists is that Telegram doesn't import its own format, so we're converting that JSON to the WhatsApp text format that can be imported into Telegram. Since the WhatsApp text format is rudimentary, a lot of the fields included in the Telegram JSON simply can't be used in the conversion - a few notes are here: https://github.com/filippz/telegram_import/blob/cfb7fe699a2e039ef796bc0091218e1b515866b1/telegram_import.py#L20
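
To illustrate why a field like reply_to_message_id gets lost, here is a rough sketch of flattening one message into a WhatsApp-style text line. It is not the conversion used by telegram_import: to_whatsapp_line is hypothetical, the "date, time - sender: text" layout is only one of several layouts WhatsApp exports use, and the ISO-formatted "date" value is an assumption about the export.

```python
from datetime import datetime


def to_whatsapp_line(message):
    """Flatten one message into a single 'date, time - sender: text' line."""
    when = datetime.fromisoformat(message["date"])  # needs Python 3.7+
    stamp = when.strftime("%d/%m/%Y, %H:%M")
    return "{stamp} - {sender}: {text}".format(
        stamp=stamp, sender=message["from"], text=message["text"]
    )


# Hypothetical reply message; only "reply_to_message_id" comes from the thread.
reply = {
    "id": 2,
    "type": "message",
    "date": "2023-11-30T10:01:00",
    "from": "Someone",
    "text": "I agree",
    "reply_to_message_id": 1,
}

line = to_whatsapp_line(reply)
print(line)  # 30/11/2023, 10:01 - Someone: I agree
```

The line has room for a timestamp, a sender and the text, so the link to message 1 has nowhere to go.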

I'm not a WhatsApp user, so I don't know if WhatsApp even supports replying to specific messages and, if so, whether it somehow includes that information in exported chats. Also, https://core.telegram.org/api/import doesn't offer any details on what types of chat files can be imported, let alone the details of how they are parsed...

lorvent commented 9 months ago

Hello, I use whatsapp a lot and if you want, i can send a whatsapp export containing replies to responses.

will that give any clue?

filippz commented 9 months ago

Possibly - you can simply copy/paste just the part containing the first message and the reply to that specific message. If we can figure out if and how the second message references the first one we can then implement it into the code.

lorvent commented 9 months ago

I just tried importing a whatsapp chat to telegram and it didn't include message references, and in the txt file there are no references either.

but one big difference i found between your import and whatsapp is that.... your import puts messages on both sides (left n right) depending on whose message it is, whereas the whatsapp import puts all messages on the left side only.

so i thought you might be using the import api and not the whatsapp way...

so maybe we are out of luck here...