KnugiHK / WhatsApp-Chat-Exporter

A customizable Android and iOS/iPadOS WhatsApp database parser that will give you the history of your WhatsApp conversations in HTML and JSON. Android Backup Crypt12, Crypt14, Crypt15, and new schema supported.
https://wts.knugi.dev/
MIT License
518 stars 76 forks source link

Avatars in the html #48

Open kintaro1981 opened 1 year ago

kintaro1981 commented 1 year ago

Hello, I noticed this in the HTML source:

<div class="w3-row">
    <div class="w3-col m2 l2"><img src="WhatsApp/Avatars/120363023062520713@g.us.j" onerror="this.style.display='none'"></div>
    <div class="w3-col m10 l10">
        <div style="text-align: left;">
        Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec sodales tincidunt leo vel tempus. Nunc justo nibh, dictum et lorem non, pulvinar consequat orci. Duis euismod hendrerit tortor, ut iaculis libero tempor id. Ut volutpat, nulla vitae interdum tristique, massa sapien luctus lectus, sed porttitor ex metus a dui.                  
        </div>
    </div>
</div>

But in my iOS db export I can't find the Avatars directory.

KnugiHK commented 1 year ago

I also have no idea where iOS WhatsApp stores the avatars for now. Any comments on this are welcome.

KnugiHK commented 1 year ago

I looked at the iOS backup again and found where the avatars are stored! I will work towards including the avatars for iOS.

kintaro1981 commented 1 year ago

It's working, but with some issues. I found three of them.

  1. In groups, it displays the group avatar instead of the other users' avatars.

  2. In 1 on 1 chats, I noticed that in some cases it displays old avatars (cached in some way by WhatsApp?) and not the latest one. Maybe if a user has more than one profile image, it uses the first and not the latest?

     Example: a user has two profile images: 393475555501-1375641367-1375641782.jpg (I think it's from a group this user belongs to) and 393475555501-1668360926.jpg (the actual WhatsApp avatar). In the 1 on 1 chat it displays the first instead of the latest.

  3. Another thing I noticed is that in some chats, clicking on the avatar displays a different picture (for example, one from a group the user belongs to), because the .thumb and the .jpg are different. Example:

                <div class="w3-row">
                    <div class="w3-left pad-right-10 name">
                        Juan Juanito
                    </div>
                    <div class="w3-right-align blue">17:27</div>
                </div>
                <div class="w3-row">
                    <div class="w3-col m2 l2">
                            <a href="AppDomainGroup-group.net.whatsapp.WhatsApp.shared/Media/Profile/393395926709-1582015021-1645176973.jpg"><img src="AppDomainGroup-group.net.whatsapp.WhatsApp.shared/Media/Profile/393395926709-1450980713.thumb" onerror="this.style.display='none'" class="avatar"></a>
                    </div>
                    <div class="w3-col m10 l10">
                        <div class="w3-left-align">
                            Hello, ok, no problem!
                        </div>
                    </div>
                </div>
KnugiHK commented 1 year ago
  1. I also noticed the group chat issue. There may be another fix for that later. Assuming there is no fix for it in the foreseeable future, would you prefer to (a) keep showing the group avatar or (b) not show an avatar in group chats?
  2. Could you try to replace the code in extract_iphone.py lines 43-47 with the following and see if the correct avatar is shown?
            for avatar in avatars:
                if avatar.endswith(".thumb") and data[content["ZCONTACTJID"]].their_avatar_thumb is None:
                    data[content["ZCONTACTJID"]].their_avatar_thumb = avatar
                elif avatar.endswith(".jpg") and data[content["ZCONTACTJID"]].their_avatar is None:
                    data[content["ZCONTACTJID"]].their_avatar = avatar
  3. Yes, it is possible for the .thumb and the .jpg to be different. The current implementation just iterates over all candidate files, and it has no way to tell whether the thumbnail and the avatar contain the same image. I also want to avoid adding unnecessary dependencies to the exporter.
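One dependency-free way to reduce mismatched thumbnail/avatar links is to pair files by name rather than by content. The sketch below is only an illustration, not the exporter's code: `pair_by_stem` and `matched_pairs` are hypothetical helpers, and it assumes that a .thumb and a .jpg sharing the same file stem belong to the same picture, which this thread does not confirm.

```python
import os


def pair_by_stem(avatars):
    """Group avatar file names by stem: {stem: {"jpg": name, "thumb": name}}."""
    pairs = {}
    for avatar in avatars:
        stem, ext = os.path.splitext(avatar)
        if ext in (".jpg", ".thumb"):
            # ext[1:] drops the leading dot so the keys are "jpg" / "thumb".
            pairs.setdefault(stem, {})[ext[1:]] = avatar
    return pairs


def matched_pairs(avatars):
    """Return (jpg, thumb) tuples for stems that have both files, sorted by stem."""
    pairs = pair_by_stem(avatars)
    return [
        (files["jpg"], files["thumb"])
        for stem, files in sorted(pairs.items())
        if "jpg" in files and "thumb" in files
    ]
```

Stems that have only one of the two files are left out of `matched_pairs`, so the caller still has to decide what to do with them (for example, use the .jpg for both the link and the image).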
kintaro1981 commented 1 year ago
  1. If it's not possible to display the other users' avatars I think it is better to keep the group avatar at least.
  2. and 3. This solved it, but only in some cases. I noticed the following:

In the next case, for the 1 on 1 chat with 393381961111, these are the candidates:

393381961111-1362753161-1362753674.thumb <= it's a GROUP avatar
393381961111-1485804964-1485805039.jpg <= it's a GROUP avatar
393381961111-1485804964-1485805039.thumb <= it's a GROUP avatar
393381961111-1666265085.jpg <= This is the right one for 1on1 chat

The right one is: 393381961111-1666265085.jpg

The matching .thumb doesn't exist, and wtsexporter wrongly chooses to use:

393381960328-1362753161-1362753674.thumb as the avatar and 393381960328-1485804964-1485805039.jpg as the link to the avatar.

The next case is similar to the previous one but the .thumb exists:

393469510625-1544247670-1544248176.thumb <= it's a GROUP avatar
393469510625-1544247670-1544248176.jpg <= it's a GROUP avatar
393469510625-1528113443.thumb  <= This is the right one for 1on1 chat
393469510625-1514889781-1514889782.jpg <= it's a GROUP avatar

but wtsexporter again chooses a group image:

393469510625-1544247670-1544248176.thumb as the avatar and 393469510625-1544247670-1544248176.jpg as the link to the avatar.

A more "complicated" one:


393471412345-1354147243-1390210230.jpg <= it's a GROUP avatar
393471412345-1354147243-1390210230.thumb <= it's a GROUP avatar
393471412345-1357809601-1384804167.jpg <= it's a GROUP avatar
393471412345-1357809601-1384804167.thumb <= it's a GROUP avatar
393471412345-1359446549-1426112043.jpg <= it's a GROUP avatar
393471412345-1359446549-1426112043.thumb <= it's a GROUP avatar
393471412345-1371244762-1371534399.thumb <= it's a GROUP avatar
393471412345-1375165184-1375479277.jpg <= it's a GROUP avatar
393471412345-1375165184-1375479277.thumb <= it's a GROUP avatar
393471412345-1441891779.jpg  <= it's an OLD 1on1 CHAT avatar
393471412345-1444585598-1444585600.jpg <= it's a GROUP avatar
393471412345-1444585598-1444585600.thumb <= it's a GROUP avatar
393471412345-1463548569-1463548570.jpg <= it's a GROUP avatar
393471412345-1463548569-1463548570.thumb <= it's a GROUP avatar
393471412345-1463548569-393471412345-1463548569.jpg <= it's a GROUP avatar
393471412345-1471411256-393471412345-1471411256.jpg <= it's a GROUP avatar
393471412345-1471411257-1471411258.jpg <= it's a GROUP avatar
393471412345-1483696658.jpg  <= it's an OLD 1on1 CHAT avatar
393471412345-1485431234-1485431235.jpg <= it's a GROUP avatar
393471412345-1485431234-1485431235.thumb <= it's a GROUP avatar
393471412345-1485431234-393471412345-1485431234.jpg <= it's a GROUP avatar
393471412345-1494941749.jpg  <= it's an OLD 1on1 CHAT avatar
393471412345-1508585140-393471412345-1508585140.jpg <= it's a GROUP avatar
393471412345-1532665103-1532665104.jpg <= it's a GROUP avatar
393471412345-1532665103-1532665321.jpg <= it's a GROUP avatar
393471412345-1532665103-1532665351.jpg <= it's a GROUP avatar
393471412345-1532665103-393471412345-1532665103.jpg
393471412345-1534514800-1534514816.jpg <= it's a GROUP avatar
393471412345-1534514800-1534514816.thumb <= it's a GROUP avatar
393471412345-1537370945.jpg  <= it's an OLD 1on1 CHAT avatar
393471412345-1543044662-1543044858.jpg <= it's a GROUP avatar
393471412345-1543044662-1543044858.thumb <= it's a GROUP avatar
393471412345-1547723910-1547723913.jpg <= it's a GROUP avatar
393471412345-1547723910-1547723913.thumb <= it's a GROUP avatar
393471412345-1565509357.jpg  <= it's an OLD 1on1 CHAT avatar
393471412345-1565509357.thumb  <= it's an OLD 1on1 CHAT avatar
393471412345-1570200050-393471412345-1570200050.jpg <= it's a GROUP avatar
393471412345-1570200071-1570200072.jpg <= it's a GROUP avatar
393471412345-1570200071-1570200072.thumb <= it's a GROUP avatar
393471412345-1581347010.jpg   <= THE WINNER!!! This is the right one for 1on1 chat, the last in alphabetical(?) order without multiple ids

In this case, for 1 on 1 chats you have to choose 393471412345-1581347010.jpg.

I think the rule is:

In 1 on 1 chats, display the last .jpg or .thumb in alphabetical order, skipping the ones with multiple "ids" (I don't know what to call them), such as phonenumber-ID-ID.jpg|thumb or phonenumber-ID-phonenumber-ID.jpg|thumb.

If you need a thumb and there's only the jpg, you need to convert it... or use the jpg directly.
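The rule proposed above can be sketched as follows. This is a minimal illustration rather than the exporter's implementation: `pick_one_on_one_avatar` is a hypothetical helper, and the file name pattern is inferred only from the examples in this thread. The ids are compared as numbers, which gives the same result as alphabetical order whenever they have the same number of digits.

```python
import re

# Assumed shape of a 1 on 1 avatar name: "<phone number>-<single numeric id>.<ext>".
# Names with more segments (phonenumber-ID-ID, phonenumber-ID-phonenumber-ID)
# do not match and are therefore skipped.
SINGLE_ID = re.compile(r"^(?P<number>\d+)-(?P<stamp>\d+)\.(?P<ext>jpg|thumb)$")


def pick_one_on_one_avatar(filenames, number):
    """Return the newest single-id avatar file name for `number`, or None."""
    candidates = []
    for name in filenames:
        match = SINGLE_ID.match(name)
        if match and match.group("number") == number:
            # Sort key: numeric id first, then the name to break ties
            # (".thumb" sorts after ".jpg" for the same id).
            candidates.append((int(match.group("stamp")), name))
    if not candidates:
        return None
    return max(candidates)[1]
```

If the selected file is a .jpg and no .thumb with the same id exists, the caller can fall back to using the .jpg for both the image and the link.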

KnugiHK commented 1 year ago

Could the format be like <group creator/PM's contact number>-<group creation timestamp (for group only)>-<avatar timestamp>?
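To make the guessed format concrete, here is a small parser sketch. It treats the format above as a hypothesis only: `AvatarName` and `parse_avatar_name` are hypothetical names, and whether the numeric segments really are timestamps is not confirmed anywhere in this thread.

```python
from typing import NamedTuple, Optional


class AvatarName(NamedTuple):
    number: str                   # group creator's or contact's number
    group_created: Optional[int]  # present only for group avatars (assumed)
    avatar_stamp: int             # assumed to be the avatar timestamp
    ext: str                      # "jpg" or "thumb"


def parse_avatar_name(filename):
    """Parse a file name under the hypothesized format, or return None."""
    stem, _, ext = filename.rpartition(".")
    parts = stem.split("-")
    if not stem or not all(part.isdigit() for part in parts):
        return None
    if len(parts) == 2:
        return AvatarName(parts[0], None, int(parts[1]), ext)
    if len(parts) == 3:
        return AvatarName(parts[0], int(parts[1]), int(parts[2]), ext)
    # Four-part names such as phonenumber-ID-phonenumber-ID do not fit
    # the hypothesized format, so they are reported as unparsed.
    return None
```

Names of the form phonenumber-ID-phonenumber-ID, which appear in the listing earlier in the thread, come back as `None` here, so the hypothesis would need extending to cover them.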

LoSunny commented 7 months ago

There seems to be a bug with avatars in iOS group chats (dev branch & 0.9.5), where all the avatars are the same (https://imgur.com/a/JYlGsP5). When I check the HTML, it produces the same innerHTML for everyone:

<div class="w3-col m2 l2">
    <a href="AppDomainGroup-group.net.whatsapp.WhatsApp.shared/Media/Profile/aaaa-bbb.jpg">
        <img src="AppDomainGroup-group.net.whatsapp.WhatsApp.shared/Media/Profile/aaaa-bbb.jpg" onerror="this.style.display='none'" class="avatar">
    </a>
</div>
KnugiHK commented 7 months ago

Are the avatars the group's avatar?

LoSunny commented 7 months ago

Yes they are

KnugiHK commented 7 months ago

Then it will be considered as a feature request rather than a bug, since it is expected behaviour.

KnugiHK commented 7 months ago

To ensure individual avatars for each party in a group chat, we must incorporate some identifier (such as a class name or relative path) on a per-message basis. If this isn't feasible, we may consider omitting avatars in the group chat to prevent confusion.
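A rough sketch of the per-message idea, assuming the relative path is used as the identifier: `avatar_html` and the `avatars_by_jid` mapping are hypothetical and not part of the exporter. Only the `<img>` markup mirrors the HTML shown earlier in this thread.

```python
from html import escape

# Same fallback as in the exported HTML: hide the image if it fails to load.
ONERROR = "this.style.display='none'"


def avatar_html(sender_jid, avatars_by_jid):
    """Return an <img> tag for the sender's own avatar, or "" if none is known.

    `avatars_by_jid` is a hypothetical mapping from a sender's JID to the
    relative path of that sender's avatar file.
    """
    path = avatars_by_jid.get(sender_jid)
    if path is None:
        # No per-sender avatar: emit nothing rather than the group avatar.
        return ""
    safe_path = escape(path, quote=True)
    return f'<img src="{safe_path}" onerror="{ONERROR}" class="avatar">'
```

Returning an empty string for unknown senders corresponds to the fallback of omitting avatars in group chats.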

PS. I did a quick check: the Telegram export does not include avatars.