telstrapurple / MSTeamsChatExporter

Code that helps an end-user export their Microsoft Teams Chat History to HTML
MIT License

Reduce .html size by avoiding duplicating image data #6

Open PeterMinin opened 1 year ago

PeterMinin commented 1 year ago

Currently the final .html files include all images as inline base64 data. The problem is, profile photos are typically repeated many times within a chat, and the data is duplicated every time. This makes the files unnecessarily large.

Here is an idea of how to reuse the same data using CSS: https://stackoverflow.com/a/55762229/675674. Note that SVG, which is also used there, is avoidable too: see the comment. Other answers provide other CSS solutions and a JS one.
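To make the idea concrete, here is a minimal Python sketch of the CSS approach: each repeated data URI is stored once in a `<style>` rule and the `<img>` tags refer to it through an attribute selector. This is not the exporter's code and not necessarily what the linked answer does. The `data-dedup` attribute, the `img-…` names and the function `dedupe_inline_images` are made up for illustration, and it assumes double-quoted `src` attributes and a browser that honours `content: url(...)` on `<img>` elements, which should be checked before relying on it.

```python
import hashlib
import re

# Assumption: images are embedded as src="data:image/...;base64,..." with double quotes.
DATA_SRC = re.compile(r'src="(data:image/[^";]+;base64,[A-Za-z0-9+/=]+)"')


def dedupe_inline_images(html: str) -> str:
    """Store each repeated data URI once in a CSS rule instead of in every <img>."""
    # Count how often each data URI occurs so unique images can stay inline.
    counts = {}
    for uri in DATA_SRC.findall(html):
        counts[uri] = counts.get(uri, 0) + 1

    names = {}  # data URI -> hypothetical identifier used in the data-dedup attribute

    def replace(match):
        uri = match.group(1)
        if counts[uri] < 2:
            return match.group(0)  # nothing to gain for an image used only once
        digest = hashlib.sha1(uri.encode("ascii")).hexdigest()[:12]
        name = names.setdefault(uri, "img-" + digest)
        return f'data-dedup="{name}"'

    body = DATA_SRC.sub(replace, html)
    if not names:
        return html

    # One CSS rule per repeated image; the image data now appears exactly once.
    css = "\n".join(
        f'img[data-dedup="{name}"]{{content:url("{uri}")}}'
        for uri, name in names.items()
    )
    style = f"<style>\n{css}\n</style>"
    if "</head>" in body:
        return body.replace("</head>", style + "\n</head>", 1)
    return style + "\n" + body
```

An attribute selector is used rather than a class so that the sketch does not have to merge with any `class` attribute the `<img>` tags may already carry.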

PeterMinin commented 1 year ago

I made a Python script to post-process the files: https://gist.github.com/PeterMinin/20b7bebbda578db7cad24a6487d81d7c. In total it saved me about 10% (305 MB -> 272 MB). That is not very much, especially if you compress your backup. But some of my files shrank by ~70%, e.g. 490 KB -> 128 KB, so YMMV.
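Since the savings vary so much between files, it may help to estimate the gain before post-processing anything. The sketch below is not the gist's code. The helper `duplicated_fraction` is hypothetical, and it assumes images are embedded as `data:image/...;base64,...` URIs. It reports what share of a file's characters are repeat copies of an image that already appears earlier in the same file.

```python
import re
from collections import Counter

# Assumption: embedded images look like data:image/<type>;base64,<payload>.
DATA_URI = re.compile(r'data:image/[^";]+;base64,[A-Za-z0-9+/=]+')


def duplicated_fraction(html: str) -> float:
    """Return the fraction of the file taken up by repeated copies of data URIs."""
    if not html:
        return 0.0
    counts = Counter(DATA_URI.findall(html))
    # Every occurrence beyond the first is redundant.
    redundant = sum(len(uri) * (n - 1) for uri, n in counts.items())
    return redundant / len(html)
```

A file with a value near 0.7 would be a candidate for the larger reductions mentioned above, while a value near zero means deduplication will barely help. The real saving will be slightly lower than this estimate, because each deduplicated image still needs a short reference.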