purarue / google_takeout_parser

A library/CLI tool to parse data out of your Google Takeout (History, Activity, Youtube, Locations, etc...)
https://pypi.org/project/google-takeout-parser/
MIT License
82 stars 14 forks

parse_html.activity: about 30% speedup for html parsing #66

Closed (karlicoss closed this 1 month ago)

karlicoss commented 1 month ago

I've been reviewing some older takeouts with cachew off, and parsing was a bit painful, so I had a quick look in a profiler and did some optimization.

A few optimizations

Measurements on a big Chrome/MyActivity.html file
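
For anyone wanting to reproduce this kind of measurement, here is a minimal sketch using only the standard library. Note that `parse_activity_html` and `_TextCollector` are hypothetical stand-ins written for illustration, not this library's actual parser, and the sample input is synthetic; swap in the real parsing call and a real `MyActivity.html` to get meaningful numbers.

```python
import cProfile
import io
import pstats
import time
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Hypothetical stand-in parser: collects non-empty text nodes."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks: list[str] = []

    def handle_data(self, data: str) -> None:
        text = data.strip()
        if text:
            self.chunks.append(text)


def parse_activity_html(html: str) -> list[str]:
    # Hypothetical stand-in for the function being optimized.
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    return collector.chunks


def profile_parse(html: str, top: int = 10) -> str:
    """Run the parser under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    parse_activity_html(html)
    profiler.disable()
    out = io.StringIO()
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(top)
    return out.getvalue()


def time_parse(html: str, repeats: int = 3) -> float:
    """Return the best wall-clock time (seconds) over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        parse_activity_html(html)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Synthetic input; a real takeout file would be read from disk instead.
    sample = "<div><p>Visited example.com</p></div>" * 10_000
    print(profile_parse(sample))
    print(f"best of 3: {time_parse(sample):.3f}s")
```

Running `time_parse` before and after a change gives a rough before/after comparison, while the `profile_parse` report shows where the time goes. Taking the best of several runs reduces noise from other processes.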

purarue commented 1 month ago

thanks!

will parse against my old takeouts just to make sure nothing breaks and merge in a bit

karlicoss commented 1 month ago

I was checking against quite old ones, so hopefully should be ok, but thanks for double checking :)