purarue / google_takeout_parser

A library/CLI tool to parse data out of your Google Takeout (History, Activity, Youtube, Locations, etc...)
https://pypi.org/project/google-takeout-parser/
MIT License

parse_csv: fix parsing youtube comment for newer takeout exports #79

Closed karlicoss closed 4 weeks ago

karlicoss commented 4 weeks ago

noticed some errors in promnesia now that new comments started being extracted after my previous fix

of course, comments are comma-separated jsons with one line per each json :clown_face: If you have any good ideas on how to parse it more nicely, let me know! Sadly, the stdlib json parser doesn't support any sort of streaming mode :(
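For reference, one way to get close to streaming with only the stdlib is `json.JSONDecoder.raw_decode`, which decodes a single value starting at a given offset and returns the offset where it stopped. The sketch below is not the code from this PR; `iter_json_values` is a hypothetical helper name, and it assumes the values are separated only by commas and whitespace.

```python
import json
from typing import Any, Iterator


def iter_json_values(s: str) -> Iterator[Any]:
    """Yield each JSON value from a string of comma-separated JSON values.

    Hypothetical helper for illustration, not part of google_takeout_parser.
    """
    decoder = json.JSONDecoder()
    pos = 0
    n = len(s)
    while pos < n:
        # raw_decode does not skip leading whitespace or separators itself,
        # so step over them before decoding the next value
        while pos < n and (s[pos].isspace() or s[pos] == ","):
            pos += 1
        if pos >= n:
            break
        # returns the decoded value and the index just past it
        value, pos = decoder.raw_decode(s, pos)
        yield value
```

Because it is a generator, callers can stop early or process one object at a time without building the full list first.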

karlicoss commented 4 weeks ago

Thought I had a bug for a second, but turned out to be fine, so reopened!

purarue commented 4 weeks ago

of course, comments are comma-separated jsons with one line per each json

uh, this fix is probably fine, I just can't even visualize what that means...? Could you post/link me to example data just so I understand?

karlicoss commented 4 weeks ago

Ah, you can see the example here! https://github.com/seanbreckenridge/google_takeout_parser/pull/79/files#diff-8789098ed2a3a26eb03a72c01b6432558382e227eb4cc9bc3cbd50f570f9857bR80

Something like this: contentJSON='{"text":"> I am about to get buried in the concrete"},{"text":"\n"},{"text":"the most normal Veritasium video!"}'
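To make the shape of that value concrete, here is a minimal sketch of one way to read it: wrap the comma-separated objects in brackets so they form a valid JSON array, then join the `text` fields. This is an illustration only and not necessarily what the PR does; `join_comment_text` is a hypothetical name, and it assumes the `\n` in the example is a JSON escape sequence and that every object has a `text` key.

```python
import json

# the example value from above, written as a raw string so that \n stays
# a two-character JSON escape (an assumption about the real data)
content_json = r'{"text":"> I am about to get buried in the concrete"},{"text":"\n"},{"text":"the most normal Veritasium video!"}'


def join_comment_text(raw: str) -> str:
    """Hypothetical helper: rebuild a comment from comma-separated JSON objects."""
    # wrapping in brackets turns the fragment into a regular JSON array
    segments = json.loads(f"[{raw}]")
    # assumes each segment is an object with a "text" key
    return "".join(segment["text"] for segment in segments)


print(join_comment_text(content_json))
```

The bracket trick only works if the whole value fits in memory and contains nothing but comma-separated JSON values, which seems reasonable for a single comment.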

purarue commented 4 weeks ago

bleh... I don't know why they just can't stick to some standard format...

I guess they have to innovate™

thanks for the fix :+1:

purarue commented 4 weeks ago

feel free to ping me if you want me to do a new release (though I don't think this impacts other CIs)

karlicoss commented 4 weeks ago

Yeah, it's dynamically called anyway, so no need for new release. Thanks!