I really appreciate the content y'all have added on working with Ubuntu and WSL! I have been making it up as I go along and I've cobbled together some patterns that work okay but I've learned a lot from these tutorials that I couldn't suss out on my own.
I just wanted to point out that there's a bug in the Python code below the line "Copy this to the first cell, adapting the input directory:"
The screenshot shows correctly indented code, but the copyable code in the code block mistakenly indents these two lines:
As a result, every mime_type is counted exactly once. This matters later on, when the aggregate data is filtered by frequency: every mime_type is filtered out, because a count of 1 can never be greater than 5.
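In case it helps anyone else who hits this, here is a minimal sketch of the kind of indentation slip I mean. To be clear, this is not the tutorial's actual code: the function names (`count_buggy`, `count_fixed`, `keep_frequent`) and the sample data are made up for illustration, and I'm assuming the counting is done with a plain dict and a "greater than 5" filter as described above.

```python
# Hypothetical illustration only; names and data are NOT from the tutorial.

def count_buggy(mime_types):
    """Count occurrences, with the increment over-indented by one level."""
    counts = {}
    for mime_type in mime_types:
        if mime_type not in counts:
            counts[mime_type] = 0
            # Bug: this line sits inside the `if`, so it only runs the
            # first time a mime_type is seen. Every count stays at 1.
            counts[mime_type] += 1
    return counts


def count_fixed(mime_types):
    """Count occurrences, with the increment at loop level."""
    counts = {}
    for mime_type in mime_types:
        if mime_type not in counts:
            counts[mime_type] = 0
        # Runs for every item, so repeats are actually counted.
        counts[mime_type] += 1
    return counts


def keep_frequent(counts, threshold=5):
    """Keep only entries whose count is strictly greater than threshold."""
    return {key: value for key, value in counts.items() if value > threshold}


# Made-up sample data: 7 JPEGs, 6 HTML pages, 2 PDFs.
sample = ["image/jpeg"] * 7 + ["text/html"] * 6 + ["application/pdf"] * 2

buggy = count_buggy(sample)
fixed = count_fixed(sample)

print(buggy)                 # every value is 1
print(keep_frequent(buggy))  # empty dict: nothing survives the filter
print(fixed)
print(keep_frequent(fixed))  # only the types seen more than 5 times
```

With the over-indented version the filter returns an empty dict no matter what the input looks like, which matches what I saw when I pasted the code block as-is.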
From the bug report: