Closed lpmi-13 closed 2 years ago
One potential task would be to parse N log files and compute the average request count for each IP address.
...or the top 10 noisiest IPs, or the ones that only show up once, or whatever
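Assuming common-log-format files where the IP is the first whitespace-separated field, a sketch of those tasks might look like this (the sample log below is made up purely so the commands run anywhere; with real data you'd point them at your actual files):

```shell
# Hypothetical sample log, just so the pipelines below have input
cat > sample.log <<'EOF'
10.0.0.1 - - [01/Jan/2023:00:00:01 +0000] "GET / HTTP/1.1" 200 123
10.0.0.2 - - [01/Jan/2023:00:00:02 +0000] "GET /a HTTP/1.1" 200 55
10.0.0.1 - - [01/Jan/2023:00:00:03 +0000] "GET /b HTTP/1.1" 404 12
10.0.0.3 - - [01/Jan/2023:00:00:04 +0000] "GET /c HTTP/1.1" 200 99
EOF

# Average request count per IP across N files (total hits / number of files);
# ARGC counts the program name too, hence the -1
awk '{count[$1]++} END {for (ip in count) printf "%s %.2f\n", ip, count[ip]/(ARGC-1)}' sample.log

# Top 10 noisiest IPs
awk '{print $1}' sample.log | sort | uniq -c | sort -rn | head -10

# IPs that only show up once
awk '{print $1}' sample.log | sort | uniq -c | awk '$1 == 1 {print $2}'
```

The `sort | uniq -c | sort -rn` pipeline is the classic idiom here; the pure-awk version avoids the extra sorting passes on huge files.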
Another interesting use case would be to log all network requests made while accessing YouTube videos, write those out to log files, and then process/filter them so we end up with just a list of domains that serve ads, a la the Pi-hole block list.
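A rough sketch of the filtering step, assuming you've already captured the requested URLs one per line (via browser devtools, mitmproxy, or similar; the file and its contents here are invented for illustration):

```shell
# Hypothetical capture of requested URLs, one per line
cat > requests.log <<'EOF'
https://www.youtube.com/watch?v=abc
https://doubleclick.net/ads/pixel
https://googleads.g.doubleclick.net/pagead/id
https://www.youtube.com/watch?v=def
EOF

# Strip scheme and path to leave bare domains, deduplicated
sed -E 's#^[a-z]+://([^/]+).*#\1#' requests.log | sort -u

# First-pass filter for likely ad-serving domains (crude keyword match;
# a real block list would need manual review)
sed -E 's#^[a-z]+://([^/]+).*#\1#' requests.log | sort -u | grep -E 'ads|doubleclick'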
actually...absent an insanely large log file...this might as well just be done locally instead of in Gitpod, since there's no real benefit to a remote environment here (unless you normally work in Windows and want some Linux command-line practice).
Gonna close this as done, since it took me about 20 minutes to create https://github.com/lpmi-13/sadpods-logsearch
We haven't added one of these in a while, so I thought I'd add one while I'm thinking of it...
There are lots of useful commands for parsing logs (e.g., grep, awk, sed), and it might be nice to have a disposable environment to practice them in. So I'm going to make a micromaterial intended to be run in Gitpod (though it's also possible locally; you'd just need to clone several hundred megabytes of data) where you can practice finding various things with these command-line tools.
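As a taste of the kind of treasure hunt this enables, here's the sort of grep/awk drill the environment would support (the log lines are fabricated stand-ins; the real datasets would come from Kaggle):

```shell
# A few lines in common log format, purely to have something to search
cat > access.log <<'EOF'
203.0.113.5 - - [01/Jan/2023:00:00:01 +0000] "GET /index.html HTTP/1.1" 200 512
203.0.113.9 - - [01/Jan/2023:00:00:02 +0000] "GET /missing HTTP/1.1" 404 0
203.0.113.5 - - [01/Jan/2023:00:00:03 +0000] "POST /login HTTP/1.1" 500 87
EOF

# grep: count lines containing a 404 response
grep -c '" 404 ' access.log    # → 1

# awk: same idea, but matching the status-code field ($9 in this format)
# exactly, which avoids false hits on byte counts or paths
awk '$9 == 404' access.log
```

The contrast between the two is exactly the kind of thing worth practicing: grep is fast and loose on substrings, while awk lets you target a specific field.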
There are lots of free datasets on Kaggle: https://www.kaggle.com/datasets?search=server+logs
The plan is to grab that data and see what we can find! Half exploratory data analysis, half treasure hunt!