joshhighet / ransomwatch

the transparent ransomware claim tracker 🥷🏼🧅🖥️
https://ransomwatch.telemetry.ltd
The Unlicense
924 stars 141 forks source link

replace parsers using alnum #17

Closed joshhighet closed 2 years ago

joshhighet commented 2 years ago

when using alnum content can easily be missed. when using [0-9A-Za-z], the following would pass Advanced Micro Devices but with the added comma the following would fail to be picked up Advanced Micro Devices, Inc

ref grep/manual/bracketexpressions

./assets/parsers.sh -p | grep egrep
egrep -o 'class="title">([[:alnum:]]| |\.)+</a>' source/groove-*.html | cut -d '>' -f2 | cut -d '<' -f 1
egrep -o '<span style="font-size:20px;">([[:alnum:]]| |\.)+</span>' source/kelvinsecurity-*.html | cut -d '>' -f 2 | cut -d '<' -f 1
egrep -o 'fqd.onion/\?id=([[:alnum:]]| |\.)+"' source/blackbasta-*.html | cut -d = -f 2 | cut -d '"' -f 1 | sed -e 's/^ *//g' -e 's/[[:space:]]*$//'
egrep -o 'class="cls_recordTop"><p>([[:alnum:]]| |\.)+</p>' source/ransomhouse-*.html | cut -d '>' -f 3 | cut -d '<' -f 1