joshhighet / ransomwatch

the transparent ransomware claim tracker 🥷🏼🧅🖥️
https://ransomwatch.telemetry.ltd
The Unlicense

parsing feedback #83

Closed joshhighet closed 11 months ago

joshhighet commented 1 year ago

an always-open issue to track target site redesigns and parsing faults

joshhighet commented 1 year ago

nb: will avoid using the link text, as it changes depending on the success of a claim (e.g. the "(Unpaid)" suffix below) and would lead to partial-duplicate entries

grep --no-filename '<a href="/posts/' source/ragroup-pa32ymaeu62yo5th5mraikgw5fcvznnsiiwti42carjliarodltmqcqd.html
                    <a href="/posts/z/">Z****ta</a>
                    <a href="/posts/p/">P****X</a>
                    <a href="/posts/decimal-point-analytics-pvt/">Decimal Point Analytics Pvt(Unpaid)</a>
                    <a href="/posts/bluelinea/">Bluelinea(Unpaid)</a>
                    <a href="/posts/thaire/">Thaire(Unpaid)</a>
                    <a href="/posts/deepnoid/">Deepnoid(Unpaid)</a>
                    <a href="/posts/eastern-media-international-corporation/">Eastern Media International Corporation(Unpaid)</a>
                    <a href="/posts/eyegene/">EyeGene   (Unpaid)</a>
                    <a href="/posts/bisco-industries/">Bisco Industries(Unpaid)</a>
                    <a href="/posts/wealth-enhancement-group/">Wealth Enhancement Group(Unpaid)</a>
                    <a href="/posts/insurance-providers-group/">Insurance Providers Group(Unpaid)</a>
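
A minimal sketch of one way to avoid those partial duplicates: key each entry on the `/posts/<slug>/` path instead of the link text. This is not the project's parser. `extract_slugs` and `POST_LINK` are hypothetical names, and the third sample line (same slug, different text) is invented to show the deduplication.

```python
import re

# Hypothetical pattern: captures the slug from links shaped like
# <a href="/posts/<slug>/">, as seen in the grep output above.
POST_LINK = re.compile(r'<a href="/posts/([^"/]+)/">')


def extract_slugs(html: str) -> list[str]:
    """Return post slugs in page order, skipping repeated slugs."""
    seen = set()
    slugs = []
    for slug in POST_LINK.findall(html):
        if slug not in seen:
            seen.add(slug)
            slugs.append(slug)
    return slugs


# First two lines are taken from the grep output; the third is an
# invented variant whose link text has changed but whose slug has not.
sample = '''
    <a href="/posts/bluelinea/">Bluelinea(Unpaid)</a>
    <a href="/posts/eyegene/">EyeGene   (Unpaid)</a>
    <a href="/posts/bluelinea/">Bluelinea</a>
'''

print(extract_slugs(sample))  # ['bluelinea', 'eyegene']
```

The slug is less readable than the link text (`z` and `p` above stand in for redacted names), so this trades display quality for stable identity.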

joshhighet commented 11 months ago

mallox failing to extract when run through python subprocess, yet the same sed command returns 30 lines when run directly in the shell? strange.

joshsmbpa:ransomwatch (main*) $ ./ransomwatch.py parse
2023-12-01:00:10:50,873 INFO     parser: mallox
2023-12-01:00:10:50,873 INFO     sharedutils: running shell command - 
    sed -n '/fs-3 fw-bold text-gray-900 mb-2/{n;s/^[[:space:]]*//;s/[[:space:]]*<\/div>.*$//p;}' source/mallox-*.html

2023-12-01:00:10:51,133 INFO     ransomwatch: parse run complete

ransomwatch (main*) $ sed -n '/fs-3 fw-bold text-gray-900 mb-2/{n;s/^[[:space:]]*//;s/[[:space:]]*<\/div>.*$//p;}' source/mallox-*.html \
| wc -l
      30
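
Not a diagnosis of the subprocess behaviour above, just a rough sketch of the same extraction done in pure Python, which would sidestep the shell entirely. It mirrors the sed expression: on a line containing the marker class, take the next line, strip leading whitespace, and keep it only if a trailing `</div>` could be removed. `extract_titles`, `MARKER` and `TRAILING_DIV` are hypothetical names, and the sample HTML is invented.

```python
import re

# Class string copied from the sed command above.
MARKER = "fs-3 fw-bold text-gray-900 mb-2"
# Equivalent of s/[[:space:]]*<\/div>.*$// in the sed expression.
TRAILING_DIV = re.compile(r"\s*</div>.*$")


def extract_titles(html):
    """Return the cleaned line that follows each marker line."""
    titles = []
    lines = iter(html.splitlines())
    for line in lines:
        if MARKER not in line:
            continue
        # Like sed's `n`: consume the line after the marker.
        following = next(lines, None)
        if following is None:
            break
        cleaned, hits = TRAILING_DIV.subn("", following.lstrip())
        # Like the `p` flag: keep the line only if </div> was stripped.
        if hits:
            titles.append(cleaned)
    return titles


# Invented sample, shaped to exercise each branch.
sample = """
<div class="fs-3 fw-bold text-gray-900 mb-2">
    Example Corp   </div>
<div class="other">
    ignored </div>
<div class="fs-3 fw-bold text-gray-900 mb-2">
    no closing tag here
"""

print(extract_titles(sample))  # ['Example Corp']
```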