joshhighet / ransomwatch

the transparent ransomware claim tracker 🥷🏼🧅🖥️
https://ransomwatch.telemetry.ltd
The Unlicense

Captcha support #75

Closed (krautsource closed 1 year ago)

krautsource commented 1 year ago

Hi, as discussed in #69, cl0p (and others) added a captcha to their blog, so ransomwatch can't scrape it anymore. It wouldn't surprise me if more ransomware groups add captchas to their sites, so this issue may become more pressing with time. I was wondering if we could work around this, either by using a commercial captcha-solving service or with the help of volunteers.

Here's what I'm thinking:

As a completely different approach, I was also wondering if cl0p and others would be willing to supply a current victim list using some kind of API or at least on a separate page without a captcha. They want people to know about their successes, don't they? So maybe they'd be willing to cooperate on this. However, actively approaching a ransomware gang could become a moral or even a legal dilemma for the ransomwatch project.

Any thoughts on this?

joshhighet commented 1 year ago

hey, thanks for the ideas

yep - a number of groups have implemented various scraping countermeasures over the ~3 years this project has been running. five, to be exact

curl -sL ransomwhat.telemetry.ltd/groups | jq -r '.[] | select(.captcha==true) | .name'
avoslocker
grief
clop
doppelpaymer
entropy
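
for anyone who prefers Python to jq, the same filter can be sketched roughly as below. this is only an illustration: it assumes the endpoint returns a JSON array of objects with `name` and `captcha` fields (as the jq filter implies) and that it is served over https, which the curl command does not state. `captcha_groups` and `fetch_groups` are hypothetical helper names, not part of ransomwatch.

```python
import json
from urllib.request import urlopen

# assumption: same host and path as the curl command above, reached over https
GROUPS_URL = "https://ransomwhat.telemetry.ltd/groups"


def captcha_groups(groups):
    """Return names of groups whose `captcha` field is exactly true,
    mirroring jq's select(.captcha==true)."""
    return [g["name"] for g in groups if g.get("captcha") is True]


def fetch_groups(url=GROUPS_URL, timeout=30):
    """Download and decode the groups list (assumed to be a JSON array)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# usage (performs a network request):
#   for name in captcha_groups(fetch_groups()):
#       print(name)
```

the filtering is kept separate from the download so it can be run against a saved copy of the JSON without touching the network.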

the majority of groups that have implemented CAPTCHAs have done so in ways that can already be trivially bypassed, though building bypasses for these shortcomings into ransomwatch would be a net loss as far as I am concerned. given the open codebase, any bypass would be visible to the groups themselves, so where a group has made design choices that clearly aim to deter scrapers, i've not added parsers, to avoid an inevitable game of cat and mouse.

ransomwatch at its core is a very simple framework. to incorporate post-DOM-rendering actions and leverage 3rd-party APIs mid-flight would require an extensive rework of the current implementation, something I see as a better fit for a wholly new rewrite. this is something I'm open to if others are open to engagement, though I've ultimately parked a number of new features in the current model to avoid turning it into a frankenstein.

ransomwatch is a net loss for me financially - i'm happy to continue supporting it, though I would object to the use of any commercial APIs under the current model. there are certainly solver providers in this space that can tackle out-of-norm tests such as those you'll be used to seeing when viewing avoslocker. i don't feel right profiting from ransomwatch - though some kind of cost model would inevitably have to be introduced to go down this path. in full transparency, through two kind one-time GitHub sponsors over the past year, I have recovered about 40% of the operational costs to service this platform.

I like your idea of offloading the solving to the visitor when required; a volunteer model certainly seems to be the only sustainable pathway. implementing this, though, would require changes to the infrastructure and threat model I have for this service. accepting user input and the use of server-side rendering have both been carefully avoided to date, and again I think such changes would be better suited to a more efficient rewrite.

you are right - morally, the ground gets shaky. I formed ransomwatch after the surge of data leak sites (DLSs) that arose after Maze really kicked things off - at a time when it had become unsustainable to visit, track and gain situational awareness across such a large and growing estate of web properties, and when the only ones doing it were select CTI firms shrouded in secrecy, producing claims and reports that were simply un-attestable.

a number of groups surface machine-consumable feeds through JSON, XML or RSS - where these exist, I default to using them today. i have refused (and continue to refuse) direct offers from certain groups that wish to provide me with alternate locations or feeds to be included within ransomwatch.
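
to make the feed-first idea concrete, here is a minimal sketch of pulling post titles out of a public feed using only the Python standard library. it is not ransomwatch's actual parser code: the function names, the `kind` argument and the `title` key for JSON feeds are all hypothetical, and real feeds vary in shape from group to group.

```python
import json
import xml.etree.ElementTree as ET


def titles_from_rss(xml_text):
    """Collect the <title> of every <item> in an RSS 2.0 document."""
    # note: feeds from hostile sources are untrusted input; a hardened
    # parser such as the third-party defusedxml package may be preferable
    root = ET.fromstring(xml_text)
    return [t.text.strip() for t in root.iterfind("./channel/item/title") if t.text]


def titles_from_json(json_text, key="title"):
    """Collect `key` from each object in a JSON array (`key` is an assumption)."""
    return [entry[key] for entry in json.loads(json_text) if key in entry]


def extract_titles(body, kind):
    """Dispatch to a parser based on a caller-supplied feed kind."""
    parsers = {"rss": titles_from_rss, "json": titles_from_json}
    if kind not in parsers:
        raise ValueError(f"unsupported feed kind: {kind}")
    return parsers[kind](body)
```

a structured feed needs no DOM rendering or HTML scraping, which is why it is the simpler option wherever a group already publishes one openly.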

this speaks to some clear boundaries I have with this project

- never being the first to broadcast a new service and/or group
- never introducing non-publicly accessible information into the ransomwatch dataset

I believe ransomwatch is a beneficial utility to a number of industries and verticals - and I'd love it to remain that way whilst still being able to evolve as our sources do. That said, I believe there is a very fine line between what this project is today, with the transparency and accountability it provides, and an extortion amplification platform, something I'd hate for this to ever be compared to.

-j