sleuthkit / autopsy

Autopsy® is a digital forensics platform and graphical interface to The Sleuth Kit® and other digital forensics tools. It can be used by law enforcement, military, and corporate examiners to investigate what happened on a computer. You can even use it to recover photos from your camera's memory card.
http://www.sleuthkit.org/autopsy/

Searching for BTC-addresses with a regular expression #7179

Closed: alex-gehrig closed this issue 3 years ago

alex-gehrig commented 3 years ago

Hello all, I am currently trying to find BTC addresses through a keyword search in Autopsy 4.19.0 using regular expressions. Since I created the image I am working with myself, I know that it contains a .txt document holding nothing but a BTC address.

The regex I am using is: [13][a-km-zA-HJ-NP-Z1-9]{25,34}

I know that this regex works for legacy addresses, but when I use it in Autopsy, the program warns me that there will be too many hits and that it will probably crash after some time, or at least become unresponsive.

After reading this (https://github.com/sleuthkit/autopsy/issues/3238) I changed the regex to: \[13\]\[a\-km\-zA\-HJ\-NP\-Z1\-9\]\{25,34\}

Now the search finishes in a reasonable amount of time, but it returns no results. After some searching online, I was not able to find a regex for legacy addresses that works with Autopsy.
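One possible reason for the empty result, sketched here with Python's `re` module rather than Autopsy's own regex engine (whose syntax may differ in details): once every metacharacter is escaped, the pattern no longer describes an address, it describes the literal pattern text. The genesis-block address below is only sample data.

```python
import re

# The pattern from the question and its fully backslash-escaped variant.
pattern = r"[13][a-km-zA-HJ-NP-Z1-9]{25,34}"
escaped = r"\[13\]\[a\-km\-zA\-HJ\-NP\-Z1\-9\]\{25,34\}"

# Well-known sample address (Bitcoin genesis block), used only as test data.
address = "1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa"

# The unescaped pattern matches the address itself.
matches_address = re.fullmatch(pattern, address) is not None

# The escaped pattern cannot match an address, because it now requires
# literal "[", "]", "{" and "}" characters in the searched text.
escaped_matches_address = re.search(escaped, address) is not None

# It does match the literal pattern text, character for character.
escaped_matches_literal = re.fullmatch(escaped, pattern) is not None

print(matches_address, escaped_matches_address, escaped_matches_literal)
```

So the escaped search finishes quickly simply because almost no document contains that literal string.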

I also get no results when I add ".\?" at the beginning and the end.

Does anyone have a hint for me?

Edit: I had to update my text because the "\" characters were not displayed.

alex-gehrig commented 3 years ago

Apparently, either I am fairly alone with this problem or nobody has come across my question.

In the meantime, I was fortunately able to find a workaround and a likely reason for the problem. When Autopsy builds its index using Apache Solr, the tokenizer seems to convert everything to lowercase. While this is generally useful for an index, it is a problem when you want to search for a Bitcoin legacy address, which is case-sensitive.

So I changed my RegEx finally to this: [^a-z0-9][13][a-z1-9]{25,34}[^a-z0-9]

This way Autopsy does not crash, and I get hits for everything that could plausibly be a legacy address and has no letter or digit directly before or after the string I am looking for. Unfortunately, this produces a huge number of false positives, but these can be filtered down afterwards.
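The effect can be reproduced outside Autopsy. The following is a minimal sketch that imitates the index with `str.lower()`, which is an assumption and not the actual Solr tokenizer. It shows that the case-sensitive pattern stops matching once "L" becomes "l" (a character Base58 excludes), that the lowercase pattern finds the address again, and that the lowercase pattern also accepts strings that are not addresses. The sample text and the `noise` string are made up for illustration.

```python
import re

# Case-sensitive Base58 pattern from the question and the lowercase workaround.
case_sensitive = re.compile(r"[13][a-km-zA-HJ-NP-Z1-9]{25,34}")
lowercase_variant = re.compile(r"[^a-z0-9][13][a-z1-9]{25,34}[^a-z0-9]")

# Sample file content (genesis-block address) and a lowercased copy that
# stands in for the indexed text. Assumption: the index lowercases everything.
original_text = "wallet: 1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa\n"
indexed_text = original_text.lower()

hit_original = case_sensitive.search(original_text) is not None
hit_indexed_strict = case_sensitive.search(indexed_text) is not None
hit_indexed_loose = lowercase_variant.search(indexed_text) is not None

# Any long lowercase alphanumeric run starting with 1 or 3 also matches,
# which is where the false positives come from.
noise = " 3abcdefghijklmnopqrstuvwxyz12 "
false_positive = lowercase_variant.search(noise) is not None

print(hit_original, hit_indexed_strict, hit_indexed_loose, false_positive)
```

Note that the lowercase pattern needs one non-alphanumeric character on each side, so an address at the very start or end of the text would be missed.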

As my initial question is answered, I will close this issue with this comment. If someone has more information on this topic or some good ideas, I am of course still interested!

lfcnassif commented 3 years ago

This open source project searches for BTC, BTC Cash, Monero, Ripple, Ethereum, Dogecoin, Dash and Litecoin addresses as well as BTC private keys, and it performs checksum validation. Maybe it could help: https://github.com/sepinf-inc/IPED
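For readers who want to try checksum validation themselves, here is a minimal sketch of a Base58Check test for Bitcoin legacy addresses. It is not taken from IPED; the function name `is_valid_legacy_address` is hypothetical. It has to run on the original, case-preserved text, because a lowercased hit from the index can no longer be decoded.

```python
import hashlib

# Base58 alphabet used by Bitcoin (no 0, O, I or l).
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def is_valid_legacy_address(candidate: str) -> bool:
    """Return True if candidate decodes to a valid P2PKH/P2SH Base58Check address."""
    number = 0
    for char in candidate:
        index = B58_ALPHABET.find(char)
        if index < 0:
            return False  # character outside the Base58 alphabet
        number = number * 58 + index

    # Each leading "1" encodes one leading zero byte.
    leading_zeros = len(candidate) - len(candidate.lstrip("1"))
    body = number.to_bytes((number.bit_length() + 7) // 8, "big")
    raw = b"\x00" * leading_zeros + body

    # 1 version byte + 20 hash bytes + 4 checksum bytes.
    if len(raw) != 25:
        return False
    payload, checksum = raw[:21], raw[21:]

    # 0x00 = P2PKH (starts with "1"), 0x05 = P2SH (starts with "3").
    if payload[0] not in (0x00, 0x05):
        return False

    expected = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    return checksum == expected


genesis = "1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa"
print(is_valid_legacy_address(genesis))
print(is_valid_legacy_address(genesis.lower()))
```

One way to use this is to take the loose hits from the keyword search, re-read the matching spots in the source files with their original case, and keep only the candidates that pass this check.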