alephdata / aleph

Search and browse documents and data; find the people and companies you look for.
http://docs.aleph.occrp.org
MIT License

Extract crypto wallet addresses #3907

Open tillprochaska opened 1 year ago

tillprochaska commented 1 year ago

ingest-file could extract wallet addresses for popular cryptocurrencies using regular expressions, similar to how it already extracts email addresses and IBANs.
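
A minimal sketch of what such an extraction could look like. This is not ingest-file's actual extractor API: `WALLET_PATTERNS` and `extract_wallet_addresses` are hypothetical names, and the patterns are commonly cited shape checks for Bitcoin and Ethereum addresses, which I am assuming are a reasonable starting point. They do not verify checksums, so some matches may be ordinary strings that merely look like addresses.

```python
import re

# Hypothetical patterns: they only check the shape of an address,
# not its checksum, so false positives are possible.
WALLET_PATTERNS = {
    # Legacy Base58 addresses start with 1 or 3 and omit 0, O, I and l.
    "bitcoin_base58": re.compile(r"\b[13][a-km-zA-HJ-NP-Z1-9]{25,34}\b"),
    # Bech32 (SegWit) addresses start with "bc1" and omit 1, b, i and o.
    "bitcoin_bech32": re.compile(r"\bbc1[ac-hj-np-z02-9]{11,71}\b"),
    # Ethereum addresses are "0x" followed by 40 hex digits.
    "ethereum": re.compile(r"\b0x[a-fA-F0-9]{40}\b"),
}


def extract_wallet_addresses(text):
    """Return (currency, address) pairs for every pattern match in text."""
    found = []
    for currency, pattern in WALLET_PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((currency, match.group(0)))
    return found
```

A real implementation would probably want to validate checksums (Base58Check, Bech32, EIP-55) after the regex match to cut down on noise.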

While Elasticsearch and Aleph do support regex search, which can be used to find mentions of such addresses, Elasticsearch’s regex capabilities are limited: for example, a regex must always match a full token. It can be difficult or impossible to come up with a valid ES regex that matches valid addresses and is precise at the same time.
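
To illustrate the full-token limitation, here is a rough analogy in plain Python. It uses Python's `re` module rather than Elasticsearch's Lucene regex engine, and the helper names are made up for this example. The point is only the difference between matching anywhere in a string and having to match an entire token.

```python
import re

ETH_PATTERN = r"0x[a-fA-F0-9]{40}"


def matches_anywhere(pattern, text):
    """Extraction-style matching: the pattern may occur anywhere in the text."""
    return re.search(pattern, text) is not None


def matches_full_token(pattern, token):
    """Term-style matching: the pattern has to cover the whole token."""
    return re.fullmatch(pattern, token) is not None


address = "0x" + "ab" * 20      # made-up address, shape only
token = "addr=" + address       # the address embedded in a larger token

print(matches_anywhere(ETH_PATTERN, token))     # substring match succeeds
print(matches_full_token(ETH_PATTERN, token))   # full-token match fails
print(matches_full_token(ETH_PATTERN, address))
```

Whether an address ends up as its own token in practice depends on how the field is analyzed, which this sketch does not model.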

Rosencrantz commented 1 year ago

@pudo I have vague memories of chatting about this with you. Is there a possibility of a clash here, too many false negatives, that sort of thing?