internet-sicherheit / ethereum-cache-creator

GNU General Public License v3.0

Detect faucet fraud by using transaction graph analysis #16

Open kiview opened 4 years ago

kiview commented 4 years ago

Bloxberg provides the Faucet app for acquiring free bergs. Although the app is secured by a CAPTCHA, it seems someone (or something) has been exploiting the faucet to obtain a large amount of bergs over time, thereby indirectly acquiring bloxberg resources.

My hypothesis is that we can identify and detect this behaviour in the transaction graph. In addition to a static analysis, we might need to look at events over time and determine additional dynamic values, such as the faucet usage rate.
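As a rough illustration of what "detecting this behaviour in the transaction graph" could look like, here is a minimal sketch of one possible heuristic. It assumes (the thread does not confirm this) that an exploiter requests bergs for many throwaway addresses and then forwards them to one collector address. The faucet address, the sample addresses, the `find_collectors` helper and the `min_sources` threshold are all hypothetical.

```python
from collections import defaultdict

FAUCET = "0xfaucet"  # hypothetical placeholder, not the real faucet address

# Hypothetical sample data: (sender, receiver) pairs.
transactions = [
    (FAUCET, "0xa"), (FAUCET, "0xb"), (FAUCET, "0xc"), (FAUCET, "0xd"),
    ("0xa", "0xsink"), ("0xb", "0xsink"), ("0xc", "0xsink"),
    ("0xd", "0xshop"),
]


def find_collectors(txs, faucet=FAUCET, min_sources=3):
    """Return {address: faucet-funded senders} for every address that
    receives from at least `min_sources` distinct faucet-funded addresses."""
    # Addresses that received at least one faucet payout.
    funded = {receiver for sender, receiver in txs if sender == faucet}
    sources = defaultdict(set)
    for sender, receiver in txs:
        # Only count forwards from funded addresses to a different address.
        if sender in funded and receiver != faucet and receiver != sender:
            sources[receiver].add(sender)
    return {addr: s for addr, s in sources.items() if len(s) >= min_sources}


collectors = find_collectors(transactions)
```

With the sample data, only `0xsink` is reported, because three distinct faucet-funded addresses forward to it. A real analysis would need a threshold chosen from the actual data, and legitimate services (exchanges, contracts) could show the same fan-in shape.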

This could later be extended to general anomaly detection.

ghost commented 4 years ago

Kevin: I think extracting all blocks will take more than a day, so I will upload them to the Nextcloud tomorrow.

Re graph analysis: igraph works fine and you can find the example R code for igraph in the repo. However, I think it is easier for everyone if you use the Python-based NetworkX in some Jupyter notebooks.

ghost commented 4 years ago

How did you realize that the faucet was being exploited? What do you mean by transaction graph (I feel I need to read some Ethereum theory here; I have material)? What is the difference between static analysis and dynamic values? Shall we also discuss the best approach for the implementation here?

kiview commented 4 years ago

How did you realize that the faucet was being exploited?

MPDL observed suspicious interaction with the Faucet in their web endpoint monitoring. These interactions seemingly came from a single account, which points to an exploit.

What do you mean by transaction graph?

A list of transactions can be considered a graph with addresses being the vertices and the transactions being the (directed) edges, can't it?
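To make the graph view concrete, here is a minimal sketch in plain Python. The addresses are shortened, made-up placeholders, and `build_graph`, `out_degree` and `in_degree` are hypothetical helper names. A graph library such as NetworkX would offer the same idea through its directed graph types.

```python
from collections import defaultdict

# Hypothetical sample data: (sender, receiver) pairs.
transactions = [
    ("0xfaucet", "0xa"),
    ("0xfaucet", "0xb"),
    ("0xa", "0xc"),
    ("0xb", "0xc"),
    ("0xfaucet", "0xa"),
]


def build_graph(txs):
    """Return an adjacency map: sender -> {receiver: number of transactions}."""
    graph = defaultdict(lambda: defaultdict(int))
    for sender, receiver in txs:
        graph[sender][receiver] += 1  # one directed edge per transaction
        _ = graph[receiver]           # make sure pure receivers are vertices too
    return {vertex: dict(nbrs) for vertex, nbrs in graph.items()}


def out_degree(graph):
    """Number of transactions sent by each address."""
    return {vertex: sum(nbrs.values()) for vertex, nbrs in graph.items()}


def in_degree(graph):
    """Number of transactions received by each address."""
    counts = {vertex: 0 for vertex in graph}
    for nbrs in graph.values():
        for receiver, n in nbrs.items():
            counts[receiver] += n
    return counts


graph = build_graph(transactions)
```

Repeated transactions between the same two addresses are kept as an edge count rather than collapsed, since the number of faucet payouts to one address is exactly the kind of signal this issue is about.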

What is the difference between static analysis and dynamic values?

I would consider static values to be properties of the resulting graph as a whole, and time-based (dynamic) values to be the ones that are observed within a certain time frame and are therefore specific to that time frame.
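The distinction can be sketched with a small example: the same faucet payouts counted once over the whole graph (static) and once per time window (time-based). The faucet address, the timestamps, the one-hour window and the helper names are assumptions made for illustration only.

```python
from collections import Counter

FAUCET = "0xfaucet"  # hypothetical placeholder address

# Hypothetical sample data: (sender, receiver, unix timestamp in seconds).
transactions = [
    (FAUCET, "0xa", 0),
    (FAUCET, "0xa", 600),
    (FAUCET, "0xa", 1200),
    (FAUCET, "0xb", 1800),
    (FAUCET, "0xa", 90000),
]


def total_payouts(txs, faucet=FAUCET):
    """Static value: faucet payouts per receiver over the whole graph."""
    return Counter(receiver for sender, receiver, _ in txs if sender == faucet)


def payouts_per_window(txs, faucet=FAUCET, window=3600):
    """Time-based value: faucet payouts per (receiver, window index)."""
    return Counter(
        (receiver, timestamp // window)
        for sender, receiver, timestamp in txs
        if sender == faucet
    )


static = total_payouts(transactions)
dynamic = payouts_per_window(transactions)
```

Here the static count for `0xa` is 4, while the windowed view shows that three of those payouts fall into the same hour. The static value alone would not reveal that burst.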

Possible implementation approach: Create a Jupyter notebook and hack in the data science code for the analysis there, e.g. by using a Python graph library like NetworkX.

ghost commented 4 years ago

offtopic: cool edit reply :)

MPDL observed suspicious interaction with the Faucet in their web endpoint monitoring. These interactions seemingly came from a single account, which points to an exploit.

An account is not the same as an address, or is it?

A list of transactions can be considered a graph with addresses being the vertices and the transactions being the (directed) edges, can't it?

I see. So, thinking about it: a block has transactions, and each transaction has a sender and a receiver address. So the idea of creating a graph is to take these senders and receivers as nodes and the transactions as the relationships between them, right?
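Going from blocks to edges could look roughly like the sketch below. The JSON layout (a list of blocks with `timestamp` and `transactions`, each transaction with `from` and `to`) is an assumption loosely modelled on Ethereum JSON-RPC field names. The actual output of the Java artifact is not shown in this thread and may differ. `extract_edges` is a hypothetical helper.

```python
import json

# Hypothetical sample in an assumed layout, not the real cache file format.
SAMPLE = """
[
  {"number": 1, "timestamp": 1600000000, "transactions": [
    {"from": "0xfaucet", "to": "0xa"},
    {"from": "0xa", "to": null}
  ]},
  {"number": 2, "timestamp": 1600000005, "transactions": []}
]
"""


def extract_edges(blocks):
    """Flatten blocks into (sender, receiver, block timestamp) tuples."""
    edges = []
    for block in blocks:
        for tx in block.get("transactions", []):
            sender, receiver = tx.get("from"), tx.get("to")
            if sender is None or receiver is None:
                # Contract-creation transactions have no receiver address.
                continue
            edges.append((sender, receiver, block.get("timestamp")))
    return edges


edges = extract_edges(json.loads(SAMPLE))
```

Keeping the block timestamp on each edge means the same list can feed both the static graph and any time-window analysis later on.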

I would consider static values to be properties of the resulting graph as a whole, and time-based (dynamic) values to be the ones that are observed within a certain time frame and are therefore specific to that time frame.

Possible implementation approach:

Ok, so now I switch to Python. Am I right in using the JSON file generated by our Java artifact? Or what are you already running to get the Blocks?

kiview commented 4 years ago

An account is not the same as an address, or is it?

What is an account in the context of Ethereum? It is a public wallet address. This is the R code for creating the graph and visualizing it; analysis, of course, needs to do more than just visualize it. https://github.com/internet-sicherheit/ethereum-cache-creator/blob/master/import_test.R

So the idea of creating a graph is to take these senders and receivers as nodes and the transactions as the relationships between them, right?

Yes, since this is paraphrasing what I said: A list of transactions can be considered a graph with addresses being the vertices and the transactions being the (directed) edges.

This is the graph of a small set of transactions: image

ghost commented 4 years ago

This is the R code for creating the graph and visualizing it; analysis, of course, needs to do more than just visualize it. https://github.com/internet-sicherheit/ethereum-cache-creator/blob/master/import_test.R

I installed R and tried running the script, but I get this message. What am I doing wrong?

Rscript import_test.R 
Loading required package: rjson
Installing package into ‘/usr/lib64/R/library’
(as ‘lib’ is unspecified)
Warning in install.packages("rjson") :
'lib = "/usr/lib64/R/library"' is not writable
Error in install.packages("rjson") : unable to install packages
Execution halted

kiview commented 4 years ago

Could you please just read the error message and solve it accordingly? I think it is pretty clear: 'lib = "/usr/lib64/R/library"' is not writable

Googling to find out how to solve it specifically for your setup might be in order. However, I don't think there is much value in fixing your R installation, given that I mentioned before that using Python with Jupyter notebooks might be an easier approach for all of us.

ghost commented 4 years ago

Could you please just read the error message and solve it accordingly? I think it is pretty clear: 'lib = "/usr/lib64/R/library"' is not writable

Yes, of course it is understandable, but I thought the script was ready to run as is, without my having to do anything extra.

Googling to find out how to solve it specifically for your setup might be in order. However, I don't think there is much value in fixing your R installation, given that I mentioned before that using Python with Jupyter notebooks might be an easier approach for all of us.

My intention was to test and play with it so that I could produce the transaction graph myself.

kiview commented 4 years ago

It is ready to run by itself if the R environment is installed in a regular way (worked in Conda as well as with native Ubuntu and Fedora installation for me). Solving this problem for your specific environment is basically a matter of googling.

Or simply use the R Docker image; I just tried it out and it works directly.