1712n / dn-institute

Distributed Networks Institute
http://dn.institute/
The Unlicense

Wash Trading Detection #240

Closed evgenydmitriev closed 11 months ago

evgenydmitriev commented 3 years ago

Wash trading is a process whereby a trader buys and sells a security for the express purpose of feeding misleading information to the market. In some situations, wash trades are executed by a trader and a broker who are colluding with each other, and other times wash trades are executed by investors acting as both the buyer and the seller of the security.

Investopedia

In the crypto industry, wash trading is a common occurrence, as many traders equate high trading volumes with healthy market liquidity. Trading venues and crypto projects are complacent and often collude with each other to generate fake trades that cost little to place and execute. You can find more technical details on wash trading techniques in this YouTube video.

Please comment below with ideas on detecting wash trading based on a stream of executed trades. More specifically, describe an algorithm you would use to create a metric capable of measuring wash trading in a given market pair volume. Feel free to support your ideas by adding references, datasets, graphs, and code. Comments with the best ideas will be hidden to allow others to participate. Multiple submission awards are available.

Many of the previous challenge participants focused on investigative approaches that involved manual analysis of specific cases of wash trading. This, however, is an engineering challenge, requiring successful submissions to include an algorithm, supported by references, datasets, graphs, and/or code. We have more than enough ingenious ideas on how it can be done, but no solid plans for how to implement it using real-time streaming data.

jribarich commented 2 years ago

To detect wash trades in cryptocurrency trade streams, I would use an algorithm built on two queue data structures and dynamic programming, as described in the publication referenced at the bottom of this comment.

The queues in this algorithm represent the buy and sell orders of a specific trader or of multiple traders. In the referenced study, the authors tested the algorithm on one, two, and four traders to find whether collusion had occurred. Each queue has a designated size, which can be changed to match the time frame in which we want to detect a wash trade.

Once a queue reaches a set size or threshold, a volume match runs: it compares the volume of each trade against the others to see whether it falls within a margin percentage of another trade's volume. The study used margins ranging from 0% to 5%, with the 5% margin producing the highest false positive rate.
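The two-queue volume match described above could be sketched roughly as follows. This is a minimal illustration, not the referenced study's implementation: the `Trade` fields, the `VolumeMatcher` class, and the choice to check for matches on every arriving trade (rather than only once a queue fills) are all assumptions made for the example.

```python
from __future__ import annotations

from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Trade:
    # Hypothetical record layout; real trade streams will differ.
    trader: str
    side: str      # "buy" or "sell"
    volume: float
    ts: float      # timestamp in seconds


def volumes_match(a: float, b: float, margin: float) -> bool:
    """True if the volumes differ by at most `margin` (a fraction) of the larger one."""
    larger = max(a, b)
    return larger > 0 and abs(a - b) <= margin * larger


class VolumeMatcher:
    """Keeps bounded buy and sell queues and reports volume-matched pairs."""

    def __init__(self, maxlen: int = 100, margin: float = 0.01):
        # deque(maxlen=...) silently drops the oldest trade when full,
        # which bounds the detection window.
        self.buys: deque[Trade] = deque(maxlen=maxlen)
        self.sells: deque[Trade] = deque(maxlen=maxlen)
        self.margin = margin

    def add(self, trade: Trade) -> list[tuple[Trade, Trade]]:
        """Queue the trade and return the (buy, sell) pairs it volume-matches."""
        is_buy = trade.side == "buy"
        own, other = (self.buys, self.sells) if is_buy else (self.sells, self.buys)
        matches = [
            (trade, o) if is_buy else (o, trade)
            for o in other
            if volumes_match(trade.volume, o.volume, self.margin)
        ]
        own.append(trade)
        return matches
```

A matched pair is only a candidate; the zero-sum check is what would decide whether it is flagged.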

If there is a volume match within the two queues, I would then use dynamic programming to determine whether a wash trade has occurred, that is, whether the two sides (buy and sell) net to zero. Again, a threshold needs to be defined, since some wash trades might not sum to exactly zero but to within a certain percentage of it.
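One way to picture the zero-sum step is a subset-sum dynamic program over each side, followed by a comparison of the reachable totals. The sketch below is an assumption about how this could look, not the procedure from the referenced publication: the function names are hypothetical, volumes are assumed to be pre-scaled to integer units (for example, the smallest unit of the asset), and the tolerance is taken relative to the larger side.

```python
from __future__ import annotations


def reachable_sums(volumes: list[int]) -> dict[int, tuple[int, ...]]:
    """Subset-sum DP: map each reachable non-empty total to one subset of indices producing it."""
    sums: dict[int, tuple[int, ...]] = {}
    for i, v in enumerate(volumes):
        # Totals that become reachable by using trade i alone or on top of an earlier subset.
        new = {v: (i,)}
        for total, idx in sums.items():
            new.setdefault(total + v, idx + (i,))
        for total, idx in new.items():
            sums.setdefault(total, idx)
    return sums


def find_zero_sum(
    buys: list[int], sells: list[int], tolerance: float = 0.01
) -> tuple[tuple[int, ...], tuple[int, ...]] | None:
    """Return (buy indices, sell indices) whose totals offset within `tolerance`, or None.

    Among all qualifying combinations, the one with the smallest absolute gap is returned.
    """
    best = None
    sell_sums = reachable_sums(sells)
    for b_total, b_idx in reachable_sums(buys).items():
        for s_total, s_idx in sell_sums.items():
            gap = abs(b_total - s_total)
            if gap <= tolerance * max(b_total, s_total):
                if best is None or gap < best[0]:
                    best = (gap, b_idx, s_idx)
    return None if best is None else (best[1], best[2])
```

For example, `find_zero_sum([50, 30, 70], [80, 45], tolerance=0.0)` pairs the first two buys (50 + 30) with the first sell (80). The number of reachable totals can grow quickly with queue length, so the queue size bound matters for keeping this tractable on streaming data.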

Due to the high speeds at which trading occurs nowadays, this algorithm would have to use parallel processing or a supercomputer to capture these wash trades and alert the right authorities. Given how easy it is to create a crypto wallet, I think this algorithm could be run on groups of 5, 10, or 15 traders that plan to collude with each other.

References:

ildkhav commented 2 years ago

Hello, please find a report https://github.com/ildkhav/Wash-Trading-Detection/blob/d5cfcc6e8e4505a13291f300966c562e7409186b/report.pdf and the code https://github.com/ildkhav/Wash-Trading-Detection/blob/d5cfcc6e8e4505a13291f300966c562e7409186b/report.R

evgenydmitriev commented 2 years ago

The challenge is still open. Many of the challenge participants focused on investigative approaches that involved manual analysis of specific cases of wash trading. This, however, is an engineering challenge, requiring successful submissions to include an algorithm, supported by references, datasets, graphs, and/or code. We have more than enough ingenious ideas on how it can be done, but no solid plans for how to implement it using real-time streaming data.

marina-chibizova commented 11 months ago

Big thanks to all challenge participants! We've honed our methodology & metrics with the help of a brilliant team member hired from the challenge program. Now, we're on the lookout for any valuable contributions to our Market Manipulation Wiki (in the form of a market manipulation report, documentation, fixes, metrics suggestions, etc.). Check out the latest bounty!