Open fhferreira opened 1 year ago
A lot of the time, fraud detection can be framed as anomaly detection, which is an unsupervised approach. The problem with a supervised approach is that it is sometimes not practical to accumulate enough labeled samples that represent fraud. The prior probability is just too low, i.e. people are generally honest. Fortunately, most anomaly detectors acknowledge this skew and let you account for it by adjusting the contamination hyper-parameter.
https://docs.rubixml.com/2.0/what-is-machine-learning.html#anomaly-detection
If you take this approach, you can start with a simple anomaly detector such as Gaussian MLE, and if you need more flexibility, Loda and Isolation Forest work pretty well.
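To make the idea concrete, here is a minimal sketch of a Gaussian maximum-likelihood anomaly detector with a contamination setting. It is plain Python for illustration only and is not Rubix ML's implementation or API. The function names and the single "listed price / reference price" feature are assumptions, and the numbers are made up.

```python
import math

def fit_gaussian(samples):
    """Estimate a per-feature mean and variance by maximum likelihood."""
    n = len(samples)
    dims = len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(dims)]
    variances = [
        # Floor the variance so a constant feature does not divide by zero.
        max(sum((row[j] - means[j]) ** 2 for row in samples) / n, 1e-9)
        for j in range(dims)
    ]
    return means, variances

def anomaly_score(row, means, variances):
    """Negative log-likelihood under independent Gaussians (higher = more unusual)."""
    return sum(
        0.5 * math.log(2 * math.pi * var) + (x - mu) ** 2 / (2 * var)
        for x, mu, var in zip(row, means, variances)
    )

def flag_anomalies(samples, contamination=0.1):
    """Flag roughly the top `contamination` fraction of samples by score."""
    means, variances = fit_gaussian(samples)
    scores = [anomaly_score(row, means, variances) for row in samples]
    k = max(1, round(contamination * len(samples)))
    threshold = sorted(scores, reverse=True)[k - 1]
    return [score >= threshold for score in scores]

# Hypothetical feature: listed price divided by a reference market price.
price_ratios = [[0.95], [1.02], [0.98], [1.05], [0.90],
                [1.00], [1.10], [0.97], [1.03], [0.20]]
flags = flag_anomalies(price_ratios, contamination=0.1)
```

With contamination set to 0.1, the detector expects about one in ten samples to be anomalous, so here it flags the single listing with the most unlikely price ratio. Raising contamination flags more samples, and lowering it flags fewer.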
If you go with a supervised approach, you can train a classifier to label samples as "fraud" or "not fraud", but be mindful if you are using a highly imbalanced dataset (mostly "not fraud" samples). Some classifiers such as Random Forest will compensate for imbalanced datasets, but that is no substitute for actually having more data to represent the fraud case.
https://docs.rubixml.com/2.0/what-is-machine-learning.html#classification
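As a rough illustration of how a learner might compensate for imbalance, the sketch below computes inverse-frequency class weights, a common heuristic. This is an assumption for illustration and not necessarily what Rubix ML's Random Forest does internally. The function name and the 95/5 split are made up.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }

# Hypothetical dataset: 95 honest listings and 5 fraudulent ones.
labels = ["not fraud"] * 95 + ["fraud"] * 5
weights = balanced_class_weights(labels)
```

Each "fraud" sample ends up counting far more than each "not fraud" sample during training, which offsets the skew but adds no new information about what fraud looks like.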
Hope this helps!
andrewdalpino
Thanks man, that helped a lot.
I am looking for a way to prevent fraudsters from creating stores/e-commerce sites whose only purpose is to sell products as a fraud.
Example:
Product: Stove, brand Consul. Price: 100. Real price at normal shops: 500.
Product: Washing machine, brand Electrolux. Price: 119. Real price at normal shops: 900.
I am new to machine learning, so I would appreciate a suggestion.