kenoskynci / opendlp

Automatically exported from code.google.com/p/opendlp

Too many false positives/results have letters and symbols #111

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Run a basic scan using the standard Visa, Mastercard, and AMEX regexes.

What is the expected output? What do you see instead?

I expected results resembling credit card numbers. Instead, I'm seeing 
results that end in letters and symbols, such as:

Visa    XXXXXXXXXXXXXXXX237C
Visa    XXXXXXXXXXXXXX417<
Mastercard  XXXXXXXXXXXXXX475<
Mastercard  XXXXXXXXXXXXXX594?
AMEX    XXXXXXXXXXXXX495<
Visa    XXXXXXXXXXXXXX704D
Social_Security_Number_dashes   XXXXXXXXXX33-
Social_Security_Number_spaces   XXXXXXXXXX17T

What version of the product are you using? On what operating system?

0.5.1 on OpenDLP pre-built VM (Ubuntu 11.04)

Please provide any additional information below.

I am testing OpenDLP on one system that I know is free of CC information, 
yet it already shows over 500 findings with the scan only 13% complete. All 
of the findings resemble the examples above.

As a feature request, could the program apply the Luhn algorithm to 
candidate matches to help reduce false positives on CC numbers?
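
To illustrate the kind of check I mean, here is a minimal Python sketch of a Luhn filter that could be applied to each regex match before it is recorded as a finding. The function name `luhn_valid` is hypothetical and is not part of OpenDLP; how OpenDLP would actually hook in such a check is an assumption on my part.

```python
def luhn_valid(candidate: str) -> bool:
    """Return True if candidate is all digits and passes the Luhn checksum.

    Hypothetical helper for illustration only; not an OpenDLP function.
    """
    # Drop the separators that card regexes commonly allow.
    digits = candidate.replace(" ", "").replace("-", "")
    # Reject empty strings and anything containing letters or symbols.
    if not (digits.isascii() and digits.isdigit()):
        return False
    total = 0
    # Walk right to left, doubling every second digit.
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as summing the two digits of the product
        total += d
    return total % 10 == 0
```

A match that contains a letter or symbol is rejected outright, and a digits-only match is kept only if its checksum is valid. This would only help with the card number findings; Social Security numbers have no check digit, so those false positives would need a different fix.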

Thank you.

Original issue reported on code.google.com by cableguy...@gmail.com on 20 Dec 2013 at 5:51