OpenMined / opus

Apache License 2.0

Fraud prevention - How do we prevent fake accounts from accumulating too much identity verification? #17

Open carrollgt91 opened 4 years ago

carrollgt91 commented 4 years ago

In every online-enabled system, fake accounts run rampant. If we're primarily using SSO to verify a user's individuality, even the strongest SSO integrations for identity verification (e.g. banks) could be fraudulently acquired.

For example, as an individual in the US, I could have bank accounts with 3 banks, each with SSO. I could sell access to two of those accounts to others, who could then use those accounts to bolster the identity of a fake account.

carrollgt91 commented 4 years ago

One way of thinking about addressing this is attempting to reach "across" SSO accounts to verify individuality. Especially if we're using tools like web scraping in order to collect more information than is present in the SSO APIs, then we could look to confirm various identifying bits of information about the individual that way.

For example, we could obtain "name", "address", "phone number", and "email" from each provider, and temporarily cache them while we perform this comparison. If they "mostly" align, for some suitable definition of "mostly", we can allow each integration to be marked as "valid". If the information differs too much (e.g. separate addresses for two separate services, or no overlap between verifiable information at all), we could warn the user that the account was not integrated due to a lack of cohesion.
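
A rough sketch of that comparison in Python, purely for illustration. Everything here is an assumption rather than anything that exists in the repo: the field names, the `pair_agreement` and `is_cohesive` helpers, the exact-match-after-normalization rule, and the 0.75 threshold standing in for "mostly". Each provider's data is modeled as a plain dict held only in memory.

```python
from itertools import combinations

# Hypothetical set of fields we might get back from each provider.
FIELDS = ("name", "address", "phone", "email")


def normalize(value):
    """Lowercase and collapse whitespace; missing or empty values become None."""
    return " ".join(value.lower().split()) if value else None


def pair_agreement(a, b):
    """Fraction of fields present in both profiles whose values match.

    Returns None when the two profiles share no fields at all.
    """
    shared = [f for f in FIELDS if normalize(a.get(f)) and normalize(b.get(f))]
    if not shared:
        return None
    matches = sum(normalize(a[f]) == normalize(b[f]) for f in shared)
    return matches / len(shared)


def is_cohesive(profiles, threshold=0.75):
    """True if every pair of profiles agrees on at least `threshold` of shared fields.

    Assumption: fewer than two profiles, or any pair with no overlapping
    fields, counts as not cohesive.
    """
    scores = [pair_agreement(a, b) for a, b in combinations(profiles, 2)]
    if not scores or any(s is None for s in scores):
        return False
    return min(scores) >= threshold


bank = {"name": "Ada Lovelace", "address": "1 Main St", "phone": "555-0100", "email": "ada@example.com"}
mail = {"name": "ada  lovelace", "address": "1 main st", "phone": "555-0199", "email": "ada@example.com"}
sold = {"name": "Ada Lovelace", "address": "9 Other Rd", "phone": "555-0142", "email": "ada@example.com"}

print(is_cohesive([bank, mail]))        # 3 of 4 shared fields match
print(is_cohesive([bank, mail, sold]))  # the third profile drags the minimum down
```

Exact matching is almost certainly too strict for real addresses and names, so the comparison rule is probably the first thing to swap out once we have sample data.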

kevinahuber commented 4 years ago

Maybe a good exercise would be to try to define "mostly" and pull down our own data across our initial target integrations.

Another question for this approach would be how does it start? What would the minimum threshold be for us to start enforcing "cohesion"?

Also, what sort of people would this exclude? Is there a higher percent of "cohesion" for particular populations?

chaitanyajun12 commented 4 years ago

Wouldn't it be the case that even with strong cohesion, it's not guaranteed that the user's identity is verified?

The issue is this: suppose user A's accounts are managed or fraudulently acquired by user B, and B uses them in the PIS to identify himself. Our job is then to detect users like B, and either give them a minimal cohesion score or not let consumers use their conclusive data.

One solution I can think of is biometric verification of the individual. For instance, India has the Aadhaar card for identity verification, for which biometric information is collected from the individual in question. I think the same would be the case for SSN as well. I don't know if there will be a way to access this biometric data. In my opinion, this is the strongest verification possible.

Assuming that we have biometric data available, when the consumer app needs to access conclusive data, the user would have to undergo a biometric scan (fingerprint, etc.), and within the PIS we would have to validate the match before proceeding further.

carrollgt91 commented 4 years ago

Yes, I think that's a good approach, @kevinahuber. Getting a sense of what could be obtained from an initial suite of integrations, both from the APIs and from "web scraping" (in this case, visually verifying that we can obtain the data in question), would help.

The minimum threshold is a tricky problem - I think it will be inherently tied to how many accounts can provide verified information, how many of those accounts folks have on average, and other variables that we don't have enough information to grasp yet. I think this goes for defining "mostly cohesive" as well.

carrollgt91 commented 4 years ago

@chaitanyajun12, totally agree that this is also a non-starter. I think biometric verification is very important, but at least in the USA, FaceID and fingerprint scanning are only implemented for mobile technologies, and are entirely decoupled from identity for privacy purposes (at least in the Apple ecosystem - I'm less familiar with Android).

I do think these technologies have a solid place in this conversation, especially in countries that have more accessible systems, but I think the problem you're posing might fit more neatly into this issue: #16