Closed GillesInnov35 closed 5 months ago
Hi Gilles, I have some feedback.
Request Specifications: We can still include the "address" attribute but discourage its use in some way.
Response Specifications: Since the answer will be Y, N-NA, N-AV or N-AD, the term "score" could be misleading. I am still trying to think of a valid alternative to propose, but I can't come up with one right now.
Thanks very much, @GillesInnov35, for creating this issue.
May I ask questions for clarification?
I have one comment: we have agreed that calculating a matching score is for our future releases, so it should not be included in our initial release.
Thanks.
Hi @ToshiWakayama-KDDI, we agreed not to include the matching score. But we also agreed that the score is something we need to "take into consideration in some way", since we will work on it as soon as the first version of the specifications is released. Hence, I think it's important to do something now to enable future improvements.
@GillesInnov35, I figured out my proposal: to find a "middle way" between future developments and Toshi's push for the upcoming first milestone, we can still use the "_match" suffix on response attributes. That way, our future discussions can focus on modifying just the "Y" response. Maybe it could be "Y-nn", where nn is the score? Let's keep the proposals for future discussions.
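To illustrate how a client could stay compatible with a later "Y-nn" extension, here is a minimal Python sketch. The codes Y, N-NA, N-AV and N-AD come from this thread; the `MatchResult` type, the `parse_match_value` helper and the exact "Y-nn" syntax (one to three digits) are assumptions made only for illustration, not an agreed format.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Result codes mentioned in this thread; their exact semantics are not defined here.
KNOWN_CODES = {"Y", "N-NA", "N-AV", "N-AD"}


@dataclass
class MatchResult:
    code: str             # one of KNOWN_CODES
    score: Optional[int]  # only set for the hypothetical "Y-nn" form


def parse_match_value(value: str) -> MatchResult:
    """Parse a *_match response value, tolerating a possible future "Y-nn" form."""
    if value in KNOWN_CODES:
        return MatchResult(code=value, score=None)
    scored = re.fullmatch(r"Y-(\d{1,3})", value)
    if scored:
        # Hypothetical extension: a positive match carrying a numeric score.
        return MatchResult(code="Y", score=int(scored.group(1)))
    raise ValueError(f"unrecognised match value: {value!r}")
```

A client written this way treats "Y" and "Y-87" as the same positive outcome and only reads the score when it is present.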
Hi @StefanoFalsetto-CKHIOD, @ToshiWakayama-KDDI, thanks a lot for your comments. I'll try to explain the proposal:
- Phone number rather than msisdn
- Use of address
- GSMA Mobile Connect KYC Match
Hi @GillesInnov35, I think that this table is also missing Telefonica's proposal, some of our fields (like idDocument) and our vision for the properties. Can you please update it accordingly? Thanks!
Regarding use of address: in our proposal, the address field is composed of its different parts. We consider that having a single field in which the postal address can be included in such a generic way adds complexity and, as @StefanoFalsetto-CKHIOD said, is very country-dependent. So we support having separate fields for its representation.
Hi @fernandopradocabrillo, yes, sure.
Could you send me the list of attributes Telefonica proposes in its solution? Thanks.
Hi @GillesInnov35, here is the Telefonica proposal mentioned by @fernandopradocabrillo: https://github.com/camaraproject/KnowYourCustomer/blob/f153a4799213fc4b0474d156c7b10b490015439e/code/API_definitions/kyc-match.yaml#L143 which can be summarized as: phoneNumber, idDocument, identity (composed of firstName and lastName), address (composed of postalCode, streetName and streetNumber), and birthdate. And the responses would be xxxx_response for each of them.
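For readers who prefer to see the shape, the summary above can be sketched as a request body in Python. The field names are taken from the summary; every value is made up, and the `response_field_names` helper is a hypothetical illustration of the xxxx_response naming, not something taken from the linked YAML.

```python
import json
from typing import Dict, List

# Field names follow the summary above; all values are invented examples.
match_request: Dict[str, object] = {
    "phoneNumber": "+34600000000",
    "idDocument": "00000000A",
    "identity": {"firstName": "Jane", "lastName": "Doe"},
    "address": {
        "postalCode": "28080",
        "streetName": "Example Street",
        "streetNumber": "1",
    },
    "birthdate": "1990-01-01",
}


def response_field_names(request: Dict[str, object]) -> List[str]:
    """Derive the xxxx_response name for each top-level request attribute."""
    return [f"{name}_response" for name in request]


print(json.dumps(response_field_names(match_request), indent=2))
```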
Hi all,
Please find the revised shortlist table below. I have added our proposed parameters/attributes, which are included in our YAML file, to the shortlist table. Also I have added Telefonica's parameters/attributes as well. Hope it is correct. Also I have changed GSMA Match to MobileConnect Match and moved it to the right as MobileConnect is not our proposal.
I have one point to ask you at the moment: our company differentiates between the Subscriber (who makes the contract with us) and the User (who actually uses the phone). For example, a parent is the Subscriber and their child is the User. Do you have the same kind of differentiation?
Match Request Body
CAMARA KYC Match requirements/categories | KDDI KYC Match | Orange KYC Match | Telefonica KYC Match | GSMA KYC Match | Orange Proposal |
---|---|---|---|---|---|
Phone Number | subscriber_phone_number_match | msisdn | phoneNumber | phone_number | phoneNumber |
(special phone number) | main_subscriber_phone_number_match | ||||
ID Document | idDocument | ||||
Subscriber name | user_name_match | name | identity (composed of firstName and lastName) | name | name |
(name reading) | subscriber_name_kana_hankaku_match | ||||
(name reading) | subscriber_name_kana_zenkaku_match | ||||
(given name) | given_name | (included in identity) | given_name | givenName | |
(family name) | family_name | (included in identity) | family_name | familyName | |
Subscriber Postal Code | subscriber_postal_code_match | postalCode | (included in address) | postal_code | |
Subscriber Address | subscriber_formatted_match | address (composed of postalCode, streetName and streetNumber) | address | address | |
(street name) | street_name | (included in address) | house_or_housename | streetName | |
(street number) | (included in address) | ||||
Subscriber Address-Region | subscriber_region_match | ||||
Subscriber Address-Town | locality | locality | locality | ||
Subscriber Address-Country | country | country | country | ||
Subscriber Birthdate | subscriber_birthdate_match | birthdate | birthdate | birthdate | birthdate |
Subscriber Email Address | |||||
User Name | user_name_match | ||||
(user name reading) | user_name_kana_hankaku_match | ||||
(user name reading) | user_name_kana_zenkaku_match | ||||
User Birthdate | user_birthdate_match | ||||
3rd party ID | cp_id | ||||
service_id |
KYC Match Response
CAMARA KYC Match requirements/categories | KDDI KYC Match | Orange KYC Match | Telefonica KYC Match | GSMA KYC Match | Proposal |
---|---|---|---|---|---|
Phone Number | subscriber_phone_number_match | msisdn | phoneNumber_response | phone_number | |
(special phone number) | main_subscriber_phone_number_match | ||||
ID Document | idDocument_response | ||||
Subscriber name | subscriber_name_match | name_score | identity_response | name | |
(name reading) | subscriber_name_kana_hankaku_match | ||||
(name reading) | subscriber_name_kana_zenkaku_match | ||||
(given name) | given_name_score | (included in identity) | given_name | ||
(family name) | family_name_score | (included in identity) | family_name | ||
Subscriber Postal Code | subscriber_postal_code_match | postalCode_score | (included in address) | postal_code | |
Subscriber Address | subscriber_formatted_match | address_response | address | ||
(street name) | street_name_score | (included in address) | house_or_housename | ||
(street number) | (included in address) | ||||
Subscriber Address-Region | subscriber_region_match | ||||
Subscriber Address-Town | locality_score | locality | |||
Subscriber Address-Country | country_score | country | |||
Subscriber Birthdate | subscriber_birthdate_match | birthdate_score | birthdate_response | birthdate | |
Subscriber Email Address | email_score | ||||
User Name | user_name_match | ||||
(user name reading) | user_name_kana_hankaku_match | ||||
(user name reading) | user_name_kana_zenkaku_match | ||||
User Birthdate | user_birthdate_match |
Many thanks, Toshi
Hi all,
Also, please find below a shortlist table for KYC Fill-in attributes/parameters based on our Fill-in YAML.
Any comments would be welcome.
Fill-in Request Body
CAMARA KYC Fill-in requirements/categories | KDDI KYC Fill-in | No other Fill-in proposals | Proposal |
---|---|---|---|
3rd party ID | cp_id |
Fill Response
CAMARA KYC Fill-in requirements/categories | KDDI KYC Fill-in | No other Fill-in proposals | Proposal |
---|---|---|---|
Phone Number | subscriber_mobile_phone | ||
Subscriber name | subscriber_name | ||
(family name) | subscriber_name_family | ||
(given name) | subscriber_name_first | ||
(name reading) | subscriber_name_kana_hankaku | ||
(family name reading) | subscriber_name_kana_hankaku_family | ||
(given name reading) | subscriber_name_kana_hankaku_first | ||
(name reading) | subscriber_name_kana_zenkakuku | ||
(family name reading) | subscriber_name_kana_zenkaku_family | ||
(given name reading) | subscriber_name_kana_zenkaku | ||
Subscriber Postal Code | subscriber_postal_code | ||
Subscriber Address | subscriber_formatted | ||
Subscriber Address-Region | subscriber_region | ||
Subscriber Birthdate | subscriber_birthdate | ||
Subscriber Gender | subscriber_gender | ||
Subscriber Email Address | subscriber_mail_address | ||
User Name | user_name | ||
(user family name) | user_name_family | ||
(user given name) | user_name_first | ||
(name reading) | user_name_kana_hankaku | ||
(family name reading) | user_name_kana_hankaku_family | ||
(given name reading) | user_name_kana_hankaku_first | ||
(name reading) | user_name_kana_zenkakuku | ||
(family name reading) | user_name_kana_zenkaku_family | ||
(given name reading) | user_name_kana_zenkaku | ||
User Birthdate | user_birthdate |
Many thanks, Toshi
@ToshiWakayama-KDDI, Orange KYC offers also differentiate between subscriber and user. The 3-legged authentication architecture is based on the user, who authenticates and should give consent. But the information returned by the service concerns the subscriber who signed the contract. @fernandopradocabrillo, idDocument is part of the TF API Match design. Is the type of document concerned never specified?
I have a question regarding Toshi's proposal, which includes language-specific information (user_name and user_name_kana_hankaku). Does it mean we should introduce a dataType attribute valued with InternationUserClass, JapaneseUserClass, etc.? This kind of information to type the data has, for example, been included in the DeviceLocation API definition. Gilles
Hi @GillesInnov35, thanks. Just to double check: are the address, name, email etc. currently proposed by Orange all for Subscribers?
Thanks.
Hi @GillesInnov35 , Thanks for the information! I have just looked at DeviceLocation YAMLs, but I could not find it (dataType). Could you advise me which YAML has it (dataType)?
Thanks
Hi @ToshiWakayama-KDDI ,
- yes, the information returned or compared by the Orange Match ID API concerns only the subscriber's information.
- in the DeviceLocation API definition, the attribute which specifies the type of the class is areaType (circle or polygon).
Hi Gilles @GillesInnov35, thank you very much. I will look into it quickly, together with my internal colleagues.
@fernandopradocabrillo, idDocument is part of the TF API Match design. Is the type of document concerned never specified?
Hi @GillesInnov35, that's correct, we decided not to include the idDocument type in the proposal since it added unnecessary complexity. In the end we want to check whether the idDocument provided matches the one stored by the MNO; the important thing here is the number itself.
@ToshiWakayama-KDDI From our side, we also do the match against subscriber's information only.
Hi @fernandopradocabrillo , Thank you.
Hi @GillesInnov35, I have quickly checked 'areaType' in the location-retrieval API and the location-verification API, but I am not yet sure whether we could introduce UserClass attributes in a similar way for our purpose. Anyway, at the moment, we are not considering introducing new attributes like UserClass, as it is better for us to keep our first version simple with only the required attributes.
Many thanks,
Thanks a lot @ToshiWakayama-KDDI. As we are currently discussing which attributes should be mandatory, my question was: should the kana-specific attributes be part of the KYC-Match request definition?
In the Netherlands, we currently have the following attribute list in use:
We don't use street name, town etc because in the Netherlands postal code + house number + house number extension is very exact already.
We already have relatively high match rates (up to 80% for family name). Nevertheless, I think we can still improve by the following: instead of Given Name Initials, use the following attributes in parallel:
Often people only record their first Given Name or Initial (although many have multiple Given Names). The use of initials can help for cases where there are multiple ways how to write a given name (for example Steve and Stephen).
In the Netherlands we have a list of prefixes that we usually strip from the family name. The reason we do this is that the prefixes can be abbreviated, which hinders the matching. What we can add is an extra attribute in which you compare these prefixes.
For Family Name, I think we can improve by adding the Family Name at birth as a separate attribute. In the Netherlands, your family name can change when you get married, so this may change during your lifetime. Your Family Name at birth never changes, and when available, it is better for matching because it stays constant.
We do not use street name, because our postal code + house number + house number extension is very exact.
So, we would propose the following list (for NL):
Annex B - MC Product Specification - Match, v1.4.xlsx
Also attached is the list of specs we currently use for Match in NL. It also includes the list of prefixes we strip from the family name.
Thanks @HuubAppelboom. I think we should be able to identify a shortlist of attributes common to all designs and propose a first draft.
I agree with @GillesInnov35 in the sense that I think we should see it from the perspective of a Service Provider that is asking a user for some contact information, and shows a form to collect several fields of data. Then, IMHO, and recognizing I don't know the habits in the Netherlands, I don't think the Service Provider is going to ask the user for, for example, all potential ways of expressing their name, but will ask for the most common way to express the name in that country.
The issue is not that we think Service Providers should ask end users for all the different possible variations, but that MNOs and Service Providers have a history and a way of working in collecting the data. For example, in the Netherlands we have a couple of MNOs which have only collected initials. Making Given Name(s) the only option will not work in this case (that's why we have opted for initials-only in the Netherlands, deviating from the Mobile Connect standard).
The other issue is that when you ask for matching on all initials (or given names), and provide that as the only option, you will see that 2nd and 3rd initials are often missing in current databases (at least we have seen that), which results in a lower match rate than you could have. That's why we propose to make several attribute fields available in the standard, and to match on all the fields that you have available. The same principle would apply to family name: if you also have the family name at birth available, you can also provide a match on this. In the end, you can safely get to a higher overall match rate this way, without the need to go to more complex solutions like a match score based on whether the attributes are similar.
As far as the availability of data is concerned, in case the MNO does not have a specific attribute in their CRM system, you can always answer with "NA".
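The idea of matching on every field that is available, and answering "NA" where the MNO holds no data, could look roughly like the following Python sketch. The attribute names, the "Y"/"N"/"NA" result strings and the case-insensitive comparison are all illustrative assumptions; a real deployment may compare hashes instead of plain values.

```python
from typing import Dict, Optional


def match_attributes(submitted: Dict[str, str],
                     stored: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Compare each submitted attribute with the MNO's record.

    Returns "Y" or "N" per attribute, or "NA" when the MNO holds no value.
    """
    results = {}
    for name, value in submitted.items():
        on_file = stored.get(name)
        if on_file is None:
            results[name] = "NA"
        elif on_file.strip().casefold() == value.strip().casefold():
            results[name] = "Y"
        else:
            results[name] = "N"
    return results


# Hypothetical CRM record: only the first initial was ever collected.
record = {"firstGivenNameInitial": "S", "familyName": "Jansen",
          "familyNameAtBirth": None}
request = {"firstGivenNameInitial": "s", "allGivenNamesInitials": "SJ",
           "familyName": "Jansen", "familyNameAtBirth": "de Vries"}
print(match_attributes(request, record))
```

Here the missing second initial does not turn into a failed match; it is simply reported as not available, while the attributes the MNO does hold still match.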
Thanks a lot @ToshiWakayama-KDDI. As we are currently discussing which attributes should be mandatory, my question was: should the kana-specific attributes be part of the KYC-Match request definition?
- If yes, a dataType used as a discriminator would be useful to avoid duplicating the attributes concerned
- If no, there's no need to differentiate 2 schemas
Hi @GillesInnov35 ,
Thanks very much. First of all, my understanding is that we are not discussing mandatory attributes, but that all attributes should be optional, as I shared on Tuesday. Of course, we need a mandatory requirement like 'at least one attribute should be included in an API match request'.
So, to answer your question, we would like to have the kana-specific attributes etc. as part of the KYC-Match request definition, as one of the options.
I understand your point that a dataType used as a discriminator would be useful to avoid duplicating the attributes concerned, and I think I need to look into it.
Many thanks, Toshi
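As a rough illustration of the discriminator idea discussed above, the following Python sketch uses a dataType value to decide which attributes a request may carry. The dataType values "INTERNATIONAL" and "JAPANESE" are hypothetical stand-ins for the class names floated in this thread, and the attribute sets are an invented subset, not an agreed schema.

```python
from typing import Any, Dict, Set

# Hypothetical discriminator values and attribute sets, for illustration only.
ALLOWED_ATTRIBUTES: Dict[str, Set[str]] = {
    "INTERNATIONAL": {"name", "givenName", "familyName"},
    "JAPANESE": {"name", "nameKanaHankaku", "nameKanaZenkaku"},
}


def unexpected_attributes(payload: Dict[str, Any]) -> Set[str]:
    """Return the attribute names that are not allowed for the payload's dataType."""
    data_type = payload.get("dataType")
    if data_type not in ALLOWED_ATTRIBUTES:
        raise ValueError(f"unknown dataType: {data_type!r}")
    supplied = set(payload) - {"dataType"}
    return supplied - ALLOWED_ATTRIBUTES[data_type]
```

With this shape, the common attributes are declared once and the kana attributes only appear in the schema selected by the discriminator.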
Hi @HuubAppelboom , @javier-carrocalabor , @GillesInnov35 ,
Thank you, all, for your comments. Now I understand that the Netherlands needs some specific attributes. As I shared on Tuesday, I would propose to include all the required attributes in our first version, both commonly used attributes and country/market specific attributes, if we categorise them. I think that all the attributes should be optional, as it seems there are many ways to use this API/KYC-Match functionality, so it is difficult to identify mandatory ones. Of course, we need some mandatory requirement like 'there should be at least one attribute included in an API request'.
If you think we may need categorisation of Common attributes and Country/Market specific attributes, we could write it down somewhere in YAML or in API documentation.
What do you think?
Many thanks, Toshi
Hi @ToshiWakayama-KDDI,
I would indeed support including all attributes, both commonly used and country/market specific ones. As a rule, I would suggest that, when you can, you support all attributes for which you have data.
For example, for NL we currently do not support streetname (because it is not necessary here), but for the sake of international compatibility we will implement it.
On the customer side, the customer can always choose which attributes will be asked to be matched (with the minimum of one of course). For example, for some cases we only need address verification and nothing else, because the customer is already using a different source for the name, date of birth, email etc.
What should also be prevented is customers submitting data they don't actually have, because this will give you wrong match rate statistics. For example, we had one customer that did not have Date of Birth data, so instead they always submitted "YYYY-MM-DD" as a hashed string, which of course never matches, or a dummy date like "1900-01-01". You will get low match rates, and it really takes some time to find out what is going wrong. So in any case, customers must always submit valid data, and not dummy data.
With kind regards Huub
Hi @ToshiWakayama-KDDI, the term "mandatory" was not appropriate because, as you say, all attributes should of course be optional (except phone number). I meant the attributes we'd like to see in the API design (which will be the common attributes). Thanks a lot
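The rule discussed above (every attribute optional, but at least one attribute to match in each request) can be sketched as a small validation step. This is only a sketch: treating `phoneNumber` as the one required field follows the comment above and is an assumption rather than an agreed decision, and `validate_match_request` is a hypothetical helper.

```python
from typing import Dict, List


def validate_match_request(body: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the request is acceptable."""
    problems = []
    if not body.get("phoneNumber"):
        # Assumption: the phone number identifies the subscriber to match against.
        problems.append("phoneNumber is required")
    # Everything else is optional, but at least one non-empty attribute is needed.
    attributes = {k: v for k, v in body.items() if k != "phoneNumber" and v}
    if not attributes:
        problems.append("at least one attribute to match must be supplied")
    return problems
```

For example, a body with only `phoneNumber` is rejected, while `phoneNumber` plus `postalCode` is enough for an address-only check.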
As I said in some other comments, I will be happy to discuss deprecating the address attribute. It is far better (for many countries around the world) to use different attributes for the individual address components.
In order to find the right initial set of attributes, I am sharing the full set of attributes that CKH (and hence all the affiliated operators) is offering to its Partners. As you can see, we support all the attributes defined in the GSMA IDY.28 specifications plus some custom ones (e.g., the age verification). Some of the address-related attributes, such as houseno_or_housename_hash, are used for historical reasons but will be deprecated in future. Moreover, some of the custom attributes are calculated on the fly by processing atomic data obtained from MNOs (e.g., age and age_is_greater_than are calculated using the birthdate).
Requested Attribute | Returned value |
---|---|
account_state | Active/Inactive |
age_hash | True/False |
age_is_greater_than | True/False |
address_line1_hash | Y/N-NA/N-AV |
address_line2_hash | Y/N-NA/N-AV |
billing_segment | PAYM/PAYG |
birthdate_hash | Y/N-NA/N-AV |
city_or_province_hash | Y/N-NA/N-AV |
country_hash | Y/N-NA/N-AV |
email_hash | Y/N-NA/N-AV |
family_name_hash | Y/N-NA/N-AV |
flat_number_hash | Y/N-NA/N-AV |
gender_hash | Y/N-NA/N-AV |
given_name_hash | Y/N-NA/N-AV |
house_name_hash | Y/N-NA/N-AV |
house_number_hash | Y/N-NA/N-AV |
houseno_or_housename_hash | Y/N-NA/N-AV |
is_adult | True/False |
is_age_verified | True/False |
is_email_verified | True/False |
is_lost_stolen | True/False |
middle_name_hash | Y/N-NA/N-AV |
postal_code_hash | Y/N-NA/N-AV |
title_hash | Y/N-NA/N-AV |
town_hash | Y/N-NA/N-AV |
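As a sketch of what "calculated on the fly using the birthdate" could mean for attributes such as age_is_greater_than and is_adult, here is a small Python example. The function names mirror the attribute names in the table, but the logic is an assumption for illustration and not CKH's implementation; in particular the default adult age of 18 is only a placeholder.

```python
from datetime import date


def age_on(birthdate: date, today: date) -> int:
    """Completed years between birthdate and today."""
    had_birthday = (today.month, today.day) >= (birthdate.month, birthdate.day)
    return today.year - birthdate.year - (0 if had_birthday else 1)


def age_is_greater_than(birthdate: date, threshold: int, today: date) -> bool:
    """True when the completed age is strictly above the threshold."""
    return age_on(birthdate, today) > threshold


def is_adult(birthdate: date, today: date, adult_age: int = 18) -> bool:
    # The age of majority varies by country; 18 is only a placeholder default.
    return age_on(birthdate, today) >= adult_age
```

Passing `today` explicitly keeps the calculation deterministic and easy to test.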
Hi @StefanoFalsetto-CKHIOD ,
Thank you for the comment, but I think the address attribute is required. As you pointed out in your previous comment, the aggregated field depends on country rules, which I think is true, and in some countries like Japan, customers need the aggregated address field, mainly because it is difficult to split our address into separate fields.
I think both the aggregated address field and the split address fields can exist as optional fields. If an MNO does not support a specific attribute and is asked about it, it can answer with Not_Available or something similar. It may be better to share which attributes are supported by an MNO and which are not, but this would be a business matter or could be a future topic.
Thanks, Toshi
Hi @GillesInnov35 , @fernandopradocabrillo , @javier-carrocalabor , @HuubAppelboom , @StefanoFalsetto-CKHIOD ,
Thank you for your comments. I feel our discussion is spreading and exploding (sorry I don't know the proper word) and we have to start converging our discussion, considering our target time.
I have some suggestions for converging our discussion, as below:
1. Regarding Age attributes, they need some calculation and are also related to the new API 'Age Verification', so I would suggest delaying them for future enhancement.
2. Regarding attributes requiring calculation or processing, I would suggest delaying them for future enhancement. We have agreed to delay Match Scoring for future enhancement, and Hashing and Age are the same. (Because solution discussion is needed and it would take time.)
3. Regarding attributes not related to subscribers/users, e.g. account active/inactive, I would suggest delaying them for future enhancement. (Because we need to discuss whether they are required or not, as it is unclear whether this is KYC information.)
4. Regarding any attributes requiring complex and deep discussion, I would suggest delaying them for future enhancement. (Because of our short time.)
Any views?
Considering No.4 above, we can agree to delay User information attributes (separate from Subscriber/Contractor information) for future enhancement.
Thanks, Toshi
Hi, I agree with @ToshiWakayama-KDDI's proposal to target a limited list of attributes in this first version, even if it does not cover the full scope of existing offers. If we look at what TM Forum (which is a major standard) proposes for a party/individual resource, the list of attributes which define a person is limited to a few fields. It means that such a list already exists in other specifications. It could be a good example, right?
For information, see below some of the fields in the TMF 632 party (individual) specification:
"givenName": "Jane",
"familyName": "Lamborgizzia",
"legalName": "Smith",
"middleName": "JL",
"fullName": "Jane Smith ep Lamborgizzia",
"formattedName": "Jane Smith ep Lamborgizzia",
"birthDate": "1967-09-26T05:00:00.246Z",
Geographic address
"city": "Morristown",
"country": "USA",
"postCode": 7960,
"stateOrProvince": "New Jersey",
"street1": "240 Headquarters Plazza",
"street2": "East Tower - 10th Floor"
ContactMedium
"emailAddress": "jane.lamborgizzia@gmail.com"
"phoneNumber": "+112785426565"
As formattedName exists for Name, a formattedAddress could be added to aggregate the address fields.
I agree with @ToshiWakayama-KDDI's points for the sake of simplification and, at the same time, to find a common base that can cover most of the needs. In particular, I see @GillesInnov35's proposal (https://github.com/camaraproject/KnowYourCustomer/issues/18#issuecomment-1835672991) as a good starting point to achieve this.
In my experience, you will need a minimum list of attributes that is needed to properly identify a person.
Unfortunately this list of what is needed varies per country. So, in case you want to work cross border, you will always see a longer list than what is the minimum for a country.
In addition, for a matching process based on hashes, some attributes cause problems that make them less suitable. For example, in the TMF 632 list above, "fullName": "Jane Smith ep Lamborgizzia" will give problems, because it uses the abbreviation "ep", which is probably language dependent. In the Netherlands we have for example "ev" or "wv", or we use a "-" symbol, where it can be Smith-Lamborgizzia or Lamborgizzia-Smith. And other markets have their own habits, which are different. This is exactly the reason why we would like to have the family name at birth as an extra attribute. If the current fullName does not match, but the family name at birth matches, you still know who it is with sufficient precision.
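The sensitivity of hash-based matching to the way a name is written can be shown with a short Python sketch. SHA-256 and the normalisation steps (Unicode NFKC, whitespace collapsing, case folding) are assumptions for illustration; the thread does not name the hash function or any normalisation rules.

```python
import hashlib
import unicodedata


def normalise(value: str) -> str:
    """Illustrative normalisation: Unicode NFKC, collapse whitespace, fold case."""
    text = unicodedata.normalize("NFKC", value)
    return " ".join(text.split()).casefold()


def attribute_hash(value: str) -> str:
    # SHA-256 is an assumption; the thread does not name the hash function.
    return hashlib.sha256(normalise(value).encode("utf-8")).hexdigest()


# Spacing and case differences are absorbed by normalisation...
print(attribute_hash("  Lamborgizzia ") == attribute_hash("lamborgizzia"))
# ...but different ways of writing a combined name are not.
print(attribute_hash("Smith-Lamborgizzia") == attribute_hash("Lamborgizzia-Smith"))
# A stable attribute such as the family name at birth can still be compared.
print(attribute_hash("Smith") == attribute_hash("smith"))
```

Because a hash comparison is all-or-nothing, any difference that normalisation does not remove produces a non-match, which is why a separate, stable attribute is attractive.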
I don't think it is wise to make the list of attributes as small as possible, because you run the risk that it becomes too small to be of any use. And for markets where Match is already being used, it makes no sense to come with an API which is less effective.
What may be more pragmatic is to start in one or two countries and define the API there, by defining the minimum that is needed in these markets (and have a better offering than the current Match product). Then do a market-by-market introduction, and in each market add the attributes that are needed to have a minimum set for that market as well. This way you will have a growing list of attributes over time.
PS. In the current EIDAS2 wallet standardisation process in Europe there is also a PID being defined (a list of attributes), which may be worth a look.
Hi @HuubAppelboom, I understand your point of view based on your experience; however, in my opinion the CAMARA approach is to think of a global solution which could be adopted by many operators and partners. If we think "country" from the start, I'm not sure that will be the case. I don't clearly understand why we could not start with a limited list of attributes for which MNOs should be able to compare information and return a match result, even if I agree with you that in some use cases the match result would not be so helpful, depending on the expected level of trust. If we think in terms of code, polymorphism should help us to define new specific schemas inheriting from this first base and perhaps targeting specific countries' requirements. I don't know if my vision is clear enough. I'll also discuss this internally with my colleagues. Thanks a lot
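A minimal sketch of the "base schema plus inheriting country schemas" idea, written with Python dataclasses. The class names and the choice of base attributes are hypothetical; the Dutch-specific fields are named after attributes mentioned earlier in this thread, and nothing here reflects an agreed CAMARA schema.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class BaseMatchRequest:
    # Hypothetical common base: phone number plus a few optional attributes.
    phoneNumber: str
    givenName: Optional[str] = None
    familyName: Optional[str] = None
    postalCode: Optional[str] = None
    birthdate: Optional[str] = None


@dataclass
class NLMatchRequest(BaseMatchRequest):
    # Country-specific additions, named after the Dutch attributes discussed above.
    houseNumberExtension: Optional[str] = None
    familyNameAtBirth: Optional[str] = None


def supplied_attributes(request: BaseMatchRequest) -> List[str]:
    """List the optional attributes that were actually provided."""
    return [f.name for f in fields(request)
            if f.name != "phoneNumber" and getattr(request, f.name) is not None]
```

Code written against `BaseMatchRequest` keeps working when it receives an `NLMatchRequest`, which is the property the polymorphism argument relies on.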
Hi @GillesInnov35 ,
For example, for the Netherlands I don't think any telco would start introducing a CAMARA version of KYC Match that offers significantly less than what is already available today.
Kind regards
Huub
Hi all,
Based on our discussions, I have created a compromise proposal by updating our initial proposed tables (Gilles on 16th Nov and me on 20th Nov), as below. Parameters/attributes in the rightmost column are my proposal.
Please note each of the proposed parameters/attributes has a Match suffix, but this is just my proposal and we have to discuss suffix for Request and Response separately, so, please check what parameters/attributes we need for our initial version.
I think we have to conclude our parameter/attribute discussion within this week, so any comments are welcome.
Match Request Body
CAMARA KYC Match requirements/categories | KDDI KYC Match | Orange KYC Match | Telefonica KYC Match | GSMA KYC Match | Orange Proposal | KPN | Hutchison | Compromised Proposal |
---|---|---|---|---|---|---|---|---|
Phone Number | subscriber_phone_number_match | msisdn | phoneNumber | phone_number | phoneNumber | phoneNumberMatch | ||
(special phone number) | main_subscriber_phone_number_match | mainPhoneNumberMatch | ||||||
ID Document | idDocument | idDocumentMatch | ||||||
Subscriber name | user_name_match | name | identity (composed of firstName and lastName) | name | name | nameMatch | ||
(name reading) | subscriber_name_kana_hankaku_match | nameKanaHankakuMatch | ||||||
(name reading) | subscriber_name_kana_zenkaku_match | nameKanaZenkakuMatch | ||||||
(given name) | given_name | (included in identity) | given_name | givenName | givenNameMatch | |||
(family name) | family_name | (included in identity) | family_name | familyName | familyNameMatch | |||
Subscriber Postal Code | subscriber_postal_code_match | postalCode | (included in address) | postal_code | postalCodeMatch | |||
Subscriber Address | subscriber_formatted_match | address (composed of postalCode, streetName and streetNumber) | address | address | addressMatch | |||
(street name) | street_name | (included in address) | house_or_housename | streetName | streetNameMatch | |||
(street number) | (included in address) | streetNumberMatch | ||||||
Subscriber Address-Region | subscriber_region_match | regionMatch | ||||||
Subscriber Address-Town | locality | locality | locality | localityMatch | ||||
Subscriber Address-Country | country | country | country | countryMatch | ||||
Subscriber Birthdate | subscriber_birthdate_match | birthdate | birthdate | birthdate | birthdate | birthdateMatch | ||
Subscriber Email Address | emailMatch | |||||||
Subscriber name (Initial of the first Given Name) |
(Initial of the first Given Name) | firstGivenNameMatch | ||||||
(All initials of Given Names) | (All initials of Given Names) | allGivenNamesInitialsMatch | ||||||
(The first Given Name) | (The first Given Name) | firstGivenNameMatch | ||||||
(All Given Names) | (All Given Names) | allGivenNamesMatch | ||||||
(Prefixes of the Current Family Name) | (Prefixes of the Current Family Name) | currentFamilyNamePrefixesMatch | ||||||
(Family Name at birth) | (Family Name at birth) | familyNameAtBirthMatch | ||||||
Subscriber Address (House Number Extension) |
(House Number Extension) | houseNumberExtensionMatch | ||||||
Subscriber Gender | subscriber_gender_match | genderMatch | ||||||
3rd party ID | cp_id | cp_id | ||||||
service_id | service_id |
KYC Match Response
CAMARA KYC Match requirements/categories | KDDI KYC Match | Orange KYC Match | Telefonica KYC Match | GSMA KYC Match | KPN | Hutchison | Compromised Proposal |
---|---|---|---|---|---|---|---|
Phone Number | subscriber_phone_number_match | msisdn | phoneNumber_response | phone_number | phoneNumberMatch | ||
(special phone number) | main_subscriber_phone_number_match | mainPhoneNumberMatch | |||||
ID Document | idDocument_response | idDocumentMatch | |||||
Subscriber name | subscriber_name_match | name_score | identity_response | name | nameMatch | ||
(name reading) | subscriber_name_kana_hankaku_match | nameKanaHankakuMatch | |||||
(name reading) | subscriber_name_kana_zenkaku_match | nameKanaZenkakuMatch | |||||
(given name) | given_name_score | (included in identity) | given_name | givenNameMatch | |||
(family name) | family_name_score | (included in identity) | family_name | familyNameMatch | |||
Subscriber Postal Code | subscriber_postal_code_match | postalCode_score | (included in address) | postal_code | postalCodeMatch | ||
Subscriber Address | subscriber_formatted_match | address_response | address | addressMatch | |||
(street name) | street_name_score | (included in address) | house_or_housename | steetNameMatch | |||
(street number) | (included in address) | streetNumberMatch | |||||
Subscriber Address-Region | subscriber_region_match | regionMatch | |||||
Subscriber Address-Town | locality_score | locality | localityMatch | ||||
Subscriber Address-Country | country_score | country | countryMatch | ||||
Subscriber Birthdate | subscriber_birthdate_match | birthdate_score | birthdate_response | birthdate | birthdateMatch | ||
Subscriber Email Address | email_score | emailMatch | |||||
Subscriber name (Initial of the first Given Name) | (Initial of the first Given Name) | firstGivenNameMatch | |||||
(All initials of Given Names) | (All initials of Given Names) | allGivenNamesInitialsMatch | |||||
(The first Given Name) | (The first Given Name) | firstGivenNameMatch | |||||
(All Given Names) | (All Given Names) | allGivenNamesMatch | |||||
(Prefixes of the Current Family Name) | (Prefixes of the Current Family Name) | currentFamilyNamePrefixesMatch | |||||
(Family Name at birth) | (Family Name at birth) | familyNameAtBirthMatch | |||||
Subscriber Address (House Number Extension) | (House Number Extension) | houseNumberExtensionMatch | |||||
Subscriber Gender | subscriber_gender_match | genderMatch | |||||
Thanks, Toshi
Hi all, Toshi again.
I would also like to ask the team whether the number of proposed parameters/attributes is too large for the YAML definition. There are already some country/market-specific attributes, and attributes of this kind may multiply in future. Is there a good technical way to handle them?
For example, these attributes could be categorised as Extended attributes and have 'extended' added before their names. Any attribute starting with 'extended' would then be regarded as country/market specific; it would not need to be listed in the YAML definition, but could be used flexibly for specific countries/markets.
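To make the 'extended' prefix idea concrete, here is a minimal sketch of how a server might separate core attributes from country/market-specific ones. Everything in it is an assumption for illustration: the `split_attributes` helper, the `CORE_ATTRIBUTES` subset and the example attribute names are hypothetical and not part of any agreed definition.

```python
# Hypothetical sketch: split request attributes into core and
# country/market-specific ("extended") ones by name prefix.

EXTENDED_PREFIX = "extended"  # assumed naming convention, as suggested above

# Assumed subset of core attribute names; the real list would come from
# the YAML definition once it is agreed.
CORE_ATTRIBUTES = {"phoneNumber", "name", "givenName", "familyName", "birthdate"}


def split_attributes(request: dict) -> tuple[dict, dict, dict]:
    """Return (core, extended, unknown) attribute dictionaries."""
    core, extended, unknown = {}, {}, {}
    for key, value in request.items():
        if key.startswith(EXTENDED_PREFIX):
            # Not listed in the definition, handled per country/market.
            extended[key] = value
        elif key in CORE_ATTRIBUTES:
            core[key] = value
        else:
            # Neither core nor extended: the operator may ignore or reject it.
            unknown[key] = value
    return core, extended, unknown
```

With this convention an operator could validate the core attributes strictly against the definition and simply skip any extended attribute it does not support.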
Perhaps the 'polymorphism' and schema inheritance that Gilles pointed out could work for this matter?
I don't think we have to solve this matter for our initial version, though.
Thanks, Toshi
Hi @ToshiWakayama-KDDI, I have a question about the partner information (cp_id, service_id) that I see in the attribute list. In 3-legged or 2-legged authentication, consumer information (partner ID) is commonly transmitted in the OAuth token. Could you explain why you think it should be part of the definition? Thanks a lot.
Hi @ToshiWakayama-KDDI, I have some suggestions for your proposal, to see whether the list can be simplified. Regarding 2nd, 3rd or 4th Given Names, it may be better to introduce a Middle Name(s) attribute instead. The Given Name is then always a single name, and the Middle Name(s) are the 2nd, 3rd, 4th etc. Given Names. This is especially important because people do not always provide all their given names (usually one or all).
For the cases where only initials are available, we would use only the initial of the Given Name and the initials of the Middle Names.
Prefixes are something we can omit from the matching process, as long as it is defined that prefixes are always omitted from the Family Name. For a given area/country, we can define lists of commonly used prefixes (for the Netherlands such a list is already available).
If we do this, the list for a compromise can become somewhat shorter:
with kind regards Huub
Hi, Thank you all for the contributions to the debate.
I really think that the list in https://github.com/camaraproject/KnowYourCustomer/issues/18#issuecomment-1840159874 is too long. The idea is that too many parameters lead the API clients to have unclear expectations about what is and what is not implemented.
I agree with @ToshiWakayama-KDDI about delaying for future versions parameters that are complex or not clear https://github.com/camaraproject/KnowYourCustomer/issues/18#issuecomment-1835634611
I agree with @GillesInnov35 about getting inspiration from TMF 632 https://github.com/camaraproject/KnowYourCustomer/issues/18#issuecomment-1835672991 and with @HuubAppelboom about EIDAS2 https://github.com/camaraproject/KnowYourCustomer/issues/18#issuecomment-1837259897. I have found this reference for your consideration: https://github.com/eu-digital-identity-wallet/eudi-doc-architecture-and-reference-framework/blob/main/docs/arf.md#5111-pid-attributes-for-natural-persons
So, trying to follow these ideas and to think of a global solution according to https://github.com/camaraproject/KnowYourCustomer/issues/18#issuecomment-1838719726, our proposal is to shorten the list of parameters as much as possible, as long as they carry enough semantics for the current requirements. In this sense, this is the example with which we would feel comfortable:
Having said that, I think that, in any case, too many parameters in a plain list may confuse API clients about what can be used in each country or by each operator, and what is actually implemented in each case. If clients really need so many options, perhaps Gilles is right in his comment (https://github.com/camaraproject/KnowYourCustomer/issues/18#issuecomment-1838719726) and we need to exploit the potential of inheritance/polymorphism. I have found this useful reference about it: https://swagger.io/docs/specification/data-models/inheritance-and-polymorphism/ In this way, if needed, we could separate sets of parameters and specify when and where each set applies. I don't think the CAMARA guidelines (https://github.com/camaraproject/Commonalities/blob/main/documentation/API-design-guidelines.md) say anything about this, so I think we are pushing their current limits. But let me insist that a plain list of too many parameters around the same concept confuses clients and leaves them lost among many options, with no certainty about what they will or won't get when calling the API.
The last point is about particles, symbols, etc. referred to in https://github.com/camaraproject/KnowYourCustomer/issues/18#issuecomment-1837259609. Regardless of the previous considerations, in order to maximize the matching results, I think the operator could apply some kind of normalization to the request parameters before matching them against its internal information. For example, general rules like lower-casing the characters and removing spaces, dots, hyphens, etc., and even the usual "stop words", will immediately improve the matching results, even before matching scores are introduced in later versions.
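The normalization rules listed above could look roughly like the following sketch. The `normalize` function and the `STOP_WORDS` set are hypothetical; the stop-word list in particular is only an illustrative assumption, since the thread does not define one.

```python
import re

# Assumed, illustrative stop-word list; a real one would have to be
# agreed per language or country.
STOP_WORDS = frozenset({"van", "de", "der", "von"})


def normalize(value: str, stop_words: frozenset = frozenset()) -> str:
    """Lower-case, drop spaces/dots/hyphens and optional stop words."""
    lowered = value.lower()
    # Split on runs of whitespace, dots and hyphens so that stop words
    # can be recognised as separate tokens before everything is joined.
    tokens = re.split(r"[\s.\-]+", lowered)
    kept = [t for t in tokens if t and t not in stop_words]
    return "".join(kept)


# Both sides of the comparison are normalized before matching.
requested = normalize("Jean-Pierre  Van der Berg", STOP_WORDS)
stored = normalize("jean pierre berg", STOP_WORDS)
is_match = requested == stored
```

Applying the same function to the request value and to the operator's stored value means differences in case, spacing or punctuation no longer cause a mismatch.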
Regarding what will be used in the eIDAS2 wallet, with ARF version 1.2, there will be a detailed PID Rule Book published, which will be of interest. ARF 1.2 is unfortunately not published yet, but is expected soon.
Regarding the matching process, what we do in the Netherlands is also normalize special characters that are not commonly used in our area, partly because these special characters are often not supported by the CRM systems. Also, in the German language for example, there are specific mappings for special characters. It may be best to define these as part of instructions on how to normalize in a specific country or language area. If both parties apply these rules, the reward is a much higher matching rate; if either party does not, the matching rate will be lower. It will be very difficult to set rules for this on a global scale, which is why we propose to do this per area (perhaps per country code would be a good option).
Regarding idCardnumber: in most markets you can have several IDs (for example, we have driving license, passport, ID card). For the matching result to make sense, you should communicate back what kind of ID has been matched against. One issue with idCardnumber is that as soon as you renew an ID, the number changes, so I doubt whether you will get a high match rate in practice.
In general, I am not too worried about the attribute list being a bit long; I am more worried about trying to put too many flavours into a single attribute. For example, we tried working with all initials available for the given name, but that resulted in too low a match rate, simply because either side (MNO or Relying Party) did not have all initials at their disposal. The same will happen if you do this with given names, or for example with an attribute holding all the address details. The more you try to push into a single match result, the higher the chance of a mismatch, and that is why we propose to split the 1st given name from middle names, street name from street number, street number extension from street number, etc.
Hello all, this is very interesting, and it is good that we are converging on a solution.
@HuubAppelboom could you complete your proposal with some example attribute values, so we can see what kind of information is expected? I don't see clearly how middleNamesInitialsMatch and middleNamesMatch will be valued (array or single value). Thanks a lot. Concerning idDocument, if we are to keep it, I think a structure individualIdentification: {name, value} might be used, for example [{"national ID card", "124587652"}]. The objective is to be as clear as possible about what the ID refers to.
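As an illustration of per-country normalization rules, here is a minimal sketch keyed by country code. The `COUNTRY_CHAR_MAPS` table and the `normalize_for_country` function are hypothetical; only the common German transliterations (ä to ae, ö to oe, ü to ue, ß to ss) are filled in, and the fallback of stripping diacritics is an assumption rather than an agreed rule.

```python
import unicodedata

# Assumed per-country character mappings, keyed by ISO 3166-1 alpha-2 code.
COUNTRY_CHAR_MAPS = {
    "DE": {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"},
}


def normalize_for_country(value: str, country_code: str) -> str:
    """Lower-case, apply country-specific mappings, then strip diacritics."""
    # Compose characters first so the precomposed keys in the map match.
    text = unicodedata.normalize("NFC", value).lower()
    for source, target in COUNTRY_CHAR_MAPS.get(country_code, {}).items():
        text = text.replace(source, target)
    # Assumed fallback: decompose and drop any remaining combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

Both the API client and the operator would need to apply the same table for a given country code; otherwise "Müller" normalizes to "mueller" on one side and "muller" on the other and the match fails.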
Regards
Hi @Javier, Hi Huub, Hi Gilles,
Thank you for your further comments. Like Huub, I am not worried about the length of the currently proposed attribute list (mine and Huub's). So, Huub's proposed list (plus cp_id/service_id) would be pretty much fine with me.
I can understand the view of making the attribute list as short and simple as possible, however, currently proposed attributes are required by operators and their customers, so, I think there is no point deleting required attributes in order to make the list simple. (For example, we are providing Matching for the single 'name' attribute and the single/formatted 'address' attribute which our customers need.)
For the API clients, they can use attributes they need and can just ignore attributes they do not need. To avoid their confusion, we can prepare proper description and explanation for each API and further we could prepare some typical examples of attributes set for some typical use cases.
For the operators, they can just ignore requests for attributes they do not have.
So, it is a case of 'the greater embraces the less', and I don't believe Huub's proposed list (plus cp_id/service_id) is too long. Could we accept it for our first version?
Thanks, Toshi
Regarding the middleNames attribute, there are two ways we can do this when there is more than one middle name.
Take for example: Robertus Mattheus Franciscus Janssen. Here, Robertus is the given name (always the first one), Mattheus Franciscus are the middle names, and Janssen is the family name.
For Mattheus Franciscus, we could choose to make it one long string, with everything lowercase, without spaces etc., and hash the result. So in the end you will receive a hash of "mattheusfranciscus".
The alternative would be to make it a list of middle names and hash each middle name separately (after making everything lowercase). You then receive a list of two hashes (of "mattheus" and "franciscus"), and for each hash you provide a Y/N indicating whether you also have it in your list. (Here I assume the order of the middle names is not that relevant.)
The alternative will probably give a higher match rate: if only one of the middle names mismatches, you still have a partial match. What do you think?
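The two options can be compared with a short sketch. The thread does not name a hash algorithm, so SHA-256 is an assumption here, and the helper names (`hash_combined`, `hash_each`, `match_each`) are hypothetical.

```python
import hashlib


def _hash(text: str) -> str:
    # SHA-256 is an assumption; no algorithm has been agreed in the thread.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def hash_combined(middle_names: list[str]) -> str:
    """Option 1: one hash over all middle names, lower-cased, no spaces."""
    joined = "".join(name.lower().replace(" ", "") for name in middle_names)
    return _hash(joined)


def hash_each(middle_names: list[str]) -> list[str]:
    """Option 2: one hash per lower-cased middle name."""
    return [_hash(name.lower()) for name in middle_names]


def match_each(request_hashes: list[str], operator_names: list[str]) -> list[str]:
    """Return Y/N per requested hash, ignoring the order of the names."""
    known = set(hash_each(operator_names))
    return ["Y" if h in known else "N" for h in request_hashes]


# The operator only holds one of the two middle names, plus another one.
request = hash_each(["Mattheus", "Franciscus"])
per_name_result = match_each(request, ["Franciscus", "Johannes"])
combined_result = hash_combined(["Mattheus", "Franciscus"]) == hash_combined(
    ["Franciscus", "Johannes"]
)
```

In this example the per-name option still reports a partial match (one Y, one N), while the single combined hash simply fails to match.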
Adding the type of ID document may be a good idea, but I know this can also become a bit complex, with a long list. For example, in the Netherlands we also have special ID documents like a permit for fugitives, an ID card for embassy staff, etc. Can we agree on a short list of the most common types, plus one category Other? For example: Passport, Driving License, IDCard and Other?
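A short list of document types combined with the {name, value} structure suggested earlier in the thread could be modelled as in the sketch below. The class and function names are hypothetical, and the four types are simply the ones proposed above, not an agreed enumeration.

```python
from dataclasses import dataclass
from enum import Enum


class IdDocumentType(Enum):
    # Short list proposed in the comment above; not an agreed enumeration.
    PASSPORT = "Passport"
    DRIVING_LICENSE = "Driving License"
    ID_CARD = "IDCard"
    OTHER = "Other"


@dataclass(frozen=True)
class IdDocument:
    type: IdDocumentType
    value: str


def parse_id_document(name: str, value: str) -> IdDocument:
    """Map an incoming document name to the short list, defaulting to Other."""
    try:
        doc_type = IdDocumentType(name)
    except ValueError:
        # Anything outside the short list falls into the Other category.
        doc_type = IdDocumentType.OTHER
    return IdDocument(doc_type, value)
```

Returning the document type alongside the match result would let the API client see which kind of ID the number was matched against.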
CAMARA KYC Match - Specifications
Below is a proposed comparison matrix between the different offers' specifications and the initial CAMARA requirements proposal. Key points:
Request Specifications
Response Specifications