NCATComp410 / comp410_spring_2024

COMP410 spring 2024 semester
MIT License

Anonymization Quality Review #71

Open Soyboy3911 opened 7 months ago

Soyboy3911 commented 7 months ago

This issue is focused on evaluating the effectiveness and accuracy of the anonymization procedure used for Personally Identifiable Information (PII) in our project.

SyGitGud commented 7 months ago

No issues were found with the CRYPTO anonymization. Although the first address passes the checksum, it begins with two numeric characters and is not a valid Bitcoin wallet address, so it was not anonymized. The valid Bitcoin address was anonymized correctly.
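For anyone repeating this check, here is a minimal sketch of a legacy Bitcoin address test, assuming Base58Check encoding. It is not the project's recognizer: the function names `base58check_ok` and `looks_like_legacy_address` are made up for illustration, and the prefix and length rule only covers legacy addresses starting with `1` or `3`. It shows that the checksum and the format rules are separate checks, so a string can pass one and fail the other.

```python
import hashlib

# Base58 alphabet used by Bitcoin (no 0, O, I, or l).
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check_ok(address: str) -> bool:
    """Return True if the string decodes to 25 bytes with a valid checksum."""
    n = 0
    for ch in address:
        idx = B58_ALPHABET.find(ch)
        if idx == -1:
            return False  # character outside the Base58 alphabet
        n = n * 58 + idx
    try:
        raw = n.to_bytes(25, "big")  # 1 version byte + 20 hash bytes + 4 checksum bytes
    except OverflowError:
        return False  # decodes to more than 25 bytes
    payload, checksum = raw[:21], raw[21:]
    digest = hashlib.sha256(hashlib.sha256(payload).digest()).digest()
    return digest[:4] == checksum


def looks_like_legacy_address(address: str) -> bool:
    """Hypothetical combined check: length, leading character, then checksum."""
    return (
        26 <= len(address) <= 35
        and address[0] in "13"
        and base58check_ok(address)
    )


# The well-known genesis block address is used as a known-good sample.
print(looks_like_legacy_address("1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa"))
```

Running both test addresses from the review through each function separately would show which rule accepts or rejects the first one.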

Kyeeshaaa commented 7 months ago

There were no issues found with the URL anonymization. The code successfully replaced the URL so that sensitive or personal information was not shown.

oeoloyede commented 7 months ago

I see only one issue with the NRP anonymization. When I ran the code, it correctly replaced the mention of someone's nationality, but the NRP placeholder was also incorrectly used when replacing a passport number.

SyGitGud commented 7 months ago

> I see only one issue with the NRP anonymization. When I ran the code, it correctly replaced the mention of someone's nationality, but the NRP placeholder was also incorrectly used when replacing a passport number.

Perhaps anonymization is being triggered by the mention of the specific country's passport?

Soyboy3911 commented 7 months ago

I found no issues with the IT Passport anonymization. However, I believe that it can be improved by replacing the generic placeholder with a more specific one, such as . This will provide clearer context and make it easier to understand the type of information that is being anonymized.

claesmk commented 7 months ago

@Soyboy3911 Do you think NRP should have been IT_PASSPORT?

The passport number on the plane ticket was incorrect because it was an IT passport. The correct number is <NRP>
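To make the difference concrete, here is a minimal sketch of placeholder replacement, assuming the anonymizer substitutes each detected span with its entity type in angle brackets. The `replace_spans` helper and the passport number `YA1234567` are hypothetical and exist only for illustration; they are not taken from the project code or its test data.

```python
def replace_spans(text, spans):
    """Replace each (start, end, entity_type) span with <ENTITY_TYPE>.

    Spans are applied right to left so earlier offsets stay valid.
    """
    for start, end, entity_type in sorted(spans, reverse=True):
        text = text[:start] + f"<{entity_type}>" + text[end:]
    return text


# Hypothetical input; the passport number below is made up.
sentence = "The correct number is YA1234567"
start = sentence.index("YA1234567")
end = start + len("YA1234567")

# Same span, labeled two different ways by the detection step.
print(replace_spans(sentence, [(start, end, "NRP")]))
print(replace_spans(sentence, [(start, end, "IT_PASSPORT")]))
```

The replacement step is the same in both cases. Only the entity type assigned to the span changes, which suggests the fix belongs in how the span is labeled, not in the replacement itself.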