I have a question about GDPR compliance: we need to delete a user's data from our data lake when the user requests deletion of their account. We currently store user data for analytics in Azure Data Lake with the following configuration:
Type: Data Lake Storage Gen1
Data format in the data lake: Avro
Partitioning: default, time-based
We use a de-identified data lake approach to address data privacy challenges: sensitive information is de-identified and protected before it even enters the data lake, which minimizes the storage and use of personally identifiable information. Concretely, before storing data in the data lake we mask it with a random ID. Is it still required to delete this non-personally identifiable information from the data lake to be compliant with GDPR? If so, is there an efficient way to delete user-specific data from the data lake, given that Azure Data Lake Store is an append-only file system where data, once committed, cannot be erased or updated?
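To make the de-identification step concrete, here is a minimal Python sketch of the kind of masking described above. It is an illustration under assumptions, not the actual pipeline code: the class name `Pseudonymizer`, the field names in `PII_FIELDS`, and the idea that a mapping from real user ID to random ID is kept in a store outside the data lake are all hypothetical.

```python
import uuid

# Hypothetical set of fields treated as directly identifying (assumption).
PII_FIELDS = {"email", "name", "phone"}


class Pseudonymizer:
    """Replaces real user IDs with random IDs before records reach the lake."""

    def __init__(self):
        # real user id -> random id; assumed to live outside the data lake
        self._mapping = {}

    def random_id_for(self, user_id):
        # Reuse the same random ID so a user's events stay linkable for analytics.
        if user_id not in self._mapping:
            self._mapping[user_id] = uuid.uuid4().hex
        return self._mapping[user_id]

    def de_identify(self, record):
        # Drop directly identifying fields and swap in the random ID.
        out = {k: v for k, v in record.items() if k not in PII_FIELDS}
        out["user_id"] = self.random_id_for(record["user_id"])
        return out

    def forget(self, user_id):
        # Removes only the mapping entry; records already written to the lake
        # still carry the random ID and are not touched by this call.
        return self._mapping.pop(user_id, None) is not None


pseudonymizer = Pseudonymizer()
event = {"user_id": "u-123", "email": "jane@example.com", "event": "login"}
masked = pseudonymizer.de_identify(event)
print(masked)
```

Under these assumptions, calling `forget` when an account is deleted removes the link between the real user and the random ID, but the masked records themselves remain in the lake. That remaining data is what the question above is about.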
Please let me know if you need any further information.
Thanks a lot in advance for your help.