tweaselORG / meta

(Currently) only used for the issue tracker.

IDs as personal data (legal research) #40

Closed baltpeter closed 7 months ago

baltpeter commented 11 months ago

To us, it seems obvious from reading Art. 4(1) GDPR that cookie IDs, advertising identifiers, etc. should be classified as personal data under the GDPR:

‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person

Even more so, when read in conjunction with Recital 26 GDPR:

Personal data which have undergone pseudonymisation, which could be attributed to a natural person by the use of additional information should be considered to be information on an identifiable natural person. To determine whether a natural person is identifiable, account should be taken of all the means reasonably likely to be used, such as singling out, either by the controller or by another person to identify the natural person directly or indirectly.

However, that view is not universal. As part of #38, I'm conducting in-depth legal research into the different positions on IDs as personal data to inform our own complaints.

baltpeter commented 11 months ago

This inquiry was sparked by the recent-ish (April 2023) decision by the EGC in Case T‑557/20 that discusses IDs as personal data and comes to a different conclusion than we do.

The EGC does not share our interpretation that unique IDs are automatically personal data. It interprets the ECJ Breyer (C‑582/14) ruling such that the classification of data as personal data is relative to a certain party. That is, for data to be considered personal data to a certain party, said party itself needs to have (reasonable and legal, even if just theoretical) access to additional information that would allow it to assign a certain ID to a "legal identity" of a person.

The GDPRhub page on this ruling mentions that the EGC didn't actually decide whether the transmitted data is personal data but only that the EDPS had insufficiently assessed the case to come to such a conclusion. But that is irrelevant: the EGC makes its position on pseudonymous data clear. Assuming its position and that the base facts as laid out in the ruling are correct, the only possible conclusion is that the data concerned was not personal data, regardless of whether the EGC actually stated that.

Also, even though the ruling concerns Regulation 2018/1725 (the counterpart to the GDPR for EU institutions) and not the GDPR, everything in it also applies to the GDPR since the passages it relies on are identical between both (Art. 3(1) Regulation 2018/1725 is identical to Art. 4(1) GDPR, Recital 16 Regulation 2018/1725 is identical to Recital 26 GDPR).

However, having now read a lot of other decisions and rulings on IDs, I am less worried about this ruling in the context of online tracking. In the case discussed here, the only available information was a UUID assigned to a free-text comment[^freetext]. That is not comparable to tracking, where a lot more information is involved. The tracking IDs are always transmitted in conjunction with the user's IP (which according to ECJ Breyer is to be considered personal data, at least for website operators in Germany) and usually also in conjunction with lots of other data that allows the creation of detailed profiles. The combination of all this information can definitely be considered personal data and there is nothing in this ruling that would contradict such an interpretation. Still, I would of course have preferred the much easier argument of "ID = personal data".

[^freetext]: Just for completeness' sake: The free-text comments themselves could have also been personal data, but again the EDPS only claimed that but didn't actually investigate it further, so no determination is made in the ruling in that regard, either.
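
To make the point about tracking data more concrete, here is a minimal sketch in Python of how requests that share a tracking ID can be grouped into a profile without the tracker ever learning a name. Everything in it is an assumption for illustration: the records are invented, and the field names (`ad_id`, `ip`, `app`, `city`) and the function `build_profiles` are hypothetical and not taken from any real tracker or from the ruling.

```python
from collections import defaultdict

# Hypothetical tracking requests; all values are invented for illustration.
requests = [
    {"ad_id": "a1b2", "ip": "203.0.113.7", "app": "weather", "city": "Berlin"},
    {"ad_id": "a1b2", "ip": "203.0.113.7", "app": "dating", "city": "Berlin"},
    {"ad_id": "c3d4", "ip": "198.51.100.9", "app": "weather", "city": "Hamburg"},
]


def build_profiles(records):
    """Group records by tracking ID, collecting the values seen per field."""
    profiles = defaultdict(lambda: defaultdict(set))
    for record in records:
        for field, value in record.items():
            if field != "ad_id":
                profiles[record["ad_id"]][field].add(value)
    return profiles


profiles = build_profiles(requests)
for ad_id, profile in profiles.items():
    # Each ID now maps to everything observed alongside it.
    print(ad_id, {field: sorted(values) for field, values in profile.items()})
```

The sketch only shows that the ID is what ties otherwise separate observations together. Whether that is enough to make the ID alone personal data, rather than the combination, is exactly the open legal question.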

The EGC's argument follows fairly obviously from Breyer:

[…] it must be noted, first of all, that it is common ground that a dynamic IP address does not constitute information relating to an ‘identified natural person’, since such an address does not directly reveal the identity of the natural person who owns the computer from which a website was accessed, or that of another person who might use that computer. Next, […] it must be ascertained whether such an IP address, registered by such a provider, may be treated as data relating to an ‘identifiable natural person’ where the additional data necessary in order to identify the user of a website that the services provider makes accessible to the public are held by that user’s internet service provider. […] The fact that the additional data necessary to identify the user of a website are held not by the online media services provider, but by that user’s internet service provider does not appear to be such as to exclude that dynamic IP addresses registered by the online media services provider constitute personal data within the meaning of Article 2(a) of Directive 95/46. However, it must be determined whether the possibility to combine a dynamic IP address with the additional data held by the internet service provider constitutes a means likely reasonably to be used to identify the data subject. Thus, as the Advocate General stated essentially in point 68 of his Opinion, that would not be the case if the identification of the data subject was prohibited by law or practically impossible on account of the fact that it requires a disproportionate effort in terms of time, cost and man-power, so that the risk of identification appears in reality to be insignificant. 
Although the referring court states in its order for reference that German law does not allow the internet service provider to transmit directly to the online media services provider the additional data necessary for the identification of the data subject, it seems however, […] in the event of cyber attacks legal channels exist so that the online media services provider is able to contact the competent authority, so that the latter can take the steps necessary to obtain that information from the internet service provider and to bring criminal proceedings. Thus, it appears that the online media services provider has the means which may likely reasonably be used in order to identify the data subject, with the assistance of other persons, namely the competent authority and the internet service provider, on the basis of the IP addresses stored.

It all hinges on what "identifiable" means, and here the ECJ clearly held a different position than we do, which we unfortunately have to accept. However, all hope is not lost because Breyer was based on Directive 95/46 and there are some important differences between that and the GDPR. Compare Art. 2(a) of the directive:

“Personal data” shall mean any information relating to an identified or identifiable natural person (“data subject”); an identifiable person is one who can be identified, directly or indirectly, in particular by reference to an identification number or to one or more factors specific to his physical, physiological, mental, economic, cultural or social identity

with Art. 4(1) GDPR:

‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person;

Even though they are largely identical, it is clear that the legislature deliberately wanted to widen the scope. (Side note: Curiously, there is another notable change in the German translation, which changes from using "bestimmte/bestimmbare" in the directive to "identifizierte/identifizierbare" in the GDPR—not sure whether that has any significance.)

Much more importantly, compare Recital 26 of the directive:

Whereas the principles of protection must apply to any information concerning an identified or identifiable person; whereas, to determine whether a person is identifiable, account should be taken of all the means likely reasonably to be used either by the controller or by any other person to identify the said person; whereas the principles of protection shall not apply to data rendered anonymous in such a way that the data subject is no longer identifiable; whereas codes of conduct within the meaning of Article 27 may be a useful instrument for providing guidance as to the ways in which data may be rendered anonymous and retained in a form in which identification of the data subject is no longer possible.

and Recital 26 GDPR (highlight mine):

The principles of data protection should apply to any information concerning an identified or identifiable natural person. Personal data which have undergone pseudonymisation, which could be attributed to a natural person by the use of additional information should be considered to be information on an identifiable natural person. To determine whether a natural person is identifiable, account should be taken of all the means reasonably likely to be used, such as singling out, either by the controller or by another person to identify the natural person directly or indirectly. To ascertain whether means are reasonably likely to be used to identify the natural person, account should be taken of all objective factors, such as the costs of and the amount of time required for identification, taking into consideration the available technology at the time of the processing and technological developments. The principles of data protection should therefore not apply to anonymous information, namely information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable. This Regulation does not therefore concern the processing of such anonymous information, including for statistical or research purposes.

Here, the difference is much more pronounced and it is even more obvious that the legislature wanted to widen the scope significantly. This is pointed out, very helpfully, in a decision by the Swedish IMY (quote (badly) machine translated by Google):

According to IMY, an interpretation of the concept of personal data which means that it must always be demonstrated that there is a legal possibility to link such data to a natural person would, according to IMY, mean a significant limitation of the protection area of the regulation, and open up opportunities to circumvent the protection in the regulation. This interpretation would, among other things, be contrary to the purpose of the regulation according to Article 1.2 of the data protection regulation. The Breyer judgment is decided under the previously applicable directive 95/46 and the concept of "singling out" according to recital 26 of the current regulation (that knowledge of the actual visitor's name or physical address is not required, since the distinction itself is sufficient to make the visitor identifiable), was not specified in previously applicable directives as a method for identifying personal data.

The EGC ruling would definitely make sense to me under the old directive. But under the GDPR, I find it really hard to apply the same argument. As such, I would be reasonably hopeful that the ECJ would come to a different conclusion on the same case, especially as there are already other ECJ cases where they argue that "personal data" has to be interpreted broadly.

For the purposes of our complaints, though, I'm afraid we'll have to argue in anticipation of a narrower definition. While some DPAs share our opinion that IDs alone are already personal data, others say that the IDs only become personal data in combination with the other data that is processed (cf. https://github.com/tweaselORG/meta/issues/38#issuecomment-1739101286).

Also, the EDPS already appealed the EGC ruling back in July: https://curia.europa.eu/juris/document/document.jsf?text=&docid=276483&pageIndex=0&doclang=EN&mode=req&dir=&occ=first&part=1 That's good! We'll get a final decision on the matter by the ECJ.

baltpeter commented 11 months ago

I'm in the process of doing a comprehensive review of the legal literature on the subject.

Here are the sources I've gone through (see below for a legend of the labels):

DPD

GDPR


Legend:

baltpeter commented 7 months ago

The Datenschutzkonferenz writes (https://www.datenschutzkonferenz-online.de/media/oh/20221205_oh_Telemedien_2021_Version_1_1_Vorlage_104_DSK_final.pdf, para. 140):

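Beinhalten Cookies oder andere auf den Endeinrichtungen der Endnutzenden abgelegte Informationen eine ID, handelt es sich hierbei um personenbezogene Daten.

(In English: "If cookies or other information stored on end users' terminal equipment contain an ID, this constitutes personal data.")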

baltpeter commented 7 months ago

I published a comprehensive article on my findings a while ago, so this can be closed.