w3c / w3c-api

The W3C API
https://w3c.github.io/w3c-api/

Looking up users by W3C ID #104

Open marcoscaceres opened 3 years ago

marcoscaceres commented 3 years ago

For the /users/ path, the W3C API relies on a "hash" to represent a user, but it's unclear how the hash is derived. In specs, we usually provide the "w3c_id". Question: is the "hash" a hash of the W3C ID? (I tried MD5 and SHA-1.)

If not, would it be possible to add support for looking up users by their W3C ID? This would be great in ReSpec, as we could just provide a list of W3C IDs, and the W3C API could provide all the relevant details (e.g., affiliation).

Alternatively, if I knew how to derive the hash, I could do that in ReSpec to get editors' names and other details.

dontcallmedom commented 3 years ago

AFAIR, these aren't hashes but unguessable strings, meant to prevent harvesting of data on users who are not expected to be exposed - IIRC, the rule is we only expose data about participants in groups.

The path that is used in the IPR checker, and that should be usable in ReSpec as well, is to walk through the list of participants via https://api.w3.org/groups/wgid/participations?embed=1 - each individual participant then has its ID exposed in its unguessable URL endpoint, à la https://api.w3.org/users/unguessablestring.
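A minimal sketch of the walk described above, not the IPR checker's actual code. It assumes the participations response is HAL-style JSON, with entries under `_embedded.participations` and each individual participant's endpoint under `_links.user.href`; that shape, and the helper names `participations_url`, `user_urls` and `fetch_json`, are assumptions for illustration. API keys and paging are not handled.

```python
import json
from typing import Dict, List
from urllib.request import urlopen

API_ROOT = "https://api.w3.org"


def participations_url(group_id: str) -> str:
    """Build the participations URL quoted in the comment above."""
    return f"{API_ROOT}/groups/{group_id}/participations?embed=1"


def user_urls(page: Dict) -> List[str]:
    """Collect user endpoint URLs from one page of participations.

    Assumed shape (not confirmed by this thread): entries live under
    page["_embedded"]["participations"], and an entry for an individual
    carries ["_links"]["user"]["href"]. Entries without a user link
    (e.g. organization-level participations) are skipped.
    """
    urls = []
    for participation in page.get("_embedded", {}).get("participations", []):
        href = participation.get("_links", {}).get("user", {}).get("href")
        if href:
            urls.append(href)
    return urls


def fetch_json(url: str) -> Dict:
    """Fetch and decode one JSON document (no API key or paging handled)."""
    with urlopen(url) as response:
        return json.load(response)
```

Under those assumptions, `user_urls(fetch_json(participations_url("wgid")))` would yield the unguessable `/users/...` URLs for one page of a group's participants, with `wgid` standing in for a real group ID as in the comment above.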

(it may also be that the privacy design of the API should be revisited)

vivienlacourba commented 3 years ago

> AFAIR, these aren't hashes but unguessable strings to prevent from harvesting data on users that are not expected to be exposed - IIRC, the rule is we only expose data about participants in groups.

That is correct. (For some history see https://github.com/w3c/w3c-api/issues/46#issuecomment-145887368 and https://github.com/w3c/w3c-api/issues/55)

> (it may also be that the privacy design of the API should be revisited)

I'm not opposed to revisiting this.

marcoscaceres commented 3 years ago

OK, but if we can basically go through a group (plus a W3C API key) to get to the user details, then it seems like "privacy by inconvenience" rather than a robust solution.

What might be better is restricting what one could get from /users/w3cid to some acceptable subset of already public information.

For example, this is all public https://www.w3.org/groups/wg/eowg/participants ... the W3C API could return the same data: