fo-am / waspsurvey

Geographic localization of users #5

Closed alecini closed 3 years ago

alecini commented 3 years ago

We need to know where our users are located. We need broad areas (Europe, etc., since which questions we show them depends on this) but also geolocation at a fine scale. We were thinking of asking for TOWN, COUNTRY, POSTCODE, but we would also like to evaluate the possibility of implementing a sort of map, on which users could click to geolocate themselves, or click an "I'm here" button. We do not need GPS precision down to metres; actually, for privacy reasons it would be nice to show the location on the map with a certain buffer (say 10 km). Do you think it is feasible?
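To make the "10 km buffer" idea concrete, here is a minimal sketch of one way it could work: snap the clicked point to the centre of a roughly 10 km grid cell and store only that. This is an assumption about a possible approach, not something waspsurvey does; the function name `coarsen_location`, the `cell_km` parameter and the 111.32 km-per-degree constant are all illustrative.

```python
import math

# Approximate length of one degree of latitude in km (assumed constant).
KM_PER_DEGREE_LAT = 111.32


def coarsen_location(lat, lon, cell_km=10.0):
    """Snap a point to the centre of a grid cell about cell_km wide.

    Hypothetical helper: only the returned cell centre would be stored,
    never the exact point the user clicked.
    """
    lat_step = cell_km / KM_PER_DEGREE_LAT
    cell_lat = (math.floor(lat / lat_step) + 0.5) * lat_step

    # A degree of longitude shrinks with cos(latitude), so widen the
    # longitude step to keep cells roughly cell_km across.
    cos_lat = max(math.cos(math.radians(cell_lat)), 1e-6)
    lon_step = cell_km / (KM_PER_DEGREE_LAT * cos_lat)
    cell_lon = (math.floor(lon / lon_step) + 0.5) * lon_step

    return round(cell_lat, 4), round(cell_lon, 4)


# Two nearby clicks end up as the same stored location.
print(coarsen_location(50.85, 4.35))
print(coarsen_location(50.86, 4.36))
```

The coarsening would need to happen before anything is saved, otherwise the precise point still ends up in the database.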

nebogeo commented 3 years ago

Geolocation, even addresses, will be a privacy and GDPR issue, so you'd need to go through ethics approval for the experiment with your research institute to see if you'd be allowed to collect it. We'll also need to encrypt the database in this case.

alecini commented 3 years ago

Yes, but we do not want precise addresses or geolocation. I'm thinking more of a 10-15 km radius around a place they select (the radius can be even bigger). Similarly, we would like to know the town they are from. Is this kind of general information covered by privacy rules as well? If so, does the same problem apply to the Europe, Asia, etc. answer?

I am also wondering, could we have anonymity in the survey? I.e. what kind of personal info are we currently keeping? The IP address?

nebogeo commented 3 years ago

There are two issues here that I can see. One is having a map, which is a new feature we haven't discussed previously (similar to the conditional question logic issue, we could potentially do one, but it's a big chunk of work, so it depends on the scale of the other changes required and your priorities).

The other is the privacy issues. GDPR is quite wide-ranging and non-specific: anything that could be used to identify the user in any way is covered, so the more you gather, the more likely you are to stray into problems. I would say recording farm details and a 10 km lat/long would definitely be risky here. I would imagine country + farm details might be OK, but we'd need to check. Storing IP addresses is in this same category, and another aspect is the "right to be forgotten": if it isn't entirely anonymous, we'll need to have an email address for people to ask to have their data deleted (which must happen both in the database and in your analysis).

Even for entirely anonymous surveys and experiments we've done, we've needed to add a permissions section at the start, and to publish the results you would need to have had an ethics review done before rolling out the survey to make sure all this is correct.

alecini commented 3 years ago

OK, I see. I did not think that GDPR applied only when you could link the IP or other personal info together... As we do not want to keep their IP (I'm not saying we should do that, I'm just asking whether we are currently doing it automatically, and I'd say we should avoid it!) nor to have a precise address, I thought we were basically asking general questions to an unknown and untraceable user!

However, I'm not an expert in GDPR and I'll ask. Definitely, however, the continent scale is far too coarse, we must go at least to the country level! (We did it in the past without any GDPR issue... maybe we were rather naive!)

nebogeo commented 3 years ago

I would have imagined country + the other demographics we are talking about must be fine. I am no expert either, though!

alecini commented 3 years ago

Just a thought... can we make it completely anonymous? I mean, whoever wants to just goes to the website and fills in the form? Does this prevent privacy issues? Indeed, if we are not storing the IP addresses, and we just ask age, sex and country... how would it be possible to trace the answers (and thus personal data) back to the users? Am I wrong? If so, we have to discuss whether we want this, as I guess we will not be able to be sure people answer just once...

nebogeo commented 3 years ago

Privacy issues are caused by the information you are asking for, and how it can be combined to find out who an individual is. The more questions you ask, the less anonymous it is, and the higher the risk for the individual. In the original example you wanted to store "rough" location in terms of 10 km squares. With no other information there is no real privacy issue, but as soon as you start asking about types of farms etc., it would be quite a simple matter to find a farm of that type within a 10 km square area.

The issue is that data loss from the website would be your liability, so either we add a lot of encryption to make sure this can't happen, as well as a "right to be forgotten" feature which you would need to honor in the analysis too, or we keep it "anonymous enough", with not too many questions, so that it carries no risk and is not subject to GDPR.
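As a rough illustration of how combining answers narrows things down, the sketch below counts how many respondents share each combination of answers and reports the combinations shared by fewer than `k` people. Everything here is an assumption for the sake of the example: the field names (`country`, `farm_type`, `grid_cell`), the helper `small_groups`, the threshold `k` and the responses themselves are made up, and passing such a check would not by itself settle the GDPR question.

```python
from collections import Counter


def small_groups(responses, fields, k=5):
    """Return answer combinations shared by fewer than k respondents.

    Hypothetical helper: a combination held by very few people is one
    where an individual could plausibly be singled out.
    """
    counts = Counter(tuple(r[f] for f in fields) for r in responses)
    return {combo: n for combo, n in counts.items() if n < k}


# Made-up responses, for illustration only.
responses = [
    {"country": "IT", "farm_type": "vineyard", "grid_cell": "A1"},
    {"country": "IT", "farm_type": "vineyard", "grid_cell": "B7"},
    {"country": "IT", "farm_type": "orchard", "grid_cell": "A1"},
    {"country": "UK", "farm_type": "orchard", "grid_cell": "C3"},
    {"country": "UK", "farm_type": "orchard", "grid_cell": "C3"},
    {"country": "UK", "farm_type": "vineyard", "grid_cell": "D4"},
]

# Country alone: every group has at least 2 people.
country_only = small_groups(responses, ["country"], k=2)

# Country + farm type + 10 km cell: most people become unique.
all_fields = small_groups(responses, ["country", "farm_type", "grid_cell"], k=2)

print(country_only)
print(all_fields)
```

With these made-up rows, country on its own leaves nobody in a group smaller than two, while adding farm type and grid cell leaves four of the six respondents alone in their group.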

I agree in that I'd favor the latter, for sure.

We are not storing IP addresses in any case.

alecini commented 3 years ago

I would also prefer the second option. I think that if we are not keeping IPs and are only asking for country + farm, there is no big risk of identifying people...