Closed giacecco closed 9 years ago
Google form
.. and for the bulk uploads?
Let's skip the bulk uploads for now (as discussed in the meeting, we really want these only when we have automatic ingest).
A Google form works, and would let us publish the associated spreadsheet, but are we able to style it or embed it in our pages?
Fine for postponing bulk upload.
I don't think we can publish the spreadsheet back in real time after users' submissions. Without some degree of moderation - which is not a planned feature for Alpha - people could write anything, from profanities to fictional addresses, etc.
I will share this matter with @peterkwells when we catch-up on the phone this morning.
We have a solution here. This is a small app that allows form submissions to be stored in a github-hosted CSV. This is totally what we should do.
The default setup for that app requires users to have a github account. Are we OK with that at this stage, or should we hardcode the account internally to one of our own so it can be anonymous?
To keep the data private, you'd need a Bronze organization account, at $25 a month.
It looks like actually you can easily style Google forms. I think this might be simpler.
OK, I've done a test form submission at https://openaddressesuk.github.io/submit/. Give it a go, see what you think. I've shared the results sheet with @giacecco and @peterkwells. If this is OK, we can create a proper form in the openaddresses organisation and move over to that.
Obviously it needs copy around it to explain what's going on.
Does/can this protect against automated submissions? We don't want anyone to write a script to submit addresses from another source as that could pollute the whole thing.
We have a honeypot field to protect against standard spambots, but it won't protect against an attack specifically designed to pollute the data. Unfortunately Google Forms doesn't support captchas. You'd think it would.
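For anyone unfamiliar with the honeypot idea mentioned above, here is a minimal sketch in Python. It is not the code behind the test form. The field name `website` is a hypothetical choice for the hidden input, which real users never see and so leave empty.

```python
def is_probable_bot(form: dict[str, str], honeypot_field: str = "website") -> bool:
    """Return True when the hidden honeypot field was filled in.

    `honeypot_field` is a hypothetical name for an input hidden with CSS.
    A human never sees it, so any non-blank value suggests a naive spambot
    that fills in every field it finds.
    """
    return bool(form.get(honeypot_field, "").strip())


# A genuine submission leaves the hidden field empty.
print(is_probable_bot({"address": "1 High Street", "website": ""}))
# A naive bot fills in everything.
print(is_probable_bot({"address": "spam", "website": "http://example.com"}))
```

As noted in the comment, this only catches generic bots. A script written specifically against the form can simply leave the field blank.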
Could we not design the form in such a way that only a browser could submit the info? Or is there a way to log the browser (in a hidden field), so we can easily filter automated submissions?
We could ask people not to do it and explain why. Or see if it's actually a problem and only try to do something about it if/when we find it is.
I'd be kind of inclined to do that one... bit laissez faire I know, but it depends on whether we expect malicious submissions.
I would suggest we start too strong, if anything. The tone/impression that we set when we launch can/will stick. A single fake address that matches one of the ones planted in Addressbase will cause us problems, let alone a mass input.
Given that, even though I'd love to do this, can we pause this one until we've had time to check the legal advice on the publishing-platform approach to publishing addresses that we receive, write some guidelines on what we do and don't want, and provide a process for removing infringing material?
There are many examples out there on the Internet to build on, and as a well-behaved company we should look at those. As a simple example, check out YouTube and their T&Cs, community guidelines and notes on copyright:
OK, I'll leave it as-is for now, until you let me know how you want to proceed.
Jeni and I have agreed to:
I'll write up some more detailed reqs and link them to this issue.
@peterkwells In that case, I'll create a proper form in the OA google drive and change over to using that one. Can you detail exactly what fields you want to have, and what the options should be if there are any?
As this isn't going live, I'm going to say that this ticket is done for this sprint. We can open another for the followup work next time.
To be clear, we are not going live until this is done. We are delaying going live to make sure it can be done prior to going live.
:+1:
This decision could still come through this week, so I'm moving this back to ready.
Had a quick catch-up with @peterkwells on this. I believe the thread thus far captures the current status of this feature, in particular Peter's comment at https://github.com/theodi/shared/issues/413#issuecomment-60054844 .
One thing to clarify is that the decision on the wording and 'shape' of the form through which we collect the data needs some thinking and possibly validation with Legal. I've suggested that @Floppy not develop the form further until we have revised and detailed specs from @peterkwells on what that form looks like and how it works.
Requirements for review by @giacecco @JeniT @Floppy https://docs.google.com/a/openaddress.es/document/d/1k1HBJ_dCMpfhKPGGPMq4yrq2MbjaW9hwbG4NFGIiq_M/edit
Looks good. The only issue here is a simple Google docs form won't be able to catch the user agent and IP. We could implement a simple server side solution in Heroku though maybe?
If it makes it easier, I have discussed this with Jeni and we agreed to take out the UA. The doc is updated.
OK, getting the IP is still going to be difficult. We can capture it via JavaScript, but again, this would be easy to fake, and if a user has JS turned off, it won't log. Will this be an issue?
In some ways having something that can only be set through Javascript is helpful as it will enable us to identify scripted submissions.
Or will it... I guess it could be easily faked.
Hmmmmm... Actually, looking at it, I don't think it's possible without using some kind of server side solution. I think the best option would be a simple proxy (probably a Sinatra app) that takes the form contents and submits it to the Google form with the IP included. Probably harder to fake too.
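To make the proxy idea concrete, here is a rough sketch of the forwarding step. It is written in Python rather than the Sinatra app suggested above, and it is not the code of any actual proxy. `FORM_URL` and the `entry.*` field IDs are hypothetical placeholders. It also assumes the hosting router appends the connecting client's IP as the last entry of `X-Forwarded-For`, which should be checked against the host's documentation.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical placeholders: substitute the real form URL and field IDs.
FORM_URL = "https://docs.google.com/forms/d/e/FORM_ID/formResponse"
FIELD_MAP = {"address": "entry.1000001", "ip": "entry.1000002"}


def client_ip(headers: dict[str, str], remote_addr: str) -> str:
    """Pick the submitter's IP on the server side.

    Assumption: the router in front of the app appends the connecting IP
    to X-Forwarded-For, so the last entry is the one we trust. Earlier
    entries can be set by the client and are ignored.
    """
    hops = [h.strip() for h in headers.get("X-Forwarded-For", "").split(",")]
    hops = [h for h in hops if h]
    return hops[-1] if hops else remote_addr


def build_forward_request(
    form: dict[str, str], headers: dict[str, str], remote_addr: str
) -> Request:
    """Build (but do not send) the POST that relays a submission to the form."""
    payload = {
        FIELD_MAP["address"]: form.get("address", ""),
        FIELD_MAP["ip"]: client_ip(headers, remote_addr),
    }
    return Request(FORM_URL, data=urlencode(payload).encode("utf-8"), method="POST")
```

Because the IP is filled in by the server rather than by a hidden field or JavaScript, a submitter cannot set it just by editing the form, which is the sense in which it is harder to fake. Sending the request (for example with `urllib.request.urlopen`) is left out so the sketch can be read without touching the network.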
I've built a proxy here https://github.com/OpenAddressesUK/adress-capture. We can then host it on Heroku and add the relevant env variables to the app and change the form on the frontend. Do we have a Gdocs form set up yet? Or shall I do it?
Right, there's stuff here https://github.com/OpenAddressesUK/openaddressesuk.github.io/pull/12
The Board suggested that we collect addresses from day 1, interactively and in bulk.
Even if we are not ready to process them, the most natural thing our website should do is ask people to submit addresses. They imagine a big heading at the top of the landing page saying something like "I don't care who you are, but tell me where you are:" [some input boxes], so that in one shot we make clear that we're not poking our nose into the visitor's identity but we do need valid addresses. They also suggested we should allow people to submit entire datasets.
What is the easiest way to make this happen without changing the current gh-pages hosted solution and keeping the target date for the "static" website?
There is no need to automatically process the addresses being submitted, nor to publish them back, not even to say how many we got. At this stage, we can do this manually if we think it is suitable.