NYUCCL / psiTurk

An open platform for science on Amazon Mechanical Turk.
https://psiturk.org
MIT License

Host psiTurk experiments on EC2 (Amazon's cloud) #66

Closed jbmartin closed 10 years ago

jbmartin commented 10 years ago

A killer feature for psiTurk would be the ability to host experiments as an instance on Amazon's EC2 cloud servers. This feature would provide a stable server with a dedicated IP, which would be super useful for laptop users.

It should be possible to wrap these instructions into a new command for the CLI. Basically, the new command would log into the user's EC2 account, install flask and psiTurk, push the user's experiment code, and then, if successful, return/save the experiment's new URL.
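A rough sketch of what such a command might run under the hood, written as a dry run that only builds and prints the commands. Everything here is an assumption for illustration: the function names, the `ubuntu` login user, the remote directory name, and the host and key path are hypothetical, and this is not an existing psiTurk command. The port is the 22362 mentioned later in this thread.

```python
import shlex

PSITURK_PORT = 22362  # port mentioned later in this thread


def build_deploy_commands(host, key_path, project_dir, user="ubuntu"):
    """Return the install and upload commands, in order (hypothetical helper)."""
    target = f"{user}@{host}"
    return [
        # Install flask and psiTurk on the remote instance.
        ["ssh", "-i", key_path, target, "pip install --user flask psiturk"],
        # Push the experiment code to a directory on the instance.
        ["scp", "-i", key_path, "-r", project_dir, f"{target}:psiturk-experiment"],
    ]


def experiment_url(host, port=PSITURK_PORT):
    """URL to return/save once the deploy succeeds (hypothetical helper)."""
    return f"http://{host}:{port}/"


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    host = "ec2-host.example.com"  # placeholder host
    for cmd in build_deploy_commands(host, "ec2-key.pem", "./my_experiment"):
        print(shlex.join(cmd))
    print(experiment_url(host))
```

Starting the psiTurk server on the instance and opening the port in the EC2 security group are left out, since those steps depend on details not covered here.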

gureckis commented 10 years ago

yeah i think that would be great and is a natural extension of the current direction we are taking...

gureckis commented 10 years ago

A simple first step might be to just provide instructions on how to obtain an EC2 instance to run psiturk from a command line (assuming you don't have a friendly python/unix type environment, e.g. you are on windows, or you don't have an internet-accessible IP address). This was an issue for some attendees at the psiturk workshop at CogSci last year (windows support).

jbmartin commented 10 years ago

I'm currently playing with flask on an EC2 instance. Most of it can be automated, but the problem I foresee is that the setup is non-trivial: users have to generate keys on the Amazon website and then save them on their local machine. Maybe we can ask users to cut and paste their keys? To get a sense of what I'm talking about, here's the Mac instructions I followed, except that I replaced the CLI installation with brew install ec2-api-tools, which made the setup easier.

Windows is supported.

jodeleeuw commented 10 years ago

The problem that we have run into trying to do this is getting an SSL certificate on an ec2 instance when the ip address is elastic. If you have a domain name that you can link to your ec2 instance then it works, but I haven't figured out how to have a totally free setup with an SSL certificate.

gureckis commented 10 years ago

@jodeleeuw the psiturk ad server feature (https://github.com/NYUCCL/psiTurk/issues/58) should obviate this issue. it is operational in the /dev tree currently. basically, we wrote a server that will seamlessly host your ads for you on psiturk.org using our signed SSL cert. it directs people back to your particular psiturk process once you break free from the <iframe>. we are also working on some cool features associated with the ad server that will be useful (like getting more info about what psych experiments a worker has already completed).

jbmartin commented 10 years ago

@jodeleeuw, are you trying to use EC2 to host MTurk ads in iFrames?

jodeleeuw commented 10 years ago

Right, that makes sense. Yes, we were using the ec2 instance to host the ad and the experiment. Having the adserver would solve the issue.

jbmartin commented 10 years ago

psiTurk 2.0 now has direct support for OpenShift, an alternative to Amazon's EC2. Should this issue stay open?

gureckis commented 10 years ago

I had a random idea the other day. Instead of all this complex hosting stuff, could you spin up an EC2 node, then create an SSH tunnel that forwards web traffic from the EC2 node to your local computer? Since you initiate the SSH connection from your own machine, it will generally work behind both firewalls and wifi routers. This also means you don't have to configure your EC2 node with anything complex software-wise (or deal with uploading new project files all the time to the remote server).
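One way this could look, as a minimal sketch: a helper that builds an `ssh` reverse-tunnel command using the standard `-N` (no remote command) and `-R` (remote port forward) options. The helper name, the `ec2-user` login, the key path, the host, and remote port 8080 are assumptions for illustration; 22362 is the local psiTurk port mentioned later in this thread.

```python
import shlex


def reverse_tunnel_command(ec2_host, key_path, user="ec2-user",
                           remote_port=8080, local_port=22362):
    """Build an ssh command that forwards remote_port on the EC2 node
    to local_port on this machine (hypothetical helper)."""
    return [
        "ssh", "-i", key_path,
        "-N",  # no remote command, just hold the tunnel open
        "-R", f"{remote_port}:localhost:{local_port}",
        f"{user}@{ec2_host}",
    ]


if __name__ == "__main__":
    cmd = reverse_tunnel_command("ec2-host.example.com", "ec2-key.pem")
    print(shlex.join(cmd))
```

By default sshd binds remote forwards to the loopback interface only, so for outside visitors to reach the forwarded port the EC2 node's sshd_config would need GatewayPorts enabled, and the port would have to be open in the instance's security group.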

jbmartin commented 10 years ago

Genius. Here's the HOWTO. Easy peasy.

gureckis commented 10 years ago

wow... some of the comments on that make it even easier (http://progrium.com/localtunnel/)

jbmartin commented 10 years ago

oh man that's nice. the only problem for automatic setup is the ruby dependency and gem install. there's a python version, but it's being merged with ngrok.

gureckis commented 10 years ago

ngrok has instructions on running your own server (https://github.com/inconshreveable/ngrok/blob/master/docs/SELFHOSTING.md). i suppose we could make a request with our hosting service to enable *.psiturk.org (and get the wildcard SSL cert) and then we could assign users their own random domain for tunneling traffic to their local computer. i think with this we might even be able to offer everyone https:// secured connections for their experiment. it sounds like a bit of a technical challenge to implement our own version of this, but it would solve this hosting issue once and for all.
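A small sketch of the "random domain per user" idea, assuming a wildcard record for *.psiturk.org already points at the tunnel server. The function name, the label length, and the `taken` set are hypothetical; this is not how ngrok itself assigns subdomains.

```python
import secrets
import string

# Lowercase letters and digits are always valid inside a DNS label.
ALPHABET = string.ascii_lowercase + string.digits


def random_subdomain(base="psiturk.org", length=8, taken=()):
    """Return an unused random hostname under base (hypothetical helper)."""
    while True:
        label = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if label not in taken:
            return f"{label}.{base}"


if __name__ == "__main__":
    print(random_subdomain())
```

`secrets` is used instead of `random` so that tunnel names are hard to guess, which matters if the hostname is the only thing standing between a stranger and a user's local server.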

jbmartin commented 10 years ago

i'm currently running my experiment behind a wifi router using ngrok's server. once it's installed, it's super simple: "server on" in psiTurk followed by "ngrok 22362" on the command line. it offers an http and an https url and an informative http requests log. wow.

as a side note, ngrok is the only free service i can get to work. localproxy won't compile and localtunnel seems to be missing from the gem server.

jbmartin commented 10 years ago

honestly, this is revolutionary.

jbmartin commented 10 years ago

We'd have to set up the ngrokd server on a separate host. The message below is from the DreamHost wiki.

The Secure Hosting service that we provide for our customers does NOT support "wildcard" (*.mydomain.com) type SSL certificates. That means that each domain or sub-domain that you want to set up secure hosting on will require its own unique IP address (IPv4) and SSL certificate. Please don't contact support asking "when will you provide support for wildcard SSL certificates?", it's probably never going to happen. Sorry.

gureckis commented 10 years ago

interesting... well i suppose the ngrokd server could run on a slightly different domain name actually (sciturk.org? psiturk-tunnel.org?). we could eventually move psiturk.org completely someplace new if dreamhost doesn't provide the features we want but might be better to let current things play out since already paid for SSL certs and stuff.

jbmartin commented 10 years ago

All of those sound good. We could also use one of these: psiturkify, turkary, tunnelmate, turkuit, turksmith, tunnelbase, dynatunnel, turkitect, turkius, turkizer, turkmill, turkforge, linktunnel, or tunnellink.

jbmartin commented 10 years ago

You buy; I'll fly...

jbmartin commented 10 years ago

Namecheap looks reasonably priced: https://www.namecheap.com/security/ssl-certificates/wildcard.aspx

jbmartin commented 10 years ago

I think we can get away with pointing the wildcard dns to our current psiTurk server.

gureckis commented 10 years ago

You'll have to explain that to me. namecheap offers weird new TLDs. psiturk.link? psiturk.computer? and of course psiturk.net (could be the network linking domain)? i guess i kind of like psiturk.link

jbmartin commented 10 years ago

RapidSSL and Comodo offer wildcard certs for 2x - 3x as much, but use standard TLDs. I'm fairly confident that we can modify the A record for any of these services to point to psiturk.org's IP where we'll have an ngrok server listening for incoming subdomains.

I should point out that we only have to purchase a wildcard SSL cert if we want people's experiments to use https. The ad server already provides the critical ssl interface to MTurk, so if you create a subdomain for me, e.g., link.psiturk.org, I'll set up a standard, yet sufficient (http) ngrok server on it...

gureckis commented 10 years ago

ok, would link.psiturk.org or tunnel.psiturk.org or something require a unique ip? (e.g., distinct from www.psiturk.org?)

jbmartin commented 10 years ago

that's a good point, and blah.link.psiturk.org might be awkward. do you have another DreamHost domain lying around that we could do some testing on?

gureckis commented 10 years ago

i don’t think it would be awkward. i’m just going to set it up like you requested and need to know if it needs a unique ip. since ipv4 addresses are running out it costs money, so better to do that only if we really need it. what i mean is: will ngrok work if link.psiturk.org is itself actually like a virtual host which shares an ip address with a bunch of other domains, or does it need its own unique ip?

(Email reply to Jay B. Martin's comment of May 14, 2014, 2:06 PM.)

jbmartin commented 10 years ago

Well, according to the ngrok docs,

You need to use the DNS management tools given to you by your provider to create an A record which points *.example.com to the IP address of the server where you will run ngrokd.

which I think will be difficult because we need www, ad, and api to skip the ngrok server.
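One possible way around this, assuming the DNS provider lets explicit records coexist with a wildcard: in standard DNS, a wildcard record only answers for names that have no record of their own, so explicit A records for www, ad, and api would bypass the ngrok server. The sketch below is a simplified model of that lookup order, not a real resolver; the IP addresses are documentation placeholders and the zone contents are hypothetical.

```python
# Hypothetical zone: explicit hosts on the web server, everything else on ngrokd.
ZONE = {
    "www.psiturk.org": "203.0.113.10",
    "ad.psiturk.org": "203.0.113.10",
    "api.psiturk.org": "203.0.113.10",
    "*.psiturk.org": "203.0.113.20",  # server running ngrokd
}


def resolve(name, zone=ZONE):
    """Simplified lookup: an exact record wins, otherwise fall back to a wildcard."""
    name = name.lower().rstrip(".")
    if name in zone:
        return zone[name]
    labels = name.split(".")
    # Try wildcards at successively higher parent domains.
    for i in range(1, len(labels)):
        wildcard = "*." + ".".join(labels[i:])
        if wildcard in zone:
            return zone[wildcard]
    return None


if __name__ == "__main__":
    for host in ("www.psiturk.org", "ad.psiturk.org", "abc123.psiturk.org"):
        print(host, "->", resolve(host))
```

Whether DreamHost's DNS panel permits this combination is not something this thread establishes, so it would need checking.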

gureckis commented 10 years ago

i’ll just get a static ip

(Email reply to Jay B. Martin's comment of May 14, 2014, 2:18 PM.)

gureckis commented 10 years ago

moved to issue #99