neurodata / ndstore

code for storing neurodata images and image annotations
http://neurodata.io
Apache License 2.0

"private" tilecache? #209

Open jovo opened 8 years ago

jovo commented 8 years ago

does it make sense for us to set up a tilecache locally, either on dsp or braincloud or something else, so that when other people are using our services, it doesn't interfere with our own science?

@kunallillaney @alexbaden @randalburns

alexbaden commented 8 years ago

Throttling API requests by IP would go much further than a private tilecache in keeping things online. In my experience, visualization doesn't bring us down; massive numbers of GET requests do.
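
For reference, a throttle like this could be a small piece of Django middleware keyed on the client IP. The sketch below is only illustrative; the class name, limits, and cache keys are assumptions, not anything in ndstore today.

```python
# Minimal sketch of per-IP throttling as Django middleware.
# All names and limits here are hypothetical, not part of ndstore.
from django.core.cache import cache
from django.http import HttpResponse

RATE_LIMIT = 600        # assumed max GET requests per window
WINDOW_SECONDS = 60     # assumed window length in seconds


class ThrottleByIPMiddleware(object):
    """Reject GET requests once an IP exceeds RATE_LIMIT per window."""

    def process_request(self, request):
        if request.method != 'GET':
            return None
        ip = request.META.get('REMOTE_ADDR', 'unknown')
        key = 'throttle:{}'.format(ip)
        # cache.add only creates the counter if it does not already exist,
        # so the window starts at the first request seen from this IP
        cache.add(key, 0, WINDOW_SECONDS)
        try:
            count = cache.incr(key)
        except ValueError:
            # the counter expired between add and incr; start a new window
            cache.add(key, 1, WINDOW_SECONDS)
            count = 1
        if count > RATE_LIMIT:
            return HttpResponse('Rate limit exceeded', status=429)
        return None
```

Dropping something like this into the middleware settings would cover every request path, so the cutout and tile services would get the same protection.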


kunallillaney commented 8 years ago

@jovo We already run a local tilecache. GET requests do slow us down at times, but they have never brought us down so far. I think throttling based on IP is fine, except that it isn't always practical, for example in the case of VAST or other power users.

jovo commented 8 years ago

is it easy enough to do that? can you estimate # of hours it would take to get it set up?


kunallillaney commented 8 years ago

I have neither the know-how nor the bandwidth to do this. Maybe @alexbaden knows.

jovo commented 8 years ago

i guess when we are in amazon this won't matter anymore?

kunallillaney commented 8 years ago

I think so, but we will know better after deployment.

jovo commented 8 years ago

ok, let's table this until then.


alexbaden commented 8 years ago

If we are able to stand up a service in AWS that lets us spin up an arbitrary number of gateway servers that pull data from S3, then this will be much less of a problem (it's possible to overwhelm the throughput from S3 to EC2 users if a single chunk of data is being grabbed over and over again, but that's very unlikely). However, we could still have issues with our servers being overwhelmed, depending on how much money we are willing to spend on gateway servers per month (in addition to S3 costs). So I think throttling is worth considering even on AWS.
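
To make the "arbitrary number of gateway servers" idea concrete, here is a rough sketch of a stateless read path that just streams one pre-cut chunk out of S3 with boto3. The bucket name, key layout, and function name are assumptions, not our actual AWS design; the point is only that a handler with no local state can be replicated freely behind a load balancer.

```python
# Hypothetical stateless gateway read path: fetch one pre-cut cuboid from S3.
# Bucket name and key layout are illustrative, not the real ndstore schema.
import boto3

s3 = boto3.client('s3')
BUCKET = 'neurodata-cutouts'   # assumed bucket name


def get_chunk(token, channel, res, x, y, z):
    """Return the raw bytes of one cuboid identified by its grid coordinates."""
    key = '{}/{}/{}/{}_{}_{}'.format(token, channel, res, x, y, z)
    obj = s3.get_object(Bucket=BUCKET, Key=key)
    return obj['Body'].read()
```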

Kunal makes a good point that throttling by IP isn't ideal. For example, it's possible that a group of people (e.g., a lab or an office) all share the same IP. So we'd have to be careful with our throttling and set some sort of generous upper bound.
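
One way to handle the shared-IP case, sticking with the hypothetical middleware sketched earlier, would be a generous bound plus an exemption list for known power users; again, the names and numbers below are made up for illustration.

```python
# Hypothetical tweak to the earlier throttling sketch: a generous limit plus
# an exemption list for known power users (e.g. VAST workstations).
EXEMPT_IPS = {'10.0.0.15', '10.0.0.16'}   # assumed power-user addresses
RATE_LIMIT = 6000                         # generous: ~100 GETs/sec per IP
WINDOW_SECONDS = 60


def is_throttled(ip, count):
    """Return True if this request should be rejected."""
    if ip in EXEMPT_IPS:
        return False
    return count > RATE_LIMIT
```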

It seems worth looking into, but it also seems that none of us really have time to dig into this now. So maybe throw it on the "rainy day features" pile?

jovo commented 8 years ago

Sounds good.