neurodata / sic

Repository for extensible scientific container paper
Apache License 2.0

AMI #2

Closed jovo closed 8 years ago

jovo commented 8 years ago

@gkiar brother suggests if we create an AMI (a cheap one) that has everything installed already, then the "reproduction" can merely be "click this link to be brought to our jupyter notebook in the cloud".

this is one step closer....what do you think?

@disa-mhembere @randalburns

the eventual goal would be to have a "launcher" that could link to a wide variety of different "scientific cloud containers for extensible and reproducible research" (siccer)

gkiar commented 8 years ago

Interesting. My first thought is that we'll have to keep the instance up forever in order for those reproduction instructions to be true, whereas so long as Docker and pip exist the current ones will persist.

It's definitely a higher bar, but eventually expensive to achieve and I predict more likely to break if I were to leave, we change aws accounts, restructure how we want these cloud notebooks to be organized, etc. (& yes, I know you hate my predictions when we don't yet have empirical evidence to back them up, but I'm still saying them because it helps me 😄).

Also, what would be in the notebook? All you need to run are command-line instructions for the pipeline, so would I just wrap *nix calls in os.system?
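For illustration, wrapping a command-line step in a notebook cell could look roughly like the sketch below. It uses `subprocess.run` rather than `os.system` so the output can be captured and a failing step raises an error. The helper name `run_step` and the placeholder command are assumptions made for this example; the real pipeline command is not shown in this thread.

```python
import subprocess
import sys


def run_step(cmd):
    """Run one command-line step and return its captured stdout.

    check=True raises CalledProcessError if the command exits non-zero,
    so a failed step stops the notebook instead of passing silently.
    """
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


# Placeholder command (an assumption): a real cell would call the
# pipeline's own command-line entry point here instead.
output = run_step([sys.executable, "-c", "print('pipeline step ok')"])
print(output.strip())
```

Passing the command as a list avoids shell quoting problems, which matters once file paths are involved.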

jovo commented 8 years ago

cool. a few thoughts:

1) my job is to worry about money, not yours :) you worry about more important things, like science.

2) we would use our neurodata AWS credentials, which we will have to get organized (@alexbaden, @randalburns). also, you don't have to worry about what happens if/when you leave, another thing that is in my bailiwick, you worry about more important things, like science.

3) you only expressed negative thoughts. do you only have negative thoughts about this?

4) "notebook" was a paraphrase. if terminal works, that is cool too. we want something not scary to a biologist.

gkiar commented 8 years ago

1) ok :)

2) I have the neurodata aws account linked to admin@neurodata.io, & ok

3) I guess that I'm not sure why this is a better thing than writing reproducibility instructions like we have done in the paper? To me it seems like it's more fragile and perhaps more terrifying to a biologist to look at a terminal/notebook of code rather than the occasional explained snippet. I just need what is actually contained to be explained to me because I feel like I must be missing something, hence my lack of excitement.

4) ok

jovo commented 8 years ago

ok, you are missing something :)

get out of your initial fear mindset, re-center somehow, and then think about cons & pros, and then tell me pros....

alexbaden commented 8 years ago

Since you tagged me in here...

Maybe this is worth looking at? Best of both worlds? https://aws.amazon.com/ecs/

alexbaden commented 8 years ago

More explicitly, maybe it's possible to build a workflow using ECS APIs that runs the extensible science pipeline from a jupyter notebook on EC2 with nothing other than EC2 credentials required, but doesn't require us to update an AMI continuously or deal with ownership of anything other than the docker images (which live on the docker hub). I don't actually know whether or not this is true, but would be interested in finding out!
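As a rough sketch of the container side of that idea, an ECS task is described by a task definition that names a Docker image and the command to run in it. The example below only builds that description as a plain dict and prints it as JSON; it does not contact AWS. The image name, family name, memory value, and command are all hypothetical placeholders, not values taken from this thread.

```python
import json

# Hypothetical image name (an assumption); the thread only says the
# Docker images live on the Docker Hub.
IMAGE = "example/ndmg:latest"


def make_task_definition(image, command):
    """Build a minimal ECS-style task definition as a plain dict."""
    return {
        "family": "extensible-science-pipeline",  # hypothetical name
        "containerDefinitions": [
            {
                "name": "pipeline",
                "image": image,
                "memory": 2048,  # MiB; placeholder value
                "essential": True,
                "command": command,  # arguments passed to the container
            }
        ],
    }


# "--help" is a placeholder command, not the real pipeline invocation.
task_def = make_task_definition(IMAGE, ["--help"])
print(json.dumps(task_def, indent=2))
```

Registering and running such a definition would need an AWS client and credentials (for example, boto3's ECS client takes `family` and `containerDefinitions` in `register_task_definition`); that step is left out here because it cannot run without an account.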

gkiar commented 8 years ago

Woah, that's really cool. Setting up ECS with ndmg now. Will report back on how it works! One thing about it so far: it seems to require me to push my image to Amazon rather than just pulling 'latest' which is somewhat annoying for updates, but not at all an issue if we're making a static container setup for reproducibility's sake

gkiar commented 8 years ago

@alexbaden let's play with this together, later? Eager to share what I've learned and see if you know a few more things that can help me make this a viable plan forward for our container.

jovo commented 8 years ago

@disa-mhembere @perlman @randalburns perhaps we could all pow-wow tomorrow at 11am to discuss?

gkiar commented 8 years ago

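[image: screen shot 2016-08-24 at 6 36 50 pm] https://cloud.githubusercontent.com/assets/4883288/17950522/c270d3dc-6a29-11e6-854b-f3b8669f5a37.png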

:boom:

jovo commented 8 years ago

holy shit!

@alexbaden @disa-mhembere @randalburns

jovo commented 8 years ago

ok, what next?
