jovo closed this issue 8 years ago
Interesting. My first thought is that we'll have to keep the instance up forever in order for those reproduction instructions to be true, whereas so long as Docker and pip exist the current ones will persist.
It's definitely a higher bar, but eventually expensive to achieve, and I predict it is more likely to break if I were to leave, if we change AWS accounts, if we restructure how we want these cloud notebooks to be organized, etc. (& yes, I know you hate my predictions when we don't yet have empirical evidence to back them up, but I'm still saying them because it helps me).
Also, what would be in the notebook? All you need to run are command-line instructions for the pipeline, so would I just wrap *nix calls in os.system?
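For illustration, a minimal sketch of what a notebook cell wrapping a *nix call might look like. It uses `subprocess.run` rather than `os.system` so the exit code and output can be shown in the notebook. The helper name `run_step` and the `echo` command are placeholders, not the actual pipeline invocation.

```python
import subprocess


def run_step(command):
    """Run one command-line step; return (exit_code, combined stdout/stderr)."""
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # fold error messages into the same stream
        universal_newlines=True,    # decode bytes to str
    )
    return result.returncode, result.stdout


# Placeholder command (assumption): stands in for a real pipeline call.
code, output = run_step(["echo", "pipeline step finished"])
print(code, output.strip())
```

Passing the command as a list avoids shell quoting problems, and returning the exit code lets a later cell stop early if a step failed.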
cool. a few thoughts:
1) my job is to worry about money, not yours :) you worry about more important things, like science.
2) we would use our neurodata AWS credentials, which we will have to get organized (@alexbaden, @randalburns). also, you don't have to worry about what happens if/when you leave, another thing that is in my bailiwick, you worry about more important things, like science.
3) you only expressed negative thoughts. do you only have negative thoughts about this?
4) "notebook" was a paraphrase. if terminal works, that is cool too. we want something not scary to a biologist.
1) ok :) 2) I have the neurodata aws account linked to admin@neurodata.io, & ok 3) I guess that I'm not sure why this is a better thing than writing reproducibility instructions like we have done in the paper? To me it seems like it's more fragile and perhaps more terrifying to a biologist to look at a terminal/notebook of code rather than the occasional explained snippet. I just need what is actually contained to be explained to me because I feel like I must be missing something, hence my lack of excitement. 4) ok
ok, you are missing something :)
get out of your initial fear mindset, re-center somehow, and then think about cons & pros, and then tell me pros....
On Wed, Aug 24, 2016 at 9:09 AM, Greg Kiar notifications@github.com wrote:
> 1) ok :) 2) I have the neurodata aws account linked to admin@neurodata.io, & ok 3) I guess that I'm not sure why this is a better thing than writing reproducibility instructions like we have done in the paper? To me it seems like it's more fragile and perhaps more terrifying to a biologist to look at a terminal/notebook of code rather than the occasional explained snippet. I just need what is actually contained to be explained to me because I feel like I must be missing something, hence my lack of excitement. 4) ok
-- You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/neurodata/extensible-science-paper/issues/2#issuecomment-242057036, or mute the thread https://github.com/notifications/unsubscribe-auth/AACjctR1yi5RXi5qKG2lozXbPyh7kqAmks5qjEKBgaJpZM4Jr5kN .
the glass is all full: half water, half air. neurodata.io, jovo calendar https://calendar.google.com/calendar/embed?src=joshuav%40gmail.com&ctz=America/New_York
Since you tagged me in here...
Maybe this is worth looking at? Best of both worlds? https://aws.amazon.com/ecs/
More explicitly, maybe it's possible to build a workflow using ECS APIs that runs the extensible science pipeline from a jupyter notebook on EC2 with nothing other than EC2 credentials required, but doesn't require us to update an AMI continuously or deal with ownership of anything other than the docker images (which live on the docker hub). I don't actually know whether or not this is true, but would be interested in finding out!
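As a rough sketch of the ECS idea above: a task definition is essentially a description of which Docker image to run and with what resources. The snippet below only builds that description as a plain dictionary. The function name, the image name `example/ndmg:latest`, the `--help` command, and the resource numbers are all hypothetical. The field names follow the ECS task definition format, and the assumption is that the result could be passed to boto3's ECS client via `register_task_definition(**task_def)` from a notebook that has AWS credentials.

```python
def build_task_definition(family, image, command, memory_mib=2048, cpu_units=1024):
    """Describe a single-container ECS task as a plain dict (no AWS calls made)."""
    return {
        "family": family,
        "containerDefinitions": [
            {
                "name": family,
                "image": image,            # pulled from a registry such as Docker Hub
                "command": list(command),  # arguments passed to the container
                "memory": memory_mib,      # hard memory limit, in MiB
                "cpu": cpu_units,          # 1024 units correspond to one vCPU
                "essential": True,         # the task stops if this container stops
            }
        ],
    }


# Hypothetical values for illustration only.
task_def = build_task_definition(
    family="extensible-science-demo",
    image="example/ndmg:latest",
    command=["--help"],
)
print(task_def["containerDefinitions"][0]["image"])
```

Keeping the description separate from the AWS call means the only thing to maintain is the Docker image reference, not an AMI.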
Woah, that's really cool. Setting up ECS with ndmg now. Will report back on how it works! One thing about it so far: it seems to require me to push my image to Amazon rather than just pulling 'latest' which is somewhat annoying for updates, but not at all an issue if we're making a static container setup for reproducibility's sake
@alexbaden let's play with this together, later? Eager to share what I've learned and see if you know a few more things that can help me make this a viable plan forward for our container.
@disa-mhembere @perlman @randalburns perhaps we could all pow-wow tomorrow at 11am to discuss?
:boom:
holy shit!
@alexbaden @disa-mhembere @randalburns
ok, what next?
On Wed, Aug 24, 2016 at 6:37 PM, Greg Kiar notifications@github.com wrote:
[image: screen shot 2016-08-24 at 6 36 50 pm] https://cloud.githubusercontent.com/assets/4883288/17950522/c270d3dc-6a29-11e6-854b-f3b8669f5a37.png
:boom:
@gkiar brother suggests if we create an AMI (a cheap one) that has everything installed already, then the "reproduction" can merely be "click this link to be brought to our jupyter notebook in the cloud".
this is one step closer....what do you think?
@disa-mhembere @randalburns
the eventual goal would be to have a "launcher" that could link to a wide variety of different "scientific cloud containers for extensible and reproducible research" (siccer)