CatalystCode / azure-flocker-driver

A flocker driver for Azure Storage

This looks really cool #15

Open lukemarsden opened 8 years ago

lukemarsden commented 8 years ago

What is the current status? Are you planning to work on it further? Is there anything I can do to help?

sedouard commented 8 years ago

Hey @lukemarsden, we hit a couple of bumps in the road here due to some limitations of the Azure Disk API, and we weren't expecting fixes for those for a while. Let me circle back with some of my colleagues in Redmond to see what's up.

ferrantim commented 8 years ago

Hi @sedouard Any update on this? We're getting more and more requests for an Azure driver (I work at ClusterHQ). Thanks!

sedouard commented 8 years ago

hey @ferrantim @lukemarsden I'm starting work on this again today and this week, but I anticipate it being only a preview/demo as there are some problems with our Disk API. I'm hoping that the implementation will highlight these problems for Azure engineering. There are some improvements anticipated for these APIs, but not for a while unfortunately. Keep an eye on the repo for updates.

sedouard commented 8 years ago

hey @ferrantim, @lukemarsden sorry for the delay here. I've been pushing code to the feat_arm branch just to test the Disk APIs through Azure Resource Manager, since we now have some semi-decent ARM support in the Python SDK.

The main cause of the holdup here is reliability problems with the Azure Disk API, which I've been able to demonstrate to Azure engineering with code from this repo. Currently there is an issue where a virtual machine becomes unusable after repeated attaches and detaches of disks, which is no good. It's being tracked internally and I'll keep you updated as the status changes.

wallnerryan commented 8 years ago

@sedouard thanks for the update, any news since Jan?

sedouard commented 8 years ago

Hey @wallnerryan, they fixed the 'vm is broken after attach' issue pretty quickly since that was a bigger deal.

I built out a simple test suite using the ARM disk APIs, and simply attaching, then detaching a disk can leave the VM disagreeing with the Azure API as to what is attached and in what slot. Azure engineering hasn't provided us any developers to investigate our instances that currently have the reproduced issue. We also haven't got anyone to try the reproduction steps we've provided them.
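For anyone trying to reproduce this, a minimal sketch of the kind of check involved, assuming you have already collected two slot-to-disk mappings: one reported by the Azure API and one observed from inside the VM. The function name, the dict shape, and the disk names are hypothetical and are not taken from the test suite in this repo.

```python
def lun_discrepancies(api_view, vm_view):
    """Return {lun: (api_disk, vm_disk)} for every slot where the views differ.

    Both arguments are assumed to be dicts mapping a slot (LUN) number to a
    disk name. A missing slot is reported as None on that side.
    """
    diffs = {}
    for lun in sorted(set(api_view) | set(vm_view)):
        api_disk = api_view.get(lun)
        vm_disk = vm_view.get(lun)
        if api_disk != vm_disk:
            diffs[lun] = (api_disk, vm_disk)
    return diffs


if __name__ == "__main__":
    # Hypothetical data: the API believes slot 1 is free, the VM still sees a disk.
    api_view = {0: "data-a"}
    vm_view = {0: "data-a", 1: "data-b"}
    print(lun_discrepancies(api_view, vm_view))
```

An empty result means the two views agree; anything else is the kind of disagreement described above.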

If we can get more +1's on the issue, it would help push engineering to fix the disk bugs that make attaching/detaching disks unreliable. Flocker isn't the only platform hit by this, however. It's pretty much anyone who needs this functionality.

@madhana is the product manager who might have more insight as to when this will get more attention.

wallnerryan commented 8 years ago

Thanks @sedouard :+1: , we had a few users asking for it recently, can see if I can get them to +1

chazkii commented 8 years ago

+1

devteng commented 8 years ago

+1

imranraja85 commented 8 years ago

+1

mwilmes commented 8 years ago

+1

wallnerryan commented 8 years ago

@sedouard @madhana getting more +1's. Let us know how we can help / feel free to reach out via email

sedouard commented 8 years ago

Thanks for the +1's all.

Engineering actually started investigating this issue this past Monday. They're still sifting through logs trying to find the root cause for the attach/detach discrepancy. Will keep this thread updated.

wallnerryan commented 8 years ago

Thanks! Good to hear !

voytoo commented 8 years ago

+1

Any progress with the issue?

rbj325 commented 8 years ago

+1

Sorry you had to deal with Azure support. It can get rough.

sedouard commented 8 years ago

Hey guys! Unfortunately this issue has outlasted my employment at Microsoft!

I've handed this off to @jmspring. He's driving the support ticket internally with Azure engineering. Seeing as I don't have access to view the issue anymore, maybe @madhana can fetch more details on the ticket status.

adambarthelson commented 8 years ago

@sedouard just have someone share their active Microsoft/Azure credentials on here so the guys can contribute, I'm sure nothing bad will happen.

jmspring commented 8 years ago

There is some traction on this issue.

I'm on vacation until Sunday and will respond more Monday, but it is being investigated.

-j

Sent from my iThingy

wallnerryan commented 8 years ago

@sedouard thanks for helping and good luck in your next opportunity! Looking forward to more info from @jmspring, thanks.

jmspring commented 8 years ago

The product team has suggested some modifications to the driver (and a couple of other fixes) which I should be getting to later this week/early next week.

jmspring commented 8 years ago

Ok, due to another project, I got delayed. I'm starting work on this tomorrow.

apobbati commented 8 years ago

+1

ckarlsen84 commented 8 years ago

+1

jmspring commented 8 years ago

Current status:

Work is progressing on the Azure issues. Setup/install and docs are next on my list, then updating the unit tests. Not ready for prime time, but a few steps closer.

jmspring commented 8 years ago

Update -

To install:

    git clone https://github.com/CatalystCode/azure-flocker-driver.git
    cd azure-flocker-driver
    sudo /opt/flocker/bin/pip install .

Configuration: Look at azure-flocker-driver/example.azure_agent.yml for agent.yml contents.

wallnerryan commented 8 years ago

@jmspring thanks for the update! This is good news. So in terms of basic usage, it sounds like it's working. Could you give some more details on the issue that ARM doesn't like multiple updates to the same VM happening in parallel, and on the handling and hardening that is needed?

jmspring commented 8 years ago

@wallnerryan - take, for example, the case where you have a detach in progress (not yet done) and then shortly thereafter do an attach. The representation of the VM used for the second operation will likely still contain the data disk being detached, so a conflict results. The operation may fail, or the drive being detached may end up still attached.
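A minimal sketch of one way to avoid that conflict, assuming all updates for a VM go through a single process: hold a per-VM lock so the second operation only reads the VM representation after the first has finished. `VmUpdateSerializer` and `FakeVm` are hypothetical names for illustration and are not part of the driver or the Azure SDK; `FakeVm` only imitates the read-modify-write pattern described above.

```python
import threading
import time
from collections import defaultdict


class VmUpdateSerializer:
    """Hypothetical helper: run at most one disk update per VM at a time."""

    def __init__(self):
        self._guard = threading.Lock()             # protects the lock table
        self._locks = defaultdict(threading.Lock)  # one lock per VM name

    def run(self, vm_name, operation, *args):
        with self._guard:
            vm_lock = self._locks[vm_name]
        with vm_lock:
            return operation(*args)


class FakeVm:
    """Stand-in for a VM whose disk list is updated by read-modify-write."""

    def __init__(self, disks):
        self.disks = set(disks)

    def _update(self, change, delay):
        snapshot = set(self.disks)  # read the current representation
        time.sleep(delay)           # the update takes a while
        change(snapshot)
        self.disks = snapshot       # write back the whole representation

    def detach(self, disk):
        self._update(lambda disks: disks.discard(disk), 0.05)

    def attach(self, disk):
        self._update(lambda disks: disks.add(disk), 0.01)


def demo():
    vm = FakeVm({"disk-a"})
    serializer = VmUpdateSerializer()
    threads = [
        threading.Thread(target=serializer.run, args=("vm-1", vm.detach, "disk-a")),
        threading.Thread(target=serializer.run, args=("vm-1", vm.attach, "disk-b")),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return vm.disks


if __name__ == "__main__":
    print(demo())
```

With the lock in place, the fake VM ends up with only `disk-b` attached whichever operation runs first. Without it, the overlapping updates can write back a stale snapshot. An in-process lock like this does not cover updates issued from other machines, so it is only part of any hardening.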

jmspring commented 8 years ago

Note - work will be pulled into master now.

wallnerryan commented 8 years ago

@jmspring thanks! Will look forward to testing this out soon. cc @pcgeek86

jmspring commented 8 years ago

Current state - the driver test takes about 25-30 min to run, and after letting it run overnight (about 30 tries), none failed.

When running under Flocker, there is the occasional case of a disk attach timeout -- think of ping-ponging a volume between two VMs. I haven't looked closely at upping the docker/flocker timeout config.

I believe most of the issues originally encountered are ironed out, but testing and use are needed to confirm that.

Performance of attach/detach is usually between 30 and 60 seconds per operation. This is on the todo list to look into.
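Given operations in the 30 to 60 second range, a caller that waits on an attach probably wants to poll with a generous deadline rather than give up early. A minimal sketch, assuming a caller-supplied check for "disk is attached"; `wait_for` and its default values are hypothetical and are not Flocker or Docker settings.

```python
import time


def wait_for(predicate, timeout=120.0, interval=5.0,
             clock=time.monotonic, sleep=time.sleep):
    """Poll predicate() until it returns True or `timeout` seconds pass.

    Returns True if the predicate succeeded, False if the deadline passed.
    The clock and sleep functions are injectable so tests can run instantly.
    """
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    # Hypothetical check: pretend the disk shows up on the third poll.
    polls = iter([False, False, True])
    print(wait_for(lambda: next(polls), timeout=1.0, interval=0.01))
```

The default timeout of 120 seconds is only a guess at roughly twice the slow end of the range quoted above.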

nag-iit commented 7 years ago

+1