mbraceproject / MBrace.StarterKit

A collection of demos and tutorials for MBrace
http://mbrace.io

Debugging scripted provisioning #90

Closed cgravill closed 8 years ago

cgravill commented 8 years ago

I'm trying to switch over to the scripted provisioning but haven't been able to get a running cluster. Any suggestions for logs to look at or checks I can run?

Working through the StarterKit I'm able to start ProvisionCluster() but it appears to get stuck provisioning, possibly in a reboot cycle.

Showing the info, the status cycles between "transitioning" and "Provisioning 83.3%"

 Cloud Service "simpleCluster"                                                                                                                     

 Name           Region       VM size  #Instances  Deployment Status   Storage Accnt   ServiceBus Accnt  Last Modified        Cluster Label 
 ----           ------       -------  ----------  -----------------   -------------   ----------------  -------------        ------------- 
 simpleCluster  West Europe  Large    4           Provisioning 83.3%  mbrace99f106be  mbraced753122b    2016-08-05 11:05:43  mbrace-1.4.2  

On the Azure portal the machines are up and seem to switch between "Busy" and "Restarting":

[Screenshot: mbrace]

I've tried this a few times, waiting an hour each time and using different regions and machine types, and I hit the same issue every time.

For comparison, I configured MBrace.Azure.StandaloneWorker with the storage and service bus settings from this cluster and was able to connect and run:

 Workers                                                                                                                                                                                            

 Id                  Status   % CPU / Cores  CPU Clock   % Memory / Total(MB)  Network(ul/dl : KB/s)  Work items  Hostname       PID  Platform  Initialization Time  Latest Heartbeat    
 --                  ------   -------------  ---------   --------------------  ---------------------  ----------  --------       ---  --------  -------------------  ----------------    
 MACHINE_NAME-p9648  Running        7.5 / 8  3601.0 MHz        63.9 / 16309.0              6.8 / 3.4       0 / 8  MACHINE_NAME  9648   Windows  2016-08-05 12:21:43  2016-08-05 12:22:10 

I'm using the StarterKit from source control, MBrace 1.4.2, and Visual Studio 2015 Update 3.

isaacabraham commented 8 years ago

Hmmm. You should be able to check the diagnostics logs of the Azure service to see what's going on. Can you also try with 1, 2, or 3 nodes to see if it's related to the instance count?

Also - if you can hang on for a few days, I've been working with @bruinbrown on getting an ARM-deployable version of MBrace which will run in the App Service. This will simplify the deployment story even further, as it'll just be a standard Azure ARM template.

eiriktsarpalis commented 8 years ago

I'm getting binding redirect issues on the WorkerRole project. It looks like the binding redirects were wiped from the relevant App.Config file at some point. I'll push an update to the NuGet package ASAP.
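
For anyone who wants to check their own copy for the same symptom, here is a minimal sketch of a script that lists the binding redirects declared in an App.Config-style file. It is not part of MBrace or the StarterKit. The function name `list_binding_redirects` and the sample assembly `Example.Assembly` are hypothetical; only the `assemblyBinding` element layout and its `urn:schemas-microsoft-com:asm.v1` namespace come from the standard .NET configuration schema. An empty result would suggest the redirects are missing.

```python
import xml.etree.ElementTree as ET

# XML namespace of the standard .NET <assemblyBinding> element.
ASM_NS = {"asm": "urn:schemas-microsoft-com:asm.v1"}


def list_binding_redirects(config_xml):
    """Return (assembly name, oldVersion, newVersion) for each redirect found.

    Hypothetical helper for illustration; not an MBrace API.
    """
    root = ET.fromstring(config_xml)
    redirects = []
    path = "./runtime/asm:assemblyBinding/asm:dependentAssembly"
    for dep in root.findall(path, ASM_NS):
        identity = dep.find("asm:assemblyIdentity", ASM_NS)
        redirect = dep.find("asm:bindingRedirect", ASM_NS)
        if identity is None or redirect is None:
            continue  # skip entries that declare no redirect
        redirects.append(
            (
                identity.get("name", ""),
                redirect.get("oldVersion", ""),
                redirect.get("newVersion", ""),
            )
        )
    return redirects


# Placeholder config: the assembly name, key token, and versions are made up.
SAMPLE_CONFIG = """
<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <assemblyIdentity name="Example.Assembly"
                          publicKeyToken="0000000000000000" culture="neutral" />
        <bindingRedirect oldVersion="0.0.0.0-2.0.0.0" newVersion="2.0.0.0" />
      </dependentAssembly>
    </assemblyBinding>
  </runtime>
</configuration>
"""

if __name__ == "__main__":
    for name, old, new in list_binding_redirects(SAMPLE_CONFIG):
        print(f"{name}: {old} -> {new}")
```

To run it against a real file, read the file's text and pass it to the function in place of `SAMPLE_CONFIG`.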

cgravill commented 8 years ago

@isaacabraham an ARM deployable version sounds excellent in lots of ways!

Which logs are you suggesting? The Audit log on the Azure portal is empty. I might be able to enable remote access or debugging to the given machine. I'll try switching down to fewer nodes in the meantime.

eiriktsarpalis commented 8 years ago

Fixed in MBrace.Azure 1.4.3. StarterKit has been updated.

cgravill commented 8 years ago

Thanks Eirik, the update fixed it for me.