rickdesantis closed this issue 9 years ago
My guess is that this is linked to https://github.com/SINTEF-9012/cloudml/issues/34 .
OK, I will have a look at this. Are you experiencing this right after a deployment or after a scale-out?
It's right after the deploy. It just happened again, this time even without that message in the log (it was just stuck somehow).
Can you send me your deployment model? I will test it right now.
Here it is:
{
  "eClass": "net.cloudml.core:CloudMLModel",
  "name": "HTTPAgent2AWS",
  "providers": [
    {
      "credentials": "/tmp/scalingrulestests/1307151655/credentialsAmazon.properties",
      "eClass": "net.cloudml.core:Provider",
      "name": "aws-ec2",
      "properties" : [{
        "eClass" : "net.cloudml.core:Property",
        "name" : "MaxVMs",
        "value" : "3"
      }]
    }
  ],
  "internalComponents": [
    {
      "eClass": "net.cloudml.core:InternalComponent",
      "name": "app",
      "resources": [{
        "eClass": "net.cloudml.core:Resource",
        "name": "startApp",
        "startCommand": "bash /home/ubuntu/updateEverything ; sudo bash /home/ubuntu/startHTTPAgent 52.18.138.224 ; source /home/ubuntu/.bashrc ; bash /home/ubuntu/startImperialDC"
      }
      ],
      "requiredExecutionPlatform" : {
        "eClass" : "net.cloudml.core:RequiredExecutionPlatform",
        "name" : "appRequired",
        "owner" : "internalComponents[app]"
      }
    }
  ],
  "internalComponentInstances": [
    {
      "eClass": "net.cloudml.core:InternalComponentInstance",
      "name": "appInstance",
      "type": "internalComponents[app]",
      "requiredExecutionPlatformInstance" : {
        "eClass" : "net.cloudml.core:RequiredExecutionPlatformInstance",
        "name" : "appRequiredInstance",
        "owner" : "internalComponentInstances[appInstance]",
        "type" : "internalComponents[app]/requiredExecutionPlatform[appRequired]"
      }
    }
  ],
  "vms": [
    {
      "eClass": "net.cloudml.core:VM",
      "imageId": "eu-west-1/ami-a0d797d7",
      "is64os": true,
      "location": "eu-west-1",
      "providerSpecificTypeName": "m3.large",
      "maxStorage": "8",
      "minStorage": "8",
      "name": "HTTPAgent",
      "os": "ubuntu",
      "privateKey": "/tmp/scalingrulestests/1307151655/desantis-ireland.pem",
      "providedExecutionPlatforms": [
        {
          "eClass": "net.cloudml.core:ProvidedExecutionPlatform",
          "name": "HTTPAgentTIER",
          "offers": [
            {
              "eClass": "net.cloudml.core:Property",
              "name": "OS",
              "value": "Ubuntu"
            }
          ],
          "owner": "vms[HTTPAgent]"
        }
      ],
      "provider": "providers[aws-ec2]",
      "securityGroup": "default",
      "sshKey": "desantis-ireland"
    }
  ],
  "vmInstances": [
    {
      "eClass": "net.cloudml.core:NodeInstance",
      "name": "httpAgentInstance131",
      "type": "vms[HTTPAgent]",
      "providedExecutionPlatformInstances": [
        {
          "eClass": "net.cloudml.core:ProvidedExecutionPlatformInstance",
          "name": "httpAgentTier",
          "owner": "vmInstances[httpAgentInstance131]",
          "type": "vms[HTTPAgent]/providedExecutionPlatforms[HTTPAgentTIER]"
        }
      ]
    }
  ],
  "executesInstances": [
    {
      "eClass": "net.cloudml.core:ExecuteInstance",
      "name": "runApp",
      "providedExecutionPlatformInstance": "vmInstances[httpAgentInstance131]/providedExecutionPlatformInstances[httpAgentTier]",
      "requiredExecutionPlatformInstance": "internalComponentInstances[appInstance]/requiredExecutionPlatformInstance[appRequiredInstance]"
    }
  ]
}
Thank you!
Thanks. Last question: are you using the remoteFacade to deploy?
CloudML cml = Factory.getInstance().getCloudML("ws://127.0.0.1:9000");
No, I'm using the web socket interface directly.
I deployed your model three times and was always able to retrieve it with:
!getSnapshot { path : / }
Do you receive the ack stating that your deployment is completed? Did you change anything in the model you pasted compared to the one you use? I ask because the log below means that the engine is not able to serialize the runtime model.
Jul 13, 2015 2:04:58 PM org.cloudml.codecs.JsonCodec save
SEVERE: null
I also tested retrieving the status of specific instances without any problem:
!getSnapshot
path : /componentInstances[name='httpAgentInstance131']
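For anyone scripting these checks against the web socket interface, the two snapshot requests above could be built with a small helper along these lines. This is only a sketch: the class and method names (`SnapshotCommands`, `wholeModel`, `componentInstance`) are hypothetical and not part of CloudML, the command text is copied from the examples in this thread, and the two-space indentation of the `path` line in the multi-line form is an assumption, since the formatting here does not preserve it.

```java
/** Hypothetical helper (not part of CloudML) that builds the snapshot requests quoted in this thread. */
public final class SnapshotCommands {

    private SnapshotCommands() {
        // static helpers only
    }

    /** Request for the whole runtime model, in the inline form used above. */
    public static String wholeModel() {
        return "!getSnapshot { path : / }";
    }

    /**
     * Request for a single instance, selected by name, in the multi-line form used above.
     * The indentation of the second line is an assumption.
     */
    public static String componentInstance(String instanceName) {
        if (instanceName == null || instanceName.isEmpty()) {
            throw new IllegalArgumentException("instanceName must not be empty");
        }
        if (instanceName.indexOf('\'') >= 0) {
            // A quote inside the name would break the [name='...'] selector.
            throw new IllegalArgumentException("instanceName must not contain a single quote");
        }
        return "!getSnapshot\n  path : /componentInstances[name='" + instanceName + "']";
    }
}
```

The returned strings are what would be sent as text messages over the socket; `SnapshotCommands.componentInstance("httpAgentInstance131")` reproduces the second request above.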
Hi Riccardo,
I made quite a few changes to the mechanism that updates the status of the VMs, and on my side it seems to work fine. Please let me know if it also solves the problem on your side.
Hi Nicolas, I'm not getting it anymore, but I don't have a way to quickly replicate it, as it happens randomly.
It isn't showing up anymore for now, so it is probably solved! Closing the bug, then. Thank you!
Hello, I'm having some problems when asking for the status of the system right after the deploy (which went well).
Asking a
won't return anything, and will only show in the CloudML-Shell.log file as a
with no other data returned. Waiting (even for minutes) after a deploy doesn't help, and this has happened multiple times now. It worked only on the 7th attempt, after 6 consecutive failures.
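Until the underlying cause is fixed, a client could work around the intermittent empty replies by retrying the status request a bounded number of times. The sketch below is a client-side workaround under stated assumptions, not a fix and not CloudML code: `SnapshotPoller` and `poll` are hypothetical names, and the `Supplier<String>` stands in for whatever code sends the request over the web socket and returns the reply (or `null` or an empty string when nothing comes back).

```java
import java.time.Duration;
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical client-side retry helper; not part of CloudML. */
public final class SnapshotPoller {

    private SnapshotPoller() {
        // static helpers only
    }

    /**
     * Calls request up to maxAttempts times, pausing between attempts,
     * and returns the first non-empty reply, or Optional.empty() if every attempt fails.
     */
    public static Optional<String> poll(Supplier<String> request, int maxAttempts, Duration pause) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String reply = request.get();
            if (reply != null && !reply.trim().isEmpty()) {
                return Optional.of(reply);
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(pause.toMillis());
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and give up early.
                    Thread.currentThread().interrupt();
                    return Optional.empty();
                }
            }
        }
        return Optional.empty();
    }
}
```

With a cap of, say, 10 attempts, the pattern described above (six empty replies followed by a good one) would return the snapshot on the seventh call instead of requiring manual retries.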