apache / cloudstack

Apache CloudStack is an opensource Infrastructure as a Service (IaaS) cloud computing platform
https://cloudstack.apache.org/
Apache License 2.0
1.99k stars 1.09k forks source link

Failed to create instance in ACS 4.17.2 #9050

Open meisammaleki opened 4 months ago

meisammaleki commented 4 months ago
ISSUE TYPE
COMPONENT NAME
CLOUDSTACK VERSION
CONFIGURATION
OS / ENVIRONMENT
SUMMARY
STEPS TO REPRODUCE
EXPECTED RESULTS

Here is log for a normal instance creation without any error for queryAsyncJobResult" command :

2024-05-07 09:01:10,310 DEBUG [c.c.a.ApiServlet] (qtp1750498848-40123:ctx-ef88bcba) (logid:9b0854d2) ===START===  192.168.123.126 -- GET  jobId=4a5ecb61-8c89-4a62-8d09-92211a72c099&command=queryAsyncJobResult&response=json
2024-05-07 09:01:10,446 DEBUG [c.c.a.ApiServlet] (qtp1750498848-40123:ctx-ef88bcba ctx-cabcb6a3) (logid:9b0854d2) ===END===  192.168.123.126 -- GET  jobId=4a5ecb61-8c89-4a62-8d09-92211a72c099&command=queryAsyncJobResult&response=json
2024-05-07 09:01:14,898 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (API-Job-Executor-42:ctx-8cd1f844 job-763754 ctx-3d3d3a8d) (logid:4a5ecb61) Complete async job-763754, jobStatus: SUCCEEDED, resultCode: 0, result: org.apache.cloudstack.api.response.UserVmResponse/virtualmachine/{"id":"ac534b8a-5b04-4c05-9d4a-82742edf124d","name":"maleki-meysam","displayname":"maleki-meysam","account":"m.maleki","userid":"6d784d7e-81e5-453b-9b46-b92418970183","username":"m.maleki","domainid":"648c3925-8be4-11e7-90f0-005056a86dbb","domain":"ROOT","created":"2024-05-07T08:54:53+0330","lastupdated":"2024-05-07T09:01:14+0330","state":"Running","haenable":"false","zoneid":"d1a23d3c-fa65-4a30-a0b0-67fba098cc5c","zonename":"AfraCloud01","hostid":"de416fe9-fcdf-488e-8214-b90b26eeb196","hostname":"af-cld-host02.afracloud.local","templateid":"1823a1fc-cf1b-47c2-b282-8c0b1abdee2e","templatename":"Ubuntu 22.04","templatedisplaytext":"Ubuntu 22.04","passwordenabled":"true","serviceofferingid":"f3ee4abc-6fb8-47ff-8825-aab45b0769cd","serviceofferingname":"Custom","cpunumber":"2","cpuspeed":"2000","memory":"2048","guestosid":"7f1cdcd7-64eb-4063-945a-db83fd191054","rootdeviceid":"0","rootdevicetype":"ROOT","securitygroup":[],"password":"A*****","nic":[{"id":"63ffd171-912f-4aac-9cc5-6bc725ff878b","networkid":"211c1610-5dd1-4a0f-9871-0ba16fff0489","networkname":"AF-CLD-INTR17","netmask":"255.255.255.0","gateway":"79.175.167.1","ipaddress":"79.175.167.4","isolationuri":"vlan://1734","broadcasturi":"vlan://1734","traffictype":"Guest","type":"Shared","isdefault":"true","macaddress":"1e:00:bd:01:64:f7","secondaryip":[],"extradhcpoption":[],"deviceid":"0"}],"hypervisor":"VMware","instancename":"i-25540-12429-maleki-meysam","details":{"cpuNumber":"2","dataDiskController":"pvscsi","memory":"2048","cpuSpeed":"2000","memoryOvercommitRatio":"1.1","Message.ReservedCapacityFreed.Flag":"true","cpuOvercommitRatio":"6.0","nicAdapter":"Vmxnet3","rootDiskController":"pvscsi","username":"root"},"affinitygroup":[],"displayvm":"true","isdynamicallyscalable":"true","ostypeid":"7f1cdcd7-64eb-4063-945a-db83fd191054","osdisplayname":"Ubuntu 20.04 LTS","pooltype":"PreSetup","receivedbytes":"0","sentbytes":"0","tags":[],"hasannotations":"false","jobid":"4a5ecb61-8c89-4a62-8d09-92211a72c099","jobstatus":"0"}
2024-05-07 09:01:15,389 DEBUG [c.c.a.ApiServlet] (qtp1750498848-41620:ctx-cc50b563) (logid:19b0e6cb) ===START===  192.168.123.126 -- GET  jobId=4a5ecb61-8c89-4a62-8d09-92211a72c099&command=queryAsyncJobResult&response=json
2024-05-07 09:01:15,478 DEBUG [c.c.a.ApiServlet] (qtp1750498848-41620:ctx-cc50b563 ctx-b61c7a70) (logid:19b0e6cb) ===END===  192.168.123.126 -- GET  jobId=4a5ecb61-8c89-4a62-8d09-92211a72c099&command=queryAsyncJobResult&response=json
ACTUAL RESULTS
2024-05-06 14:49:16,027 DEBUG [c.c.a.ApiServlet] (qtp1750498848-606:ctx-5e692a5b) (logid:20ad87ab) ===START===  80.75.0.10 -- GET  accounttype=0&email=navidsoft1%40gmail.com&firstname=%D9%86%D9%88%DB%8C%D8%AF&lastname=%D9%87%D9%85%D8%A7%DB%8C%D9%88%D9%86%20%D8%B2%D8%A7%D8%AF&username=afra-10946-24542&domainid=f22e82f7-f118-4651-9d28-13c91412819b&command=createAccount&apiKey=vNacmEAZP1kZluwMvwYhfJCnKcGutECQREJyu7pig&signatureversion=3&expires=2024-05-07T00%3A49%3A15%2B0330&response=json&signature=X%2FyERTkyFIn8kjkHR2O06N%2FIYRM%3D
2024-05-06 14:49:16,041 DEBUG [c.c.a.ApiServer] (qtp1750498848-606:ctx-5e692a5b ctx-34a34592) (logid:20ad87ab) CIDRs from which account 'Acct[b3eb3659-77d0-4559-9c79-d81b0ba5fe9e-WHMCS] -- Account {"id": 13, "name": "WHMCS", "uuid": "b3eb3659-77d0-4559-9c79-d81b0ba5fe9e"}' is allowed to perform API calls: 0.0.0.0/0,::/0
2024-05-06 14:49:16,065 DEBUG [o.a.c.a.RoleType] (qtp1750498848-606:ctx-5e692a5b ctx-34a34592 ctx-f9ca4a2c) (logid:20ad87ab) Role is null; therefore, we use the default account type [NORMAL] value.
2024-05-06 14:49:16,066 DEBUG [o.a.c.a.RoleType] (qtp1750498848-606:ctx-5e692a5b ctx-34a34592 ctx-f9ca4a2c) (logid:20ad87ab) Role is null; therefore, we use the default account type [NORMAL] value.
2024-05-06 14:49:16,067 DEBUG [o.a.c.a.RoleType] (qtp1750498848-606:ctx-5e692a5b ctx-34a34592 ctx-f9ca4a2c) (logid:20ad87ab) Role is null; therefore, we use the default account type [NORMAL] value.
2024-05-06 14:49:16,070 DEBUG [c.c.u.AccountManagerImpl] (qtp1750498848-606:ctx-5e692a5b ctx-34a34592 ctx-f9ca4a2c) (logid:20ad87ab) Access granted to Acct[fb0d5491-e4a2-489a-ad81-32854d711337-WHMCS] -- Account {"id": 13, "name": "WHMCS", "uuid": "fb0d5491-e4a2-489a-ad81-32854d711337"} to Domain:7/WHMCS/ by AffinityGroupAccessChecker
2024-05-06 14:49:16,359 DEBUG [c.c.n.s.SecurityGroupManagerImpl] (qtp1750498848-606:ctx-5e692a5b ctx-34a34592 ctx-f9ca4a2c) (logid:20ad87ab) Created security group com.cloud.network.security.SecurityGroupVO$$EnhancerByCGLIB$$2f47a71f@3122f0d7 for account id=35048
2024-05-06 14:49:16,359 DEBUG [c.c.u.AccountManagerImpl] (qtp1750498848-606:ctx-5e692a5b ctx-34a34592 ctx-f9ca4a2c) (logid:20ad87ab) checking if user of account WHMCS [fb0d5491-e4a2-489a-ad81-32854d711337] with role-id [3] can create an account of type afra-10946-24542 [23ceeff7-23f2-4f47-87c3-5ceeeafd178c] with role-id [4]
2024-05-06 14:49:16,446 DEBUG [o.a.c.h.HAManagerImpl] (BackgroundTaskPollManager-2:ctx-bb1c7b0c) (logid:7211b470) HA health check task is running...

2024-05-06 14:49:25,757 DEBUG [c.c.a.ApiServlet] (qtp1750498848-593:ctx-33c43241) (logid:9a4ba2f4) ===START===  80.75.0.10 -- GET  jobid=0e8b5de4-5caf-4724-9a7c-9931ab4da92c&command=queryAsyncJobResult&apiKey=vNacmEAZP1kZluwMvwurYCzo85RQcsgIZkCvAnZAeCLN3hfJCnKcGutECQREJyu7pig&signatureversion=3&expires=2024-05-07T00%3A49%3A25%2B0330&response=json&signature=79ud7Af5GDdrwvt7r2c15XZJ8b8%3D
2024-05-06 14:49:25,772 DEBUG [c.c.a.ApiServer] (qtp1750498848-593:ctx-33c43241 ctx-e1cb6045) (logid:9a4ba2f4) CIDRs from which account 'Acct[b3eb3659-77d0-4559-9c79-d81b0ba5fe9e-WHMCS] -- Account {"id": 13, "name": "WHMCS", "uuid": "b3eb3659-77d0-4559-9c79-d81b0ba5fe9e"}' is allowed to perform API calls: 0.0.0.0/0,::/0
2024-05-06 14:49:25,788 DEBUG [c.c.a.d.ParamProcessWorker] (qtp1750498848-593:ctx-33c43241 ctx-e1cb6045 ctx-8f29104f) (logid:9a4ba2f4) Object entity uuid = 0e8b5de4-5caf-4724-9a7c-9931ab4da92c does not exist in the database.
2024-05-06 14:49:25,788 INFO  [c.c.a.ApiServer] (qtp1750498848-593:ctx-33c43241 ctx-e1cb6045 ctx-8f29104f) (logid:9a4ba2f4) Unable to execute API command queryasyncjobresult due to invalid value. Invalid parameter jobid value=0e8b5de4-5caf-4724-9a7c-9931ab4da92c due to incorrect long value format, or entity does not exist or due to incorrect parameter annotation for the field in api cmd class.
2024-05-06 14:49:25,789 DEBUG [c.c.a.ApiServlet] (qtp1750498848-593:ctx-33c43241 ctx-e1cb6045 ctx-8f29104f) (logid:9a4ba2f4) ===END===  80.75.0.10 -- GET  jobid=0e8b5de4-5caf-4724-9a7c-9931ab4da92c&command=queryAsyncJobResult&apiKey=vNacmEAZP1kZluw6r7ZPYCzLN3hfJCnKcGutECQREJyu7pig&signatureversion=3&expires=2024-05-07T00%3A49%3A25%2B0330&response=json&signature=79ud7Af5GDdrwvt7r2c15XZJ8b8%3D
2024-05-06 14:49:25,848 DEBUG [c.c.s.StatsCollector] (StatsCollector-6:ctx-3f5f3dea) (logid:661ca3bc) DbCollector is running...
2024-05-06 14:49:25,851 ERROR [c.c.s.StatsCollector] (StatsCollector-6:ctx-3f5f3dea) (logid:661ca3bc) db statistics collection failed due to For input string: "9063910841"
2024-05-06 14:49:25,851 DEBUG [c.c.s.StatsCollector] (StatsCollector-6:ctx-3f5f3dea) (logid:661ca3bc) db statistics collection failed.
java.lang.NumberFormatException: For input string: "9063910841"
        at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
        at java.base/java.lang.Integer.parseInt(Integer.java:652)
        at java.base/java.lang.Integer.valueOf(Integer.java:983)
        at com.cloud.server.StatsCollector$DbCollector.getDynamicDataFromDB(StatsCollector.java:739)
        at com.cloud.server.StatsCollector$DbCollector.runInContext(StatsCollector.java:719)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:48)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:55)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:102)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:52)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:45)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
        at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)
2024-05-06 14:49:26,211 DEBUG [c.c.u.AccountManagerImpl] (qtp1750498848-606:ctx-5e692a5b ctx-34a34592 ctx-f9ca4a2c) (logid:20ad87ab) Creating user: afra-10946-24542, accountId: 35048 timezone:null
DaanHoogland commented 4 months ago

@meisammaleki , this is a known issue where the value is larger than what fits in an integer, the StatsCollector has been fixed for this in newer versions. I don't think the NumberFormatException has anything to do with a failure to deploy. You'll have to dig more into the logs to find the real cause.

meisammaleki commented 4 months ago

When the load average of mgmt server is between 1 and 1.5, we know there is no problems but sometimes it goes over 2 and it increase until we restart mgmt server. I have no idea for the acceptable value for load average of management nodes. For your information :

meisammaleki commented 4 months ago

@DaanHoogland Hello again I will be thankful if you provide me with some information about my question

DaanHoogland commented 3 months ago

@meisammaleki , sorry for the silence ;)

* There are 2 mgmt server with enough resources

* 40 hosts

* About 4000 instances
  could you please help about ideal load average and how to have more control on it?

these dimension don't seem extreme, thought the 100 vm/host depends a bit on the dimensions of your host of course.

The important bit in your query is this:

2024-05-06 14:49:25,757 DEBUG [c.c.a.ApiServlet] (qtp1750498848-593:ctx-33c43241) (logid:9a4ba2f4) ===START=== 80.75.0.10 -- GET jobid=0e8b5de4-5caf-4724-9a7c-9931ab4da92c&command=queryAsyncJobResult&apiKey=vNacmEAZP1kZluwMvwurYCzo85RQcsgIZkCvAnZAeCLN3hfJCnKcGutECQREJyu7pig&signatureversion=3&expires=2024-05-07T00%3A49%3A25%2B0330&response=json&signature=79ud7Af5GDdrwvt7r2c15XZJ8b8%3D 2024-05-06 14:49:25,772 DEBUG [c.c.a.ApiServer] (qtp1750498848-593:ctx-33c43241 ctx-e1cb6045) (logid:9a4ba2f4) CIDRs from which account 'Acct[b3eb3659-77d0-4559-9c79-d81b0ba5fe9e-WHMCS] -- Account {"id": 13, "name": "WHMCS", "uuid": "b3eb3659-77d0-4559-9c79-d81b0ba5fe9e"}' is allowed to perform API calls: 0.0.0.0/0,::/0 2024-05-06 14:49:25,788 DEBUG [c.c.a.d.ParamProcessWorker] (qtp1750498848-593:ctx-33c43241 ctx-e1cb6045 ctx-8f29104f) (logid:9a4ba2f4) Object entity uuid = 0e8b5de4-5caf-4724-9a7c-9931ab4da92c does not exist in the database. 2024-05-06 14:49:25,788 INFO [c.c.a.ApiServer] (qtp1750498848-593:ctx-33c43241 ctx-e1cb6045 ctx-8f29104f) (logid:9a4ba2f4) Unable to execute API command queryasyncjobresult due to invalid value. Invalid parameter jobid value=0e8b5de4-5caf-4724-9a7c-9931ab4da92c due to incorrect long value format, or entity does not exist or due to incorrect parameter annotation for the field in api cmd class. 2024-05-06 14:49:25,789 DEBUG [c.c.a.ApiServlet] (qtp1750498848-593:ctx-33c43241 ctx-e1cb6045 ctx-8f29104f) (logid:9a4ba2f4) ===END=== 80.75.0.10 -- GET jobid=0e8b5de4-5caf-4724-9a7c-9931ab4da92c&command=queryAsyncJobResult&apiKey=vNacmEAZP1kZluw6r7ZPYCzLN3hfJCnKcGutECQREJyu7pig&signatureversion=3&expires=2024-05-07T00%3A49%3A25%2B0330&response=json&signature=79ud7Af5GDdrwvt7r2c15XZJ8b8%3D

I do not see the job-id being created so why it is invalid I cannot tell you. The format certainly isn't: 0e8b5de4-5caf-4724-9a7c-9931ab4da92c is a valid uuid.

Your actions where through the UI or an API call?

meisammaleki commented 3 months ago

@DaanHoogland you are welcome , the action runs with API call - by WHMCS

meisammaleki commented 3 months ago

I thought vm creation failed comes through the high load mgmt servers , for your information I use same separate environment. That works fine even with load average over 3. If it can be helpful, I can share management properties or default settings on management server

DaanHoogland commented 3 months ago

So can you look for the job-id in your logs and in your DB? it might be that WHMCS uses a wrong id or that for some reason the uuid was removed (or both). strange things happening.

meisammaleki commented 3 months ago

I should waiting for the next time which it happens again then I will inform you