fathpanus closed this issue 6 years ago
Check whether tarsAdmin is deployed and whether it is alive. Judging from the error, tarsAdmin cannot be reached: [ServantProxy::invoke timeout:3000,servant:tars.tarsAdminRegistry.AdminRegObj,func:batchPatch,adaptertcp -h 192.168.204.128 -p 12000,reqid:30]
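A quick way to check whether anything is listening on the endpoint named in the error is a plain TCP connect. The sketch below is a generic reachability probe, not part of TARS; the helper name `is_port_open` is made up for illustration, and the host and port to pass in are the ones from the error message. A successful connect only shows that something accepts connections there, not that AdminRegObj itself is healthy.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False
```

For the endpoint in this thread the call would be `is_port_open("192.168.204.128", 12000)`, run from the machine that hosts the web platform.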
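I ran into a similar problem. The details are as follows:
1. When a restart is executed, the code sends a keepalive right away. Querying the status through the API below shows that the state is normal and the process_id is correct, with present_state: active /pages/server/api/server_list?tree_node_id=xx
2. Querying the same API again 3 seconds later shows present_state: inactive. Checking the task status ( /pages/server/api/task?task_no=xx ) reports: [ServantProxy::invoke timeout:3000,servant:tars.tarsAdminRegistry.AdminRegObj,func:batchPatch,adaptertcp -h xxxx -p 12000,reqid:30]
3. Once the second keepalive is reported, the state returns to normal.
Is the status of a server node determined by the keepalives it receives? After the first keepalive, something I cannot identify changes the server node's state from normal to inactive, even though the interval between the first and second keepalive is under 30 seconds.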
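The error text in step 2 packs several fields into one bracketed string. As a rough aid for comparing such messages across task runs, the sketch below pulls them apart with a regular expression. The pattern is inferred only from the messages quoted in this thread (including the run-together `adaptertcp`), so it is an assumption about the format, and `parse_invoke_timeout` is a made-up helper name.

```python
import re

# Pattern inferred from the messages quoted in this thread; not an official format.
_INVOKE_TIMEOUT = re.compile(
    r"ServantProxy::invoke timeout:(?P<timeout>\d+),"
    r"servant:(?P<servant>[^,]+),"
    r"func:(?P<func>[^,]+),"
    r"adapter\s*tcp -h (?P<host>\S+) -p (?P<port>\d+),"
    r"reqid:(?P<reqid>\d+)"
)


def parse_invoke_timeout(message: str):
    """Return the fields of an invoke-timeout message as a dict, or None."""
    match = _INVOKE_TIMEOUT.search(message)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the numeric fields so they can be compared directly.
    for key in ("timeout", "port", "reqid"):
        fields[key] = int(fields[key])
    return fields
```

The `timeout:3000` value, presumably in milliseconds, lines up with the state flipping about 3 seconds after the restart in step 2.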
Environment: php 7.1.23
swoole
swoole support => enabled
Version => 4.2.10
Author => Swoole Group[email: team@swoole.com]
coroutine => enabled
epoll => enabled
eventfd => enabled
signalfd => enabled
cpu_affinity => enabled
spinlock => enabled
rwlock => enabled
sockets => enabled
openssl => OpenSSL 1.0.2k-fips 26 Jan 2017
pcre => enabled
zlib => enabled
mutex_timedlock => enabled
pthread_barrier => enabled
futex => enabled
mysqlnd => enabled
async_redis => enabled
Directive => Local Value => Master Value
swoole.enable_coroutine => On => On
swoole.aio_thread_num => 2 => 2
swoole.display_errors => On => On
swoole.use_shortname => On => On
swoole.fast_serialize => Off => Off
swoole.unixsock_buffer_size => 8388608 => 8388608
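The server is a tars server adapted from swoft.
The problem is solved. The cause was that the name passed to setProcessTitle differed from the tars server's name, so the pid check inside tars node failed (link)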
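A small check can make that kind of mismatch visible before deploying, by comparing the title a process actually shows with the name the platform expects. This is only a sketch under assumptions: how tars node really matches a pid to a server is not shown in this thread, the helper names and the sample titles are hypothetical, and reading `/proc/<pid>/cmdline` works on Linux only.

```python
from pathlib import Path


def read_cmdline(pid: int) -> str:
    """Return the command line of a process as exposed by Linux /proc."""
    raw = Path(f"/proc/{pid}/cmdline").read_bytes()
    # Arguments are NUL-separated; join them with spaces for readability.
    return raw.replace(b"\x00", b" ").decode(errors="replace").strip()


def title_matches_server(cmdline: str, server_name: str) -> bool:
    """Hypothetical check: does the visible process title contain server_name?"""
    # Substring matching is an assumption, not the rule tars node applies.
    return server_name in cmdline
```

For a running worker, `title_matches_server(read_cmdline(pid), "MyServer")` returning False would point at the same kind of naming mismatch described above (`MyServer` is a placeholder).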
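@cnxzcxy I am also seeing the demo service stay alive for only 10s. After 10 seconds the service stops and can only be started manually. I am on php7.0.2. I checked that the service name is correct, and I do not see anything in keepalive that shuts the service down. What could be the cause?
After uploading the release package and clicking publish, the node status changes to "failed" after a while.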
What language are you using?
php, java
What operating system (Linux, Ubuntu, …) and version?
linux
What runtime / compiler are you using (e.g. jdk version or version of gcc)?
gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609
java version "1.8.0_181"
Make sure you include information that can help us debug (full error message, exception listing, stack trace, logs).
tars.tarsAdminRegistry.log
2018-11-07 15:38:35|35060|DEBUG|AdminRegistryImp::addTaskReq taskNo:9384a35f693e448d861457d19c61200f
2018-11-07 15:38:35|35060|DEBUG|ExecuteTask::addTaskReq, taskNo=9384a35f693e448d861457d19c61200f, size=1, serial=1, userName=
2018-11-07 15:38:35|35060|DEBUG|TaskListSerial::execute
2018-11-07 15:38:35|35065|DEBUG|[TARS]accept [192.168.204.128:37066] [19] incomming
2018-11-07 15:38:35|35780|DEBUG|TaskList::executeSingleTask: taskNo=9384a35f693e448d861457d19c61200f,application=tars,serverName=tarsnotify,nodeName=192.168.204.128,setName=,command=patch_tars
2018-11-07 15:38:35|35780|DEBUG|TaskList::patch:[bak_flag]=[0]|[patch_id]=[53]|[update_text]=[]
2018-11-07 15:38:35|35064|DEBUG|AdminRegistryImp::batchPatch tars.tarsnotify_192.168.204.128||53|||tars.tarspatch.PatchObj||
2018-11-07 15:38:35|35057|ERROR|[TARS][QueryEpBase::doEndpoints, callback activeEps is empty,objname:tars.tarspatch.PatchObj]
2018-11-07 15:38:35|35063|DEBUG|ExecuteTask::getTaskRsp, taskNo=9384a35f693e448d861457d19c61200f
2018-11-07 15:38:38|35063|DEBUG|ExecuteTask::getTaskRsp, taskNo=9384a35f693e448d861457d19c61200f
2018-11-07 15:38:38|35057|ERROR|[TARS][ObjectProxy::doTimeout, objname:tars.tarspatch.PatchObj, queue timeout error]
2018-11-07 15:38:38|35780|ERROR|TaskList::patch batchPatch error:[ServantProxy::invoke timeout:3000,servant:tars.tarsAdminRegistry.AdminRegObj,func:batchPatch,adaptertcp -h 192.168.204.128 -p 12000,reqid:26]
2018-11-07 15:38:38|35064|ERROR|[TARS]ServantHandle::handleTarsProtocol server unknown exception: ret:-7 msg:[ServantProxy::invoke errno:-7,info:,servant:tars.tarspatch.PatchObj,func:preparePatchFile,reqid:0]
2018-11-07 15:38:38|35057|ERROR|[TARS][AdapterProxy::finishInvoke(ResponsePacket) objname:tars.tarsAdminRegistry.AdminRegObj,get req-ptr NULL,may be timeout,id:26,desc:tcp -h 192.168.204.128 -p 12000
2018-11-07 15:38:41|35063|DEBUG|ExecuteTask::getTaskRsp, taskNo=9384a35f693e448d861457d19c61200f
2018-11-07 15:38:44|35059|DEBUG|updateRegistryInfo2Db affected:2
2018-11-07 15:38:44|35059|DEBUG|loadIPPhysicalGroupInfo get server group from db, records affected:0
2018-11-07 15:38:44|35059|DEBUG|checkRegistryTimeout (150s) affected:0
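Each entry in the tarsAdminRegistry log appears to follow a `timestamp|id|LEVEL|message` layout, so the ERROR entries can be separated from the DEBUG noise with a few lines of code. The sketch below assumes that layout from the excerpt alone; the meaning of the second field (thread or process id) is a guess, and `LogEntry`, `parse_line` and `errors_only` are illustrative names.

```python
from typing import Iterable, List, NamedTuple, Optional


class LogEntry(NamedTuple):
    timestamp: str
    source_id: str  # second field; assumed to be a thread or process id
    level: str
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Parse one 'timestamp|id|LEVEL|message' entry, or return None."""
    # Split on the first three pipes only: messages may contain '|' themselves.
    parts = line.strip().split("|", 3)
    if len(parts) != 4:
        return None
    return LogEntry(*parts)


def errors_only(lines: Iterable[str]) -> List[LogEntry]:
    """Keep only the entries whose level is ERROR, in their original order."""
    entries = (parse_line(line) for line in lines)
    return [e for e in entries if e is not None and e.level == "ERROR"]
```

Applied to the excerpt above, the earliest ERROR is the `activeEps is empty` entry for tars.tarspatch.PatchObj, and the batchPatch timeout follows 3 seconds later, which reads as though no active endpoint for tarspatch was found.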
Web deployment platform log:
{"level":"info","message":"TaskService.js:32|getTaskRsp: {\"taskItemRsp\":[{\"req\":{\"taskNo\":\"93d129e52bb1403bba8575452f2da36c\",\"itemNo\":\"6cbbbc7e7c9949d593f99327ff005110\",\"application\":\"tars\",\"serverName\":\"tarsnotify\",\"nodeName\":\"192.168.204.128\",\"setName\":\"\",\"command\":\"patch_tars\",\"userName\":\"\",\"parameters\":{\"bak_flag\":\"0\",\"patch_id\":\"53\",\"update_text\":\"\"}},\"startTime\":\"2018-11-07 15:47:47\",\"endTime\":\"\",\"status\":1,\"statusInfo\":\"EM_I_RUNNING\",\"executeLog\":\"\"}],\"taskNo\":\"93d129e52bb1403bba8575452f2da36c\",\"serial\":true,\"userName\":\"\",\"status\":1} ","timestamp":"2018-11-07 15:47:50.868"}
{"level":"info","message":"TaskService.js:32|getTaskRsp: {\"taskItemRsp\":[{\"req\":{\"taskNo\":\"93d129e52bb1403bba8575452f2da36c\",\"itemNo\":\"6cbbbc7e7c9949d593f99327ff005110\",\"application\":\"tars\",\"serverName\":\"tarsnotify\",\"nodeName\":\"192.168.204.128\",\"setName\":\"\",\"command\":\"patch_tars\",\"userName\":\"\",\"parameters\":{\"bak_flag\":\"0\",\"patch_id\":\"53\",\"update_text\":\"\"}},\"startTime\":\"2018-11-07 15:47:47\",\"endTime\":\"2018-11-07 15:47:50\",\"status\":3,\"statusInfo\":\"EM_I_FAILED\",\"executeLog\":\"[ServantProxy::invoke timeout:3000,servant:tars.tarsAdminRegistry.AdminRegObj,func:batchPatch,adaptertcp -h 192.168.204.128 -p 12000,reqid:30]\"}],\"taskNo\":\"93d129e52bb1403bba8575452f2da36c\",\"serial\":true,\"userName\":\"\",\"status\":3} ","timestamp":"2018-11-07 15:47:53.890"}
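The web platform records are JSON objects whose `message` field carries a second JSON document after the `getTaskRsp: ` prefix, which makes them hard to read by eye. The sketch below unwraps one record at a time and returns the status and execute log of each task item. It assumes the record shape seen in this excerpt; `extract_task_items` is an illustrative name, not a function of the TARS web platform.

```python
import json
from typing import List, Tuple


def extract_task_items(record_text: str) -> List[Tuple[str, str]]:
    """Return (statusInfo, executeLog) pairs from one web-platform log record."""
    record = json.loads(record_text)
    # The message looks like "TaskService.js:32|getTaskRsp: {...} ".
    _, _, payload = record["message"].partition("getTaskRsp: ")
    if not payload:
        # Not a getTaskRsp record; nothing to extract.
        return []
    response = json.loads(payload)  # trailing whitespace is tolerated
    return [(item["statusInfo"], item["executeLog"]) for item in response["taskItemRsp"]]
```

For the two records above this should yield an `EM_I_RUNNING` item with an empty log, followed by an `EM_I_FAILED` item carrying the batchPatch timeout text.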