dinosn opened this issue 6 years ago
The issue also results in the following on the panel:
Issue reported at gogs: https://docs.greenitglobe.com/gig/proj_gig_uk/issues/59
@grimpy @dinosn the memory usage before this happens is high compared to the usage after that
rdb --command memory ~/Documents/dump-before.rdb --bytes 1000
rdb --command memory ~/Documents/dump-after.rdb --bytes 1000
This is the size of the eco:objects key:
-rw-r--r--. 1 afouda afouda  17M Dec 5 09:45 before.json
-rw-r--r--. 1 afouda afouda 2.3M Dec 5 09:44 after.json
From what I see, the things that took most of the memory are eco:objects and queue:stats:min.
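To see which key families dominate, the memory report from rdb-tools can be aggregated by key prefix. A minimal sketch, assuming the standard rdb-tools CSV columns (database, type, key, size_in_bytes, ...); the sample report below is illustrative, not taken from the actual dumps:

```python
# Sketch: group the rdb-tools "memory" CSV output by key prefix to see
# which key families use the most space. A real run would read the file
# produced by: rdb --command memory dump.rdb > report.csv
import csv
import io
from collections import defaultdict

def memory_by_prefix(csv_text, sep=":"):
    """Sum size_in_bytes per key prefix (text before the first separator)."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        prefix = row["key"].split(sep, 1)[0]
        totals[prefix] += int(row["size_in_bytes"])
    return dict(totals)

# Tiny inline example (made-up numbers, shaped like a real report):
sample = """database,type,key,size_in_bytes
0,hash,eco:objects,17000000
0,hash,queue:stats:min,4000000
0,string,eco:counter,128
"""
print(memory_by_prefix(sample))
```

Sorting that dict by value points straight at the heaviest key families.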
Also, something I want to know: did Redis crash after the OOM happened? I see fewer keys after the error, and a crash may have caused this.
We have a maxmemory limit of 100 MB; should we increase this limit?
Update: found this in the Redis docs (https://redis.io/topics/admin):

Make sure to set up some swap in your system (we suggest as much swap as memory). If Linux does not have swap and your Redis instance accidentally consumes too much memory, either Redis will crash for out of memory or the Linux kernel OOM killer will kill the Redis process.
The problem is that the ecos have become much larger than before. This should stay as it is, because it makes the ecos a lot better.
Anyway, it seems that this fills up redis, so we need to solve it. AFAIK we put the ecos in redis for deduplication. So instead of putting the ecos completely in redis, we should only put a hash calculated from the parameters we use for deduplication in redis, and store the complete object only in mongodb.
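The hash idea above could look something like this. A sketch only; the fields used for dedup (errormessage, category, funcname) are assumptions, not the actual parameters:

```python
# Sketch of the proposed approach: store only a small hash of the
# dedup-relevant eco fields in redis, keeping the full object in mongodb.
# The field names below are illustrative assumptions.
import hashlib
import json

def dedup_key(eco):
    """Stable redis key derived from the fields used for deduplication."""
    fields = {k: eco.get(k) for k in ("errormessage", "category", "funcname")}
    blob = json.dumps(fields, sort_keys=True).encode()
    return "eco:dedup:" + hashlib.sha1(blob).hexdigest()
```

Two ecos that differ only in fields outside the dedup set map to the same key, so a single small redis entry replaces the full object.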
Found out that we don't need the whole object; we only need specific data, which is very tiny (timestamps), to check against. So I will drop the object and keep only the data required for doing dedup; hashing is not needed.
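A sketch of what that timestamp-only dedup check could look like, assuming redis-py-style get/setex calls; the key pattern and the 300-second window are illustrative assumptions:

```python
# Sketch of the final approach: keep only a tiny per-eco timestamp in
# redis for the dedup check; the full eco object lives in mongodb.
import time

DEDUP_WINDOW = 300  # seconds; assumed value

def should_report(redis_conn, eco_id, now=None):
    """Return True if this eco was not seen within the dedup window."""
    now = now or time.time()
    key = "eco:lastseen:%s" % eco_id
    last = redis_conn.get(key)
    # Refresh the timestamp and let redis expire it automatically.
    redis_conn.setex(key, DEDUP_WINDOW, int(now))
    return last is None or now - float(last) >= DEDUP_WINDOW
```

Each eco then costs a few dozen bytes in redis instead of the full serialized object.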
Ok, great.
Reopening this: we still have warnings popping up. Please change the redis max memory allocation to 1 GB during the installation.
Currently we have to set it manually for all the environments. Issue https://docs.greenitglobe.com/gig/proj_gig_switzerland/issues/37
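For reference, one way to apply the 1 GB limit; the exact redis.conf location differs per environment:

```shell
# Set at runtime on a running instance:
redis-cli CONFIG SET maxmemory 1gb
# Persist the running config back to redis.conf (if redis was started with one):
redis-cli CONFIG REWRITE

# Or set it directly in redis.conf before starting redis:
#   maxmemory 1gb
```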
We have a few cases in all environments where Redis shows an error as described above.
The case appears on the current OVC 2.2.1 and on the previous version 2.2.0.
An error like the one below appears on alerta.
https://alerta.aydo.com/#/alert/7fb54510-e33a-4c97-a34c-1b29efca2889
7fb54510 uk-dc-1: Info - ErrorCondition on 57604fac-22f3-d37f-7f6d-89bd1a7c6b0d could not report job in error to agentcontroller ERROR: Remote Backtrace
Traceback (most recent call last):
  File "/opt/jumpscale7/lib/JumpScale/grid/serverbase/Daemon.py", line 231, in processRPC
    result = ffunction(data)
  File "controller.py", line 661, in notifyWorkCompleted
    self._setJob(job, osis=saveinosis)
  File "controller.py", line 197, in _setJob
    self.redis.hset("jobs:%s" % job["gid"], job["guid"], jobs)
  File "/opt/jumpscale7/lib/redis/client.py", line 1853, in hset
    return self.execute_command('HSET', name, key, value)
  File "/opt/jumpscale7/lib/redis/client.py", line 565, in execute_command
    return self.parse_response(connection, command_name, options)
  File "/opt/jumpscale7/lib/redis/client.py", line 577, in parse_response
    response = connection.read_response()
  File "/opt/jumpscale7/lib/redis/connection.py", line 574, in read_response
    raise response
ResponseError: OOM command not allowed when used memory > 'maxmemory'.
================
Client BackTrace
  File "worker.py", line 276, in <module>
    worker.run()
  File "worker.py", line 205, in run
    self.notifyWorkCompleted(job)
  File "worker.py", line 237, in notifyWorkCompleted
    reportJob()
  File "worker.py", line 226, in reportJob
    acclient.notifyWorkCompleted(job.__dict__)
  File "", line 9, in method
  File "/opt/jumpscale7/lib/JumpScale/grid/serverbase/DaemonClient.py", line 282, in sendcmd
    return self.sendMsgOverCMDChannel(cmd, args, sendformat, returnformat, category=category, transporttimeout=transporttimeout)
  File "/opt/jumpscale7/lib/JumpScale/grid/serverbase/DaemonClient.py", line 196, in sendMsgOverCMDChannel
    raise RemoteException("Cannot execute cmd:%s/%s on server:'%s:%s' error:'%s' ((ECOID:%s))" % (category, cmd, ecodict["gid"], ecodict["nid"], ecodict["errormessage"], ecodict["guid"]), ecodict)
type/level: UNKNOWN/2 ERROR IN RPC CALL notifyWorkCompleted: ResponseError: OOM command not allowed when used memory > 'maxmemory'.
Session: {u'roles': [u'node', u'storagenode', u'storagedriver', u'storagemaster'], u'encrkey': u'', u'nid': 21, u'start': 1511817177, u'netinfo': [{u'ip': [u'127.0.0.1'], u'mac': u'00:00:00:00:00:00', u'cidr': u'8', u'name': u'lo', u'mtu': 65536}, {u'ip': u'10.16.0.63', u'mac': u'a8:1e:84:96:45:5a', u'cidr': u'24', u'name': u'eno1', u'mtu': 1500}, {u'ip': [], u'mac': u'a8:1e:84:96:45:5b', u'cidr': [], u'name': u'eno2', u'mtu': 1500}, {u'ip': [], u'mac': u'4a:2a:41:8c:d1:aa', u'cidr': [], u'name': u'ovs-system', u'mtu': 1500}, {u'ip': u'10.16.1.63', u'mac': u'ec:0d:9a:1c:03:a0', u'cidr': u'24', u'name': u'backplane1', u'mtu': 2000}, {u'ip': [], u'mac': u'ec:0d:9a:1c:03:a0', u'cidr': [], u'name': u'enp4s0', u'mtu': 2000}, {u'ip': [], u'mac': u'ec:0d:9a:1c:03:a1', u'cidr': [], u'name': u'enp4s0d1', u'mtu': 2000}, {u'ip': u'10.16.2.63', u'mac': u'80:22:c0:ff:ee:63', u'cidr': u'24', u'name': u'enp4s0f1', u'mtu': 9000}, {u'ip': [], u'mac': u'2a:a2:64:07:1b:1c', u'cidr': [], u'name': u'enp4s0f1d1', u'mtu': 1500}], u'gid': 888, u'passwd': u'****', u'user': u'', u'organization': u'myorg', u'id': u'888-21-0-72c1db8a-4639-41f0-b78c-a630ad45832c'}
Data: {u'job': {u'timeStop': 1511892964, u'result': u'openvstorage+tcp://10.16.2.63:26203/vm-670/cloud-init-vm-670@94a62f62-fa51-46e1-93df-d36f917b9133', u'errorreport': False, u'guid': u'6be9852e4a374a65bb12c3108b86f569', u'id': 2846, u'category': u'greenitglobe', u'timeStart': 1511892938.855335, u'log': True, u'timeCreate': 1511892938, u'state': u'OK', u'internal': False, u'gid': 888, u'jscriptid': 67, u'parent': None, u'args': {u'userdata': {}, u'type': u'Windows', u'name': u'vm-670', u'metadata': {u'admin-pass': u'6KSycg1gP', u'hostname': u'vm-670'}}, u'nid': 21, u'achost': u'64.253.35.36', u'sessionid': u'888-1-0-15bf7053-2055-44d8-942f-4c811642bb2e', u'wait': True, u'_meta': [u'system', u'job', 1], u'roles': [u'storagedriver'], u'cmd': u'createmetaiso', u'queue': u'', u'timeout': 600, u'resultcode': 0, u'_ckey': u''}}
On the same topic, we are starting to see redis reaching the memory usage limits allocated to it.
Is it possibly time to increase the redis memory allocation values?
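Before raising the limits blindly, it may help to watch how close each instance actually gets. A small sketch using the fields redis reports via INFO; the 0.9 threshold is an assumed value:

```python
# Sketch: warn before redis hits its limit by comparing used_memory to
# maxmemory from the INFO "memory" section.
def memory_usage_ratio(info):
    """info: dict like the one returned by redis-py's Redis.info('memory')."""
    maxmem = info.get("maxmemory", 0)
    if not maxmem:  # maxmemory 0 means "no limit"
        return 0.0
    return info["used_memory"] / float(maxmem)

def near_limit(info, threshold=0.9):
    return memory_usage_ratio(info) >= threshold

# Usage (assumes a running redis and the redis-py package):
#   import redis
#   r = redis.Redis()
#   if near_limit(r.info("memory")):
#       print("redis close to maxmemory; consider raising the limit")
```

Alerting on this ratio would surface the problem before OOM errors start appearing in job reports.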
Issues reported also on gogs at: https://docs.greenitglobe.com/gig/org_support/issues/469