Closed momothefox closed 9 months ago
@momothefox are the monitoring workers running? The ping checks are performed by the server, the agent is not relevant here.
Yes, they are running. The problem is that older devices, which were added earlier and already have graphs, keep getting updated, but newly added devices never get these graphs created. Meanwhile, the OK status is shown all the time, whether the device is reachable or not.
# supervisorctl status
celery RUNNING pid 413835, uptime 5:31:52
celery_firmware_upgrader RUNNING pid 413836, uptime 5:31:52
celery_monitoring RUNNING pid 413837, uptime 5:31:52
celery_network RUNNING pid 413838, uptime 5:31:52
celerybeat RUNNING pid 413839, uptime 5:31:52
daphne:asgi0 RUNNING pid 413952, uptime 5:31:45
daphne:asgi1 RUNNING pid 413841, uptime 5:31:52
daphne:asgi2 RUNNING pid 413842, uptime 5:31:52
daphne:asgi3 RUNNING pid 413843, uptime 5:31:52
daphne:asgi4 RUNNING pid 413844, uptime 5:31:52
daphne:asgi5 RUNNING pid 413845, uptime 5:31:52
openwisp2 RUNNING pid 413846, uptime 5:31:52
Go to the devices which do not have the ping charts and open the "Checks" tab. Look for the ping check; if it's not there, create it.
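With several hundred devices, checking the "Checks" tab one by one is tedious, so here is a minimal sketch of doing the same comparison in bulk. The function name `devices_missing_check` is hypothetical and not part of openwisp-monitoring; the ping class path is the one that appears in the task code quoted further down in this thread. It works on plain `(object_id, check_type)` pairs, which in a Django shell could presumably be obtained with something like `Check.objects.values_list('object_id', 'check_type')` (an assumption, not verified against this installation).

```python
# Class path of the ping check, as it appears in the quoted task code.
PING_PATH = "openwisp_monitoring.check.classes.Ping"


def devices_missing_check(device_ids, check_rows, check_type=PING_PATH):
    """Return the sorted ids of devices that have no check of ``check_type``.

    device_ids: iterable of device primary keys (str or UUID).
    check_rows: iterable of (object_id, check_type) pairs.
    Hypothetical helper, not part of openwisp-monitoring.
    """
    # Collect the ids of all objects that already have the wanted check.
    covered = {
        str(object_id) for object_id, ctype in check_rows if ctype == check_type
    }
    # Anything not covered is a candidate for a manually created check.
    return sorted(str(d) for d in device_ids if str(d) not in covered)
```

An empty result means every device already has a ping check, in which case the missing charts have another cause.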
This is the code which creates the ping checks when new devices are created:
If that code above fails for any reason, the check will not be created.
The checks are there,
although the code is different from the one you referred to.
@shared_task
def auto_create_ping(
    model, app_label, object_id, check_model=None, content_type_model=None
):
    """
    Called by django signal (dispatch_uid: auto_ping)
    registered in check app's apps.py file.
    """
    Check = check_model or get_check_model()
    ping_path = 'openwisp_monitoring.check.classes.Ping'
    has_check = Check.objects.filter(
        object_id=object_id, content_type__model='device', check_type=ping_path
    ).exists()
    # create new check only if necessary
    if has_check:
        return
    content_type_model = content_type_model or ContentType
    ct = content_type_model.objects.get(app_label=app_label, model=model)
    check = Check(
        name='Ping', check_type=ping_path, content_type=ct, object_id=object_id
    )
    check.full_clean()
    check.save()


@shared_task
def auto_create_config_check(
    model, app_label, object_id, check_model=None, content_type_model=None
):
    """
    Called by openwisp_monitoring.check.models.auto_config_check_receiver
    """
    Check = check_model or get_check_model()
    config_check_path = 'openwisp_monitoring.check.classes.ConfigApplied'
    has_check = Check.objects.filter(
        object_id=object_id, content_type__model='device', check_type=config_check_path
    ).exists()
    # create new check only if necessary
    if has_check:
        return
    content_type_model = content_type_model or ContentType
    ct = content_type_model.objects.get(app_label=app_label, model=model)
    check = Check(
        name='Configuration Applied',
        check_type=config_check_path,
        content_type=ct,
        object_id=object_id,
    )
    check.full_clean()
    check.save()
I am using the Ansible role installation for production.
The difference in code is probably due to a version.
Check the monitoring log in /opt/openwisp2 and ensure it's doing something.
Is the ping not working for all the devices or only some of them?
The celery-monitoring.log shows all the time that everything is fine.
INFO/MainProcess] Task openwisp_monitoring.check.tasks.perform_check[xxxxx] received
INFO/ForkPoolWorker-2] Task openwisp_monitoring.check.tasks.perform_check[xxxxx] succeeded in 0.024974617990665138s: None
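To go beyond eyeballing the log, the lines above can be summarized with a small script. This is only a sketch: the function name `summarize_perform_check` is hypothetical, and the regular expression assumes the log lines have exactly the shape shown above (`Task <name>[<id>] received` and `Task <name>[<id>] succeeded in <seconds>s`). It counts how many `perform_check` tasks were received and how many succeeded, and reports the median duration of the successful ones.

```python
import re
from statistics import median

# Matches the two line shapes shown in the log excerpt above.
LINE_RE = re.compile(
    r"Task (?P<name>[\w.]+)\[(?P<task_id>[^\]]+)\] "
    r"(?:(?P<received>received)|succeeded in (?P<seconds>[0-9.]+)s)"
)

PERFORM_CHECK = "openwisp_monitoring.check.tasks.perform_check"


def summarize_perform_check(lines, task_name=PERFORM_CHECK):
    """Count received/succeeded events and the median duration for one task.

    lines: any iterable of log lines, for example an open file object.
    Hypothetical helper, not part of openwisp-monitoring.
    """
    received = 0
    durations = []
    for line in lines:
        match = LINE_RE.search(line)
        if not match or match.group("name") != task_name:
            continue  # unrelated line or a different task
        if match.group("received"):
            received += 1
        else:
            durations.append(float(match.group("seconds")))
    return {
        "received": received,
        "succeeded": len(durations),
        "median_seconds": median(durations) if durations else None,
    }
```

It can be fed an open file, e.g. `summarize_perform_check(open(path))`, where `path` points at the celery-monitoring.log of the installation. A large gap between received and succeeded tasks would suggest that some checks never complete.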
Some devices are working as expected, others are not.
@nemesifier Should I suspect hardware performance?
@momothefox I think more of some DB inconsistency. You could try deleting and recreating one of these devices which aren't pinged to see if anything changes.
You could try deleting and recreating one of these devices which aren't pinged to see if anything changes.
I already did. No matter how many times I delete the device, it never gets these graphs, while other devices keep drawing them. Are there any limits? I have 500+ devices; 360 of them are monitored, while the rest have no monitoring agent.
There aren't any limits at application level. I am not sure what is going on in your case.
Nothing like this?
https://github.com/openwisp/ansible-openwisp2/issues/431#issuecomment-1504078406
Also, I am using sqlite3, not PostgreSQL or MySQL. Is that related to monitoring in any way?
Also, the Health Status remains OK whatever the condition of the device is.
I am sorry, I do not know what is wrong with your system, and I have no way to verify or replicate this problem. If you think it's a bug, please provide instructions on how to replicate it, even tentative ones. We use GitHub issues for bug tracking only, not for support requests. Please use the support chat if you have further questions.
I am using the latest stable release. On a fresh installation the charts were working. After a while they stopped working.
After a migration of the server I had to reinstall and back up the database. After a while they stopped working again.
Please help investigate this if you can:
https://github.com/openwisp/openwisp-monitoring/assets/25464943/55604df5-2a42-40e6-8d36-3731e8b576bd