iclab / centinel

http://iclab.org/
MIT License

ERROR: Scheduler sync failed: No JSON object could be decoded #251

Closed sneft closed 8 years ago

sneft commented 8 years ago

We have an installation where the scheduler.info file repeatedly ends up as a zero-byte file. When that happens, Centinel fails during sync with the 'No JSON object could be decoded' message.

Deleting the zero-byte scheduler.info file and running 'centinel --sync' solves this problem, since it will fetch a new valid scheduler.info file.

The first question is obviously why scheduler.info periodically turns into a zero-byte file. The second question is whether centinel should fail this way: if scheduler.info is invalid, shouldn't it fetch a valid copy from the server instead?
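To illustrate the second question, here is a minimal sketch of a tolerant loader that treats a missing, empty, or unparseable scheduler.info as "no local copy" rather than raising. The function name `load_scheduler` and the convention of returning `None` to signal that a fresh copy should be fetched are assumptions made for this example, not Centinel's actual code.

```python
import json
import os


def load_scheduler(path):
    """Return the parsed scheduler, or None if the file is unusable.

    Hypothetical helper: a None result tells the caller to fetch a
    fresh scheduler from the server instead of failing the sync.
    """
    try:
        # A zero-byte file is the failure mode reported in this issue.
        if os.path.getsize(path) == 0:
            return None
        with open(path) as file_p:
            return json.load(file_p)
    except (OSError, IOError, ValueError):
        # OSError/IOError: file missing or unreadable.
        # ValueError: invalid JSON (json.JSONDecodeError subclasses it
        # on Python 3; Python 2.7 raises ValueError directly).
        return None
```

With something like this in place, the sync code could check for `None` and download a new scheduler, which is what deleting the file and running 'centinel --sync' achieves by hand.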

A snippet of the error output, which repeats regularly:

2016-06-01 08:17:05,544 backend.py(line 283) INFO: Starting sync with https://server.iclab.org:8082
2016-06-01 08:17:10,093 backend.py(line 356) ERROR: Scheduler sync failed: No JSON object could be decoded
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/centinel/backend.py", line 354, in sync
    user.sync_scheduler()
  File "/usr/local/lib/python2.7/dist-packages/centinel/backend.py", line 139, in sync_scheduler
    client_sched = json.load(file_p)
  File "/usr/lib/python2.7/json/__init__.py", line 290, in load
    **kw)
  File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.7/json/decoder.py", line 384, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
2016-06-01 08:17:12,078 backend.py(line 394) INFO: Finished sync with https://server.iclab.org:8082
2016-06-01 08:17:16,016 client.py(line 145) INFO: Centinel started.
2016-06-01 08:17:16,022 client.py(line 165) ERROR: Failed to load the scheduler: No JSON object could be decoded
2016-06-01 08:17:19,837 backend.py(line 283) INFO: Starting sync with https://server.iclab.org:8082
2016-06-01 08:17:23,773 backend.py(line 356) ERROR: Scheduler sync failed: No JSON object could be decoded
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/centinel/backend.py", line 354, in sync
    user.sync_scheduler()
  File "/usr/local/lib/python2.7/dist-packages/centinel/backend.py", line 139, in sync_scheduler
    client_sched = json.load(file_p)
  File "/usr/lib/python2.7/json/__init__.py", line 290, in load
    **kw)
  File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.7/json/decoder.py", line 384, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
2016-06-01 08:17:26,798 backend.py(line 394) INFO: Finished sync with https://server.iclab.org:8082

rpanah commented 8 years ago

Thanks for reporting this. I'll fix it.

sneft commented 8 years ago

In addition to the scheduler.info file turning into a zero-byte file, these installations are also finding that any local edits to the file (say, changing the testing frequency) are reverted to the default settings. Typically the user notices this within a day of making the change.

rpanah commented 8 years ago

@sneft I think this needs to be a separate issue. To support custom schedules that are not managed by the server, we need a user-based local scheduler that is never overwritten, in addition to the one updated by the server.

rpanah commented 8 years ago

@sneft it seems like Centinel is being run concurrently with itself, and sometimes different instances collide while rewriting the scheduler file. I'll try to address this by avoiding unnecessary writes to the scheduler file and by preventing multiple instances of Centinel from running at the same time.
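As a rough sketch of the "avoid unnecessary writes" part, the scheduler could be written only when its content changes, and written through a temporary file that is renamed into place so a reader never sees a half-written or empty file. The helper name `write_scheduler` and the temp-file prefix are assumptions for illustration; this is not the fix that was actually committed.

```python
import json
import os
import tempfile

# os.replace exists on Python 3.3+; os.rename is the POSIX-atomic
# fallback on Python 2.7.
_replace = getattr(os, "replace", os.rename)


def write_scheduler(path, sched):
    """Atomically write sched as JSON; return False if nothing changed."""
    new_text = json.dumps(sched, indent=2, sort_keys=True)

    # Skip the write entirely when the file already has this content.
    try:
        with open(path) as file_p:
            if file_p.read() == new_text:
                return False
    except (OSError, IOError):
        pass  # file missing or unreadable: fall through and write it

    # Write to a temp file in the same directory, then rename over the
    # target, so the target is always either the old or the new version.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".scheduler-")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(new_text)
            tmp.flush()
            os.fsync(tmp.fileno())
        _replace(tmp_path, path)
    except Exception:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return True
```

Opening the target directly in write mode truncates it before the new content arrives, which is one plausible way for a concurrent reader or an interrupted writer to leave a zero-byte file behind. The rename approach narrows that window but does not replace a proper single-instance guard.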

rpanah commented 8 years ago

@sneft the logging system also can't handle multiple instances sharing the same log file.
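Both the scheduler collisions and the shared log file point toward a single-instance guard. One common way to do this on Linux and other POSIX systems is an advisory lock on a lock file, sketched below with `fcntl.flock`. The function name `acquire_instance_lock` and the lock file location are hypothetical, and this is not necessarily how Centinel ended up implementing it.

```python
import errno
import fcntl


def acquire_instance_lock(lock_path):
    """Return an open lock file, or None if another instance holds it.

    The caller must keep the returned file object open for the life of
    the process; the lock is released when the file is closed or the
    process exits.
    """
    lock_file = open(lock_path, "a")
    try:
        # LOCK_NB makes flock fail immediately instead of blocking.
        fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except (OSError, IOError) as exc:
        lock_file.close()
        if exc.errno in (errno.EACCES, errno.EAGAIN):
            return None  # lock is held by another instance
        raise
    return lock_file
```

A client could call this once at startup, for example with a lock file kept next to scheduler.info, and exit early when it returns `None`. Because the kernel drops the lock when the process dies, a crashed instance does not leave a stale lock behind.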