Open dccole117 opened 4 years ago
Anything in the logs? Try turning on debug output (artiq_master -v).
And if that doesn't turn up anything, add debug print() calls to the code that writes to that file.
We haven't been experiencing this problem recently, but if/when it comes back I'll try to do some more diagnostics and update.
On Fri, Dec 20, 2019 at 6:57 PM Sébastien Bourdeauducq < notifications@github.com> wrote:
And if that doesn't turn up anything, add debug print() calls to the code that write to that file.
Thanks for your suggestions. Here's a bit more info; can you clarify if I've misunderstood something? We aren't seeing this problem now but it's probably too optimistic to assume it won't come back.
Logs: I'm running experiments from the dashboard, and I have the log open with the 'Minimum level: DEBUG' setting. Seems that the output there should be the same as what you're suggesting. When I wrote in my initial post "At other times it doesn't coincide with an error at all," what I meant is that I see dataset_db.pyon disconnect (i.e. it stops updating continuously), but that the logs don't give any hint of anything out of the ordinary.
Debug print() calls: We aren't explicitly writing to dataset_db.pyon. I'm inferring that artiq is frequently updating this file to store the current values of dataset entries with persist = True. Is this not the case? I'm basing this guess on the fact that, until dataset_db.pyon 'disconnects,' I see that its 'date modified' time always matches the current computer time, even without me interacting with artiq/changing a dataset entry.
We are managing the dataset entries with the set_dataset method of the HasEnvironment class. I don't see where I could usefully add print statements to the code we've written - the problem we have is not that set_dataset doesn't update the dataset values that I see in the dashboard (it does this as expected). Instead, the problem is that after dataset_db.pyon disconnects, subsequent changes to dataset entries with persist = True don't persist after closing and re-opening artiq. Is there somewhere else you suggest I try adding some debug print calls?
We aren't explicitly writing to dataset_db.pyon. I'm inferring that artiq is frequently updating this file to store the current values of dataset entries with persist = True. Is this not the case?
This is the case; see artiq.master.databases.DatasetDB._do.
I'm basing this guess on the fact that, until dataset_db.pyon 'disconnects,' I see that its 'date modified' time always matches the current computer time, even without me interacting with artiq/changing a dataset entry.
Based on this symptom it sounds likely that the above-mentioned asyncio task raises an exception and then terminates, stopping the updates. With a recent ARTIQ version, https://github.com/m-labs/artiq/pull/1359 (now part of sipyco) should give you more information.
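To illustrate the failure mode described above, here is a minimal, self-contained sketch using only the standard library. It is not ARTIQ's or sipyco's code: periodic_writer, log_task_failure and the state dictionary are hypothetical names made up for the illustration. It shows a periodic asyncio task that stops for good after one exception while the rest of the program keeps running, and a done-callback that at least logs the exception when that happens.

```python
import asyncio
import logging

logger = logging.getLogger("periodic_writer_sketch")


async def periodic_writer(state, period=0.01):
    # Hypothetical stand-in for a background task that periodically saves data.
    while True:
        await asyncio.sleep(period)
        if state["fail_next"]:
            # Simulated fault, e.g. a serialization or I/O error.
            raise RuntimeError("simulated write failure")
        state["writes"] += 1


def log_task_failure(task):
    # Done-callback: report an exception that would otherwise go unnoticed.
    if task.cancelled():
        return
    exc = task.exception()
    if exc is not None:
        logger.error("background task terminated: %r", exc)


async def demo():
    state = {"writes": 0, "fail_next": False}
    task = asyncio.ensure_future(periodic_writer(state))
    task.add_done_callback(log_task_failure)

    await asyncio.sleep(0.05)   # writes proceed normally
    state["fail_next"] = True   # inject a single fault
    await asyncio.sleep(0.05)   # the task raises and ends; nothing else crashes
    writes_at_failure = state["writes"]

    state["fail_next"] = False  # the fault is gone, but so is the task
    await asyncio.sleep(0.05)
    return task, writes_at_failure, state["writes"]


def run_demo():
    return asyncio.run(demo())
```

Calling run_demo() returns the finished task together with the write count at the time of the failure and the count a little later; the two counts are equal because nothing restarts the task. Without the callback, asyncio typically reports such an exception only when the task object is garbage-collected, which would be consistent with logs that show nothing unusual.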
Bug Report
One-Line Summary
dataset_db.pyon intermittently stops syncing; dataset entries revert to old values after restarting artiq.
Issue Details
The dataset_db.pyon file usually syncs regularly with the datasets shown in the artiq dashboard: its 'date modified' time matches the current computer time, which we take to mean that dataset entries with persist = True have been saved. Sometimes, however, dataset_db.pyon stops syncing, and when this occurs changes to the dataset no longer persist after closing and re-opening artiq. Restarting artiq gets dataset_db.pyon syncing regularly again.
I haven't found a consistent way to trigger this problem, but it seems to have been triggered once by trying to run an experiment (via keyboard shortcut) that hadn't yet been scanned during a repository scan, and once it coincided with a fitting error. At other times it doesn't coincide with an error at all.
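Since the only visible symptom is the 'date modified' time falling behind, one way to catch the moment the file stops syncing is to poll its modification time from a separate script. The following is a rough sketch using only the standard library; seconds_since_modified, is_stale and the 120-second default threshold are assumptions made for illustration and are not part of ARTIQ.

```python
import os
import time


def seconds_since_modified(path, now=None):
    # Age of the file's last modification, in seconds.
    if now is None:
        now = time.time()
    return now - os.path.getmtime(path)


def is_stale(path, max_age_s=120.0, now=None):
    # True if the file has not been rewritten within max_age_s seconds.
    # max_age_s is an assumed threshold; choose it from the update
    # interval actually observed on your system.
    return seconds_since_modified(path, now) > max_age_s
```

For example, calling is_stale("dataset_db.pyon") once a minute from the master's working directory and recording the time it first returns True would narrow down when the syncing stopped, which can then be compared against the log timestamps.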
Expected Behavior
Changes to the dataset for entries with persist = True should persist after closing and re-opening artiq. This happens most of the time.
Actual (undesired) Behavior
Changes to the dataset are sometimes lost after closing and re-opening artiq. This seems related to dataset_db.pyon not syncing.
Your System