m-labs / artiq

A leading-edge control system for quantum information experiments
https://m-labs.hk/artiq
GNU Lesser General Public License v3.0

'Persist' dataset entries don't save/dataset_db.pyon stops syncing #1412

Open dccole117 opened 4 years ago

dccole117 commented 4 years ago

Bug Report

One-Line Summary

dataset_db.pyon intermittently stops syncing; dataset entries revert to old values after restarting artiq.

Issue Details

The dataset_db.pyon file usually syncs regularly with the datasets shown in the artiq dashboard (its 'date modified' time matches the current computer time), which I take to mean that dataset entries with persist = True have been saved. Sometimes, however, the file stops syncing, and once that happens, changes to the datasets no longer persist after closing and re-opening artiq. Restarting artiq gets dataset_db.pyon syncing regularly again.

I haven't found a consistent way to trigger this problem, but it seems to have been triggered once by trying to run an experiment (via keyboard shortcut) that hadn't yet been scanned during a repository scan, and once it coincided with a fitting error. At other times it doesn't coincide with an error at all.

Expected Behavior

Changes to dataset entries with persist = True should persist after closing and re-opening artiq. This happens most of the time.

Actual (undesired) Behavior

Changes to the dataset are sometimes lost after closing and re-opening artiq. This seems related to dataset_db.pyon not syncing.

Your System

sbourdeauducq commented 4 years ago

Anything in the logs? Try turning on debug output (artiq_master -v)

sbourdeauducq commented 4 years ago

And if that doesn't turn up anything, add debug print() calls to the code that writes to that file.

dccole117 commented 4 years ago

We haven't been experiencing this problem recently, but if/when it comes back I'll try to do some more diagnostics and update.

dccole117 commented 4 years ago

Thanks for your suggestions. Here's a bit more info; can you clarify if I've misunderstood something? We aren't seeing this problem now but it's probably too optimistic to assume it won't come back.

Logs: I'm running experiments from the dashboard, and I have the log open with the 'Minimum level: DEBUG' setting. Seems that the output there should be the same as what you're suggesting. When I wrote in my initial post "At other times it doesn't coincide with an error at all," what I meant is that I see dataset_db.pyon disconnect (i.e. it stops updating continuously), but that the logs don't give any hint of anything out of the ordinary.

Debug print() calls: We aren't explicitly writing to dataset_db.pyon. I'm inferring that artiq is frequently updating this file to store the current values of dataset entries with persist = True. Is this not the case? I'm basing this guess on the fact that, until dataset_db.pyon 'disconnects,' I see that its 'date modified' time always matches the current computer time, even without me interacting with artiq/changing a dataset entry.

We are managing the dataset entries with the set_dataset method of the HasEnvironment class. I don't see where I could usefully add print statements to the code we've written - the problem we have is not that set_dataset doesn't update the dataset values that I see in the dashboard (it does this as expected). Instead, the problem is that after dataset_db.pyon disconnects, subsequent changes to dataset entries with persist = True don't persist after closing and re-opening artiq. Is there somewhere else you suggest I try adding some debug print calls?
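
One way to catch the 'disconnect' when it happens, rather than after a restart, is to watch the file's modification time from outside artiq. The following is a minimal sketch, not part of artiq: the helper names and the 120-second threshold are assumptions for illustration, and the threshold should be set well above whatever interval the file is normally rewritten at on your system.

```python
import os
import time


def seconds_since_modified(path, now=None):
    """Return how many seconds ago `path` was last modified."""
    if now is None:
        now = time.time()
    return now - os.path.getmtime(path)


def is_stale(path, max_age_s=120.0, now=None):
    """Return True if `path` has not been rewritten within `max_age_s` seconds.

    `max_age_s` is a hypothetical threshold; pick one comfortably larger
    than the interval at which the file is normally updated.
    """
    return seconds_since_modified(path, now) > max_age_s


if __name__ == "__main__":
    # Path is an assumption: point this at the file the master is using.
    db_path = "dataset_db.pyon"
    if is_stale(db_path):
        print("dataset_db.pyon looks stale; check the master's output")
```

Run periodically (for example from a scheduled job), this would give a timestamp for when syncing stopped, which can then be matched against the master's log output.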

sbourdeauducq commented 4 years ago

> We aren't explicitly writing to dataset_db.pyon. I'm inferring that artiq is frequently updating this file to store the current values of dataset entries with persist = True. Is this not the case?

This is the case, see artiq.master.databases.DatasetDB._do.

> I'm basing this guess on the fact that, until dataset_db.pyon 'disconnects,' I see that its 'date modified' time always matches the current computer time, even without me interacting with artiq/changing a dataset entry.

Based on this symptom it sounds likely that the above-mentioned asyncio task raises an exception and then terminates, stopping the updates. With a recent ARTIQ version, https://github.com/m-labs/artiq/pull/1359 (now part of sipyco) should give you more information.
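
To illustrate the failure mode described above, here is a generic asyncio sketch. It is not the ARTIQ or sipyco implementation; `periodic_save`, `log_task_failure`, and `flaky_save` are hypothetical names. It shows a periodic task that stops for good the first time its body raises, and a done-callback that at least logs why it stopped.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


async def periodic_save(save, period_s):
    """Call `save()` repeatedly, sleeping `period_s` seconds before each call."""
    while True:
        await asyncio.sleep(period_s)
        save()  # an uncaught exception here ends the loop permanently


def log_task_failure(task):
    """Done-callback: log the exception that terminated the task, if any."""
    if task.cancelled():
        return
    exc = task.exception()
    if exc is not None:
        logger.error("periodic task terminated", exc_info=exc)


async def demo():
    calls = []

    def flaky_save():
        # Stand-in for a file write that fails on the third attempt.
        calls.append(len(calls))
        if len(calls) == 3:
            raise OSError("simulated write failure")

    task = asyncio.create_task(periodic_save(flaky_save, 0.01))
    task.add_done_callback(log_task_failure)
    await asyncio.sleep(0.2)  # long enough for the failure to occur
    return len(calls), task.done()


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    print(asyncio.run(demo()))
```

Without the callback, the rest of the program keeps running and nothing is reported while the loop is alive, which matches the symptom of a file that silently stops updating.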