microsoft / Qcodes

Modular data acquisition framework
http://microsoft.github.io/Qcodes/
MIT License
322 stars 309 forks source link

Accessing SQLite Databases Concurrently #2772

Open MichalKononenko opened 3 years ago

MichalKononenko commented 3 years ago

If you encounter a bug use the following template. If you have a feature request feel free to freestyle.

Steps to reproduce

  1. Put the sqlite database on a shared drive
  2. Run a measurement on one computer
  3. Load a dataset on another computer from the shared SQLite database.

Expected behaviour

I expect to be able to load datasets, or inspect them with plottr-inspectr as they are coming in.

Actual behaviour

Database is locked. Unable to open the DB.

System

operating system

Windows 10

qcodes branch

unknown, using version 0.22

qcodes commit

unknown

astafan8 commented 3 years ago

@MichalKononenko what is your use case exactly? (i think i can guess, but i'd rather have you describe it) what is that you're trying to achieve with having the DB file on the network?


just for information, sqlite db files are not meant to be accessed over the network, see here under "Checklist For Choosing The Right Database Engine" section and here.

the fact that it sometimes works (i had users reporting that) is pure luck and is risky. the fact that the database appears locked - could be various reasons thanks to the additional complexity of sqlite over network.


assuming you want to have your measurement data backed up to a network drive for safety, and then you want to inspect it there over the network, at the moment i can suggest the following (the may not sound ideal but they will keep your data safe):

Hope this helps, at least for now.

MichalKononenko commented 3 years ago

@astafan8. Thank you very much for getting back to me. We don't want to use the database for safety, but for access as well. Is there anything in the pipeline for multiple users to view the same experiment data concurrently? We have acquisitions that take several hours to complete, and we want to analyze data while taking the acquisitions.

MichalKononenko commented 3 years ago

@astafan8 , do you think pointing qcodes to a ValentinaDB server may be good for my use case? Have other users tried this?

astafan8 commented 3 years ago

We have acquisitions that take several hours to complete, and we want to analyze data while taking the acquisitions.

as a workaround for now, you could split the data into "units of analysis" and make this unit be a single qcodes dataset, so that once a signle dataset is acquired, it can be immediately read from the DB file itself on the same machine (but in a different process) by the analysis code. sqlite does support multiple reads but a single write. another alternative is to use qcodes.DataSet.to_* methods to store that "unit of analysis" immediately into an xarray netcdf file or pandas something, so that agian the analysis code can read that as soon as the file shows up on disk (or via an explcit trigger from the measurement process) and start processing.

Is there anything in the pipeline for multiple users to view the same experiment data concurrently?

oh yes :) but it will take a bit of time, so can't share any ETA. What will happen is that qcodes DataSet will support more than one storage backend for the raw data (arrays, numerbs,. etc). One of those backends could be of such format that conclurrent reads/writes are possible, say hdf5

do you think pointing qcodes to a ValentinaDB server may be good for my use case? Have other users tried this?

haven't heard of it. but it will be possible once the work mentioned above lands :)

MichalKononenko commented 3 years ago

Hello @astafan8 . Thanks again for the followup. We might have a solution with storing the database locally. Have you considered using sqlalchemy for describing the database? That way, we could point qcodes to any relational DB.

Also, Valentina doesn't do what I need. It's not a drop-in replacement for SQLite.

astafan8 commented 3 years ago

Have you considered using sqlalchemy for describing the database?

nope, that wasn't on the radar really. perhaps this is something that you are willing to try and submit a PR for? i'm not familiar with the powers of sqlalchemy, although heard about it. qcodes' database structure to some extent is not difficult - the main useful tables are "runs" and "experiments" (although "runs" table may potentially contain columns that are not specified in the main schema), and all the other tables are the "results" tables with the raw measurement data that have their own stable schema once the information about parameters is known.

MichalKononenko commented 3 years ago

Hello @astafan8 , thank you for answering this PR again. I just finished my degree, and I'll take this on. I'll fork this repo. Let me know if storage comes up on the radar. It would be my pleasure to be on the team that puts a storage solution together. I'll stay in touch through the Slack.