czbiohub-sf / Rapid-QC-MS

Realtime quality control for mass spectrometry data acquisition
https://czbiohub-sf.github.io/Rapid-QC-MS

Google Drive sync during active run #45

Closed wasimsandhu closed 1 year ago

wasimsandhu commented 2 years ago

The strategy for this seems counterintuitive, but is hopefully clever:

  1. Data file runs through QC pipeline
  2. QC results are written to a CSV file
  3. CSV file is uploaded to Google Drive
  4. MS-AutoQC dashboard is refreshed every X mins (because there is an active run)
  5. CSV file is downloaded from Google Drive
  6. Results are loaded from the CSV file instead of the database
  7. At the end of the run, database is synced and CSV files are removed
  8. Run is now marked as completed, so selecting the run will load data from database and not Google Drive

This way, an entire database isn't synced every single time a sample has been completed.
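
A minimal sketch of steps 5 to 8, to show how loading could branch on run status. The function name, the "active" status string, and the `qc_results` table with its columns are all hypothetical and not taken from the MS-AutoQC codebase:

```python
import csv
import sqlite3


def load_qc_results(run_status, run_id, csv_path, db_path):
    """Return QC result rows for a run as a list of dicts.

    Hypothetical helper: an active run reads the CSV file downloaded
    from Google Drive; any other status reads the local database.
    """
    if run_status == "active":
        # Active run: results come from the lightweight CSV file
        with open(csv_path, newline="") as f:
            return list(csv.DictReader(f))

    # Completed run: results come from the synced database
    conn = sqlite3.connect(db_path)
    conn.row_factory = sqlite3.Row
    try:
        rows = conn.execute(
            "SELECT sample_id, qc_result FROM qc_results WHERE run_id = ?",
            (run_id,),
        ).fetchall()
    finally:
        conn.close()
    return [dict(row) for row in rows]
```

Only the small CSV file moves through Google Drive while the run is active; the database is touched once, at the end.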

wasimsandhu commented 1 year ago

  1. Progress updates during active instrument run aa5615c412fe2af7ef4c074c79fd3b300561a1ae
  2. Loading data based on run status d45ee35b4c5f65b043cfb6a648c933a47f1e8347
  3. Differentiate between instrument PC and external PC for Google Drive sync 115fbcd1e2ac05b16d53d687029cc1437e448aee

wasimsandhu commented 1 year ago

Need to take into account one more thing: multiple instrument computers means multiple workspaces uploading their database file to Google Drive.

To prevent a database from being overwritten (and data from being lost), I am planning the following strategy:

[Diagram: MS-AutoQC Google Drive Sync]

Instrument computers sync the following files to Google Drive:

  1. Database and methods folder following changes to settings
  2. Database when an active run is started
  3. QC results as CSV files during active instrument run
  4. Database after a completed MS-AutoQC job

In particular, the database is modified only on the following occasions:

  1. Changes to settings
  2. New job started
  3. New job completed

Each time the database is modified, MS-AutoQC must record the modifiedDate of the database file in Google Drive.

Then, MS-AutoQC can use this value to check if the database needs to be synced before a change is made to the database.
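
A rough sketch of that check, assuming the recorded and remote modifiedDate values are RFC 3339 UTC strings with fractional seconds (for example 2023-01-15T10:30:00.000Z). The function names are hypothetical, and fetching the remote value from Google Drive is left out:

```python
from datetime import datetime, timezone


def parse_drive_timestamp(value):
    """Parse a UTC timestamp such as '2023-01-15T10:30:00.000Z' (assumed format)."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def needs_sync(recorded_modified_date, remote_modified_date):
    """Return True if the Drive copy changed since this workspace last recorded it."""
    if recorded_modified_date is None:
        # Nothing recorded yet, so download before modifying anything
        return True
    remote = parse_drive_timestamp(remote_modified_date)
    recorded = parse_drive_timestamp(recorded_modified_date)
    return remote > recorded
```

If `needs_sync` returns True, another workspace has uploaded a newer database, so the local copy should be refreshed before the change is written.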

wasimsandhu commented 1 year ago

Need to recognize the identity of the instrument on first-time workspace setup and on workspace login.

  1. Save identity
  2. If an instrument run has started, upload data to Google Drive from the correct instrument PC
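
One possible shape for this, as a sketch only: the identity is stored in a small JSON file, and uploads are allowed only when this machine is the instrument PC for the run. The file layout, key names, and function names are assumptions, not the actual implementation:

```python
import json


def save_identity(path, instrument_id, is_instrument_pc):
    """Persist which instrument this workspace belongs to (hypothetical file format)."""
    identity = {"instrument_id": instrument_id, "is_instrument_pc": is_instrument_pc}
    with open(path, "w") as f:
        json.dump(identity, f)


def should_upload(path, run_instrument_id):
    """Return True only on the instrument PC that is acquiring this run."""
    with open(path) as f:
        identity = json.load(f)
    return identity["is_instrument_pc"] and identity["instrument_id"] == run_instrument_id
```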

wasimsandhu commented 1 year ago

Testing and revising this weekend.

wasimsandhu commented 1 year ago

Each instrument should have its own database to store QC data. This was an unfortunate oversight in the system design, but not a huge setback to fix.
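
A small sketch of what one database per instrument could look like, mapping an instrument ID to its own database file. The `data` directory, the `.db` extension, and the function name are placeholders:

```python
import re
from pathlib import Path


def database_path_for(instrument_id, data_dir="data"):
    """Return a per-instrument database path (hypothetical naming scheme)."""
    # Replace runs of characters that are awkward in filenames with underscores
    safe_name = re.sub(r"[^A-Za-z0-9_-]+", "_", instrument_id).strip("_")
    if not safe_name:
        raise ValueError("instrument_id must contain at least one usable character")
    return Path(data_dir) / f"{safe_name}.db"
```

Since each instrument PC would then upload only its own file, two workspaces never overwrite the same database in Google Drive.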