broadinstitute / single_cell_portal_core

Rails/Docker application for the Broad Institute's single cell RNA-seq data portal
https://singlecell.broadinstitute.org
BSD 3-Clause "New" or "Revised" License
62 stars 26 forks source link

Fixing issues with reporting ingestSummary events (SCP-5802) #2144

Closed bistline closed 2 weeks ago

bistline commented 3 weeks ago

BACKGROUND & CHANGES

This update addresses some issues surrounding reporting the ingestSummary event for AnnData files. These events are a rollup of the 3 main ingest processes for AnnData files - clustering, metadata, and processed expression. Recent updates to how related jobs were identified proved to be prone to race conditions that meant jobs checking simultaneously would identify each other as "still processing" and both exit, leading to no summary being sent. Now, the query specifically looks for jobs of the 3 main types that are not "done" according to their status in the LifeSciences API. Furthermore, this check can happen after any job that is submitted for this file, be it main ingest, differential expression, subsampling, etc. Once it is determined that there are no more jobs running that are extracting data, a summary is sent, and then that file is marked as having submitted an ingestSummary. No further summaries will be sent for that file.

Additional clusterings or raw counts data can theoretically still be extracted, but neither will result in submitting an ingestSummary - only the standard ingest event will be reported. Since the point of the ingestSummary is to gauge the amount of time it takes a user to go from initial upload to an "initialized" study and whether or not this was successful, this is acceptable.

This also addresses an unrelated bug with deleting AnnData files where DE output files could not be clean up due to missing annotation data (was being used only for logging).

MANUAL TESTING

This is complicated to test, especially since in normal development the race condition can never happen as it requires multiple concurrent job workers processing jobs for the same AnnData file. However, you can start DelayedJob locally with multiple workers using the following command (once your environment is initialized):

bin/delayed_job start development --pool=default:6, --pool=cache:2

Note that once you are done testing, you will need to manually stop the workers as they are launched headlessly. This can be done with bin/delayed_job stop development.

Once your workers are running, follow these steps:

  1. Boot as normal, sign in, and create a new study
  2. Select the AnnData UX and fill out the forms as normal (supplying multiple clusterings and raw counts helps, if possible)
  3. Once the file is uploaded, look in the logs for the extraction job and copy both the filename and MongoDB ID
  4. In a separate terminal, go to the project root and tail your development.log file to look for messages regarding the AnnData summary (your filename/id will be different but the wording will be exact). It will take several minutes to appear, but while you may see multiple "Checking" messages you should only see one message that says "Sending AnnData summary" and no further messages after that:
    $ LOG_MESSAGE="AnnData summary for compliant_pbmc3K.h5ad (66f5b7b38bdcc6e804f001cb)"
    $ tail -f log/development.log | grep -e $LOG_MESSAGE
    ...
    Checking AnnData summary for compliant_pbmc3K.h5ad (66f5b7b38bdcc6e804f001cb) after ingest_expression
    Sending AnnData summary for compliant_pbmc3K.h5ad (66f5b7b38bdcc6e804f001cb) after ingest_expression
  5. Go to the Dev Mixpanel events list and query for your event (note it may take a minute or so for it to appear):
    • event: ingestSummary
    • studyAccession: <your accession>
    • fileName: <your file>

Screenshot 2024-09-26 at 4 49 46 PM

  1. Your event should have the correct number for numFilesExtracted: 1 each for expression/metadata, 1 if you extracted raw counts, and 1 per clustering.
codecov[bot] commented 3 weeks ago

Codecov Report

Attention: Patch coverage is 92.30769% with 2 lines in your changes missing coverage. Please review.

Project coverage is 69.74%. Comparing base (0eaa7e3) to head (631f415). Report is 15 commits behind head on development.

Files with missing lines Patch % Lines
app/models/ingest_job.rb 92.00% 2 Missing :warning:
Additional details and impacted files [![Impacted file tree graph](https://app.codecov.io/gh/broadinstitute/single_cell_portal_core/pull/2144/graphs/tree.svg?width=650&height=150&src=pr&token=HMWE5BO2a4&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=broadinstitute)](https://app.codecov.io/gh/broadinstitute/single_cell_portal_core/pull/2144?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=broadinstitute) ```diff @@ Coverage Diff @@ ## development #2144 +/- ## =============================================== + Coverage 69.64% 69.74% +0.09% =============================================== Files 331 331 Lines 27936 27974 +38 Branches 2438 2447 +9 =============================================== + Hits 19456 19510 +54 + Misses 8327 8312 -15 + Partials 153 152 -1 ``` | [Files with missing lines](https://app.codecov.io/gh/broadinstitute/single_cell_portal_core/pull/2144?dropdown=coverage&src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=broadinstitute) | Coverage Δ | | |---|---|---| | [app/models/differential\_expression\_result.rb](https://app.codecov.io/gh/broadinstitute/single_cell_portal_core/pull/2144?src=pr&el=tree&filepath=app%2Fmodels%2Fdifferential_expression_result.rb&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=broadinstitute#diff-YXBwL21vZGVscy9kaWZmZXJlbnRpYWxfZXhwcmVzc2lvbl9yZXN1bHQucmI=) | `92.10% <100.00%> (ø)` | | | [app/models/ingest\_job.rb](https://app.codecov.io/gh/broadinstitute/single_cell_portal_core/pull/2144?src=pr&el=tree&filepath=app%2Fmodels%2Fingest_job.rb&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=broadinstitute#diff-YXBwL21vZGVscy9pbmdlc3Rfam9iLnJi) | `51.66% <92.00%> (+1.15%)` | :arrow_up: | ... and [22 files with indirect coverage changes](https://app.codecov.io/gh/broadinstitute/single_cell_portal_core/pull/2144/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=broadinstitute)
bistline commented 2 weeks ago

Functional review: Confirmed one IngestSummary event (note: hidden event in Mixpanel Sending AnnData summary for compliant_pbmc3K.h5ad (66fd3d54eb0c8dcba8f32c7a) after ingest_expression

I extracted processed, metadata, X_tsne, X_umap and raw. The Mixpanel event indicates 4 files extracted - does raw not count? Screenshot 2024-10-02 at 9 08 12 AM

I realize that raw doesn't generate a fragment file but it might count as an "extraction"? Would we want to reflect "raw" in the numFilesExtracted or leave numFilesExtracted as-is and have an array of detailed extraction types. Possible the detailed info isn't useful, don't have enough caffeine in me to have a opinion.

631f415