sanger / sequencescape

Web based LIMS
MIT License

DPL-663 Bioscan - RESEARCH the creation of a compound sample #3750

Open andrewsparkes opened 1 year ago

andrewsparkes commented 1 year ago

Description: Research the creation of a compound sample from the 9216 samples in the XP tube, using our own wrapper if needed, with Bioscan-specific validation. It is to be called from within an asynchronous Sequencescape job worker.

Who the primary contacts are for this work: PSD

Knowledge or stakeholders: PSD

Additional context or information: The existing compound sample code in Cardinal creates the samples in the Sequencing Request class and requires the 3rd 'depth' tag parameter. We would want to create a compound sample in an asynchronous job worker triggered by a button on the Limber XP tube screen; at that point we would not have that depth tag parameter, nor would we be doing sequencing yet. We want the compound sample so that only one sample is exported to Traction instead of 9216.

andrewsparkes commented 1 year ago

At the most basic level you can turn any sample into a compound sample by just updating its 'component_samples' as follows:

context 'Can create a compound sample containing 9216 component samples for Bioscan' do
  let!(:compound_sample) { create(:sample) }
  let(:component_samples) { create_list(:sample, 9216) }

  before do
    compound_sample.update(component_samples: component_samples)
  end

  it 'contains the expected number of samples' do
    expect(compound_sample.component_samples.count).to eq(9216)
    expect(compound_sample.component_samples).to match_array(component_samples)
  end
end

The above test takes 2 min 43 s to run on my laptop. Note, though, that this includes the time to create the 9216 samples; the actual join is probably fairly quick.

See app/models/compound_aliquot.rb. This is a factory class for creating Aliquots with compound samples in them. It is called from Request::SampleCompoundAliquotTransfer, in the context of a Request.

It has the following validations:

  1. Tag depth is unique for each sample
  2. All samples are in the same study
  3. All samples are in the same project
  4. A compound sample doesn't already exist for the source_aliquots list
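To make the first three validations concrete, here is a minimal plain-Ruby sketch. It is not the code in compound_aliquot.rb: `SourceAliquot` and `compound_aliquot_errors` are hypothetical stand-ins that use a Struct instead of the real ActiveRecord models, and the check for an existing compound sample is left out.

```ruby
# Hypothetical stand-in for an aliquot; the real model is an ActiveRecord class.
SourceAliquot = Struct.new(:sample_id, :tag_depth, :study_id, :project_id, keyword_init: true)

# Returns a list of validation error messages (empty when everything passes).
def compound_aliquot_errors(source_aliquots)
  errors = []

  # Tag depth must be unique for each sample.
  tag_depths = source_aliquots.map(&:tag_depth)
  errors << 'tag depth is not unique for each sample' if tag_depths.uniq.size != tag_depths.size

  # All samples must share one study and one project.
  errors << 'samples are in more than one study' if source_aliquots.map(&:study_id).uniq.size > 1
  errors << 'samples are in more than one project' if source_aliquots.map(&:project_id).uniq.size > 1

  errors
end

valid = [
  SourceAliquot.new(sample_id: 1, tag_depth: 1, study_id: 10, project_id: 20),
  SourceAliquot.new(sample_id: 2, tag_depth: 2, study_id: 10, project_id: 20)
]
clashing = [
  SourceAliquot.new(sample_id: 1, tag_depth: 1, study_id: 10, project_id: 20),
  SourceAliquot.new(sample_id: 2, tag_depth: 1, study_id: 11, project_id: 20)
]

valid_errors = compound_aliquot_errors(valid)
clashing_errors = compound_aliquot_errors(clashing)
```

For Bioscan the tag depth check would not apply, which is why the rules would need to differ from the existing Cardinal code.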

Options for Bioscan:

  1. Create the compound aliquot and sample on transfer into the final XP tube. They would be created in a wrapper similar to CompoundAliquot, but a change would be needed because the rules for creating the compound sample would be different. The compound sample reference is then exported to Traction when the user clicks the Export to Traction button on the XP tube in Limber: an async job creates the mBrave file and exports the compound sample UUID and XP tube barcode to Traction via a RabbitMQ message. Pros: the compound sample is linked to the XP tube directly. Cons: XP tube creation takes longer, and as we don't need the compound sample on the Limber side there is little advantage to making it at this point.
  2. The user clicks the Export to Traction button on the XP tube in Limber that contains the 9216 samples. This creates an async job in Sequencescape. The first step in the job is to fetch the samples for the XP tube. Next we create the compound sample from the list of samples (from the aliquots) in the XP tube. We assume that, as pooling has happened, there are no tag clashes. We do not need to create a compound aliquot or link it to the tube. We do not use the existing Cardinal code that includes tag depth; we just create a new sample for the current Study (taken from the first component sample) and set the list of samples from the XP tube as its component samples (with a check first, in case the same combination was made previously). Then we create the mBrave file using the list of samples and their metadata. Finally, the compound sample UUID is exported to Traction along with the source XP tube barcode via a RabbitMQ message. Pros: creation of the compound sample happens asynchronously. Cons: the compound sample is in limbo, not linked to the tube. Does that matter, though? Could we add a comment to the tube with the compound sample details when we export it to Traction?

Option 2 is simpler, as we don't really care about the compound sample on the Limber side.
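As a rough illustration of what the final export step could send, the sketch below builds a JSON body containing only the compound sample UUID and the source XP tube barcode. The method name, field names and placeholder values are assumptions made for this example; they are not the actual Sequencescape or Traction message schema, and publishing to RabbitMQ is not shown.

```ruby
require 'json'

# Hypothetical payload builder: field names are assumptions for illustration only.
def traction_export_payload(compound_sample_uuid:, tube_barcode:)
  JSON.generate(
    {
      compound_sample_uuid: compound_sample_uuid,
      source_tube_barcode: tube_barcode
    }
  )
end

payload = traction_export_payload(
  compound_sample_uuid: '00000000-0000-0000-0000-000000000000', # placeholder UUID
  tube_barcode: 'TUBE-BARCODE'                                  # placeholder barcode
)
```

The point of the sketch is that the message carries one sample reference rather than 9216.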

andrewsparkes commented 1 year ago

Options for naming the compound sample: We can't combine sample names (9216), source plate barcodes (96), or 384-well plate barcodes (24), because that would be impractical, so we have limited options. There should not be more than one compound sample for a group of samples.

  1. Name after the XP tube, and optionally add a suffix, e.g. compoundsample<tube barcode>
  2. Name after the XP tube, and include Bioscan as a prefix, e.g. bioscan_compoundsample<tube barcode>
  3. other?
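The first two naming options above could be sketched as a small helper. The method name, the `bioscan_prefix` keyword and the barcode `NT1234` are made up for illustration; only the two name patterns come from the options listed.

```ruby
# Hypothetical helper: builds a compound sample name from the XP tube barcode.
def compound_sample_name(tube_barcode, bioscan_prefix: false)
  prefix = bioscan_prefix ? 'bioscan_compoundsample' : 'compoundsample'
  "#{prefix}#{tube_barcode}"
end

option_one = compound_sample_name('NT1234')                       # "compoundsampleNT1234"
option_two = compound_sample_name('NT1234', bioscan_prefix: true) # "bioscan_compoundsampleNT1234"
```

Either way the name depends only on the tube barcode, so the same tube always yields the same name.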

andrewsparkes commented 1 year ago

Validations for Bioscan compound sample creation:

Assumptions:

  1. Samples are already in the same Study/Project.
  2. Samples have already been checked for tag clashes when pooled into XP tube

Validations/checks:

  1. Do not create a new compound sample if one already exists for this list of XP tube samples

Steps:

  1. Fetch samples for XP tube
  2. Check if compound sample already exists with this list of samples, return it if it does
  3. Fetch Study from first XP tube sample
  4. Create a new compound sample within Study
  5. Link all the XP tube samples to the new compound sample as component_samples
  6. Return compound sample
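The steps above can be sketched in plain Ruby as a find-or-create keyed on the exact set of component samples. This is a sketch under stated assumptions, not Sequencescape code: `Sample` here is an in-memory Struct, `CompoundSampleStore` is a hypothetical stand-in for the database lookup, and in the real application the link would be made through the `component_samples` association.

```ruby
require 'securerandom'

# In-memory stand-in for the Sample model; all names here are hypothetical.
Sample = Struct.new(:uuid, :name, :study, :component_samples, keyword_init: true)

class CompoundSampleStore
  def initialize
    @by_components = {} # sorted component UUIDs => compound Sample
  end

  # Steps 2-6: return the compound sample that already exists for this exact
  # set of samples, or create one in the first sample's study and link them.
  def find_or_create(tube_samples, name:)
    raise ArgumentError, 'tube has no samples' if tube_samples.empty?

    key = tube_samples.map(&:uuid).sort.freeze # order-independent lookup key
    @by_components[key] ||= Sample.new(
      uuid: SecureRandom.uuid,
      name: name,
      study: tube_samples.first.study,
      component_samples: tube_samples.dup
    )
  end
end

# Step 1 stand-in: 9216 samples as they might be fetched from the XP tube.
tube_samples = Array.new(9216) do |i|
  Sample.new(uuid: SecureRandom.uuid, name: "sample_#{i}", study: 'Bioscan study', component_samples: [])
end

store = CompoundSampleStore.new
first = store.find_or_create(tube_samples, name: 'compound_sample_for_tube')
again = store.find_or_create(tube_samples.shuffle, name: 'compound_sample_for_tube')
```

Sorting the UUIDs before the lookup means the same group of samples maps to the same compound sample regardless of the order in which they were fetched.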