Y24-313 - PBMC donor pooling - use requested number of samples per pool

KatyTaylor commented 2 months ago

User story

As a user of the scRNA Core pipeline, I would like the automatic allocation of samples to pools to behave differently.

Currently, a config table is used that maps number of samples to number of pools (see interim_scrna_core_donor_pooling.yml and comment here). Instead, we would like to specify a number of samples per pool, which could vary from pool to pool - this information will come in at the point of submission (implemented in https://github.com/sanger/sequencescape/issues/4337).

This is a first step in supporting a more flexible pooling strategy, where the customer can specify the number of cells per pool and the number of cells per chip well. This story is to support a pre-MVP so that something can be in place for when R&D do a test run.

Who are the primary contacts for this story? Abby, Katy

Who is the nominated tester for UAT? Abby

Acceptance criteria

To be considered successful the solution must allow:

[x] Instead of using interim_scrna_core_donor_pooling.yml to decide the number and size of the pools, labware creator donor_pooling_plate.rb uses the requested number of samples per pool.
[x] 'Requested samples per pool' is retrieved from the request metadata on the cDNA Prep requests. For now, we can assume it will be the same value for all requests (for one run of the labware creator).
[x] Can assume that the number of samples will be exactly divisible by the 'number of samples per pool'.
[x] Integration Suite test will have to insert the 'Number of samples per pool' value against the requests to carry on working.
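
As a rough illustration of the criteria above (not the actual logic in donor_pooling_plate.rb), the sketch below reads a single requested pool size from a set of requests and slices the samples into pools of that size. The `Request` struct, its `number_of_samples_per_pool` field, and the method names are hypothetical stand-ins; the real request metadata model is not shown in this issue.

```ruby
# Hypothetical stand-in for a cDNA Prep request and its metadata.
Request = Struct.new(:sample_name, :number_of_samples_per_pool)

# Returns the one pool size shared by all requests, per the assumption
# that the value is the same for a single run of the labware creator.
def requested_samples_per_pool(requests)
  values = requests.map(&:number_of_samples_per_pool).uniq
  raise ArgumentError, "expected one pool size, got #{values.inspect}" unless values.size == 1

  pool_size = values.first
  unless pool_size.is_a?(Integer) && pool_size.positive?
    raise ArgumentError, "pool size must be a positive integer, got #{pool_size.inspect}"
  end

  pool_size
end

# Splits sample names into consecutive pools of the requested size.
# Assumes exact divisibility, as stated in the acceptance criteria.
def build_pools(requests)
  pool_size = requested_samples_per_pool(requests)
  unless (requests.size % pool_size).zero?
    raise ArgumentError, "#{requests.size} samples is not divisible by #{pool_size}"
  end

  requests.map(&:sample_name).each_slice(pool_size).to_a
end

requests = (1..80).map { |i| Request.new("sample_#{i}", 10) }
pools = build_pools(requests)
puts pools.size        # 8
puts pools.first.size  # 10
```

Slicing in submission order is only the simplest possible allocation; it takes no account of study, cost code, or donor, which the real creator may need to consider.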

Dependencies

The number of samples per pool will come in as part of the submission:

https://github.com/sanger/sequencescape/issues/4337

It's not blocking as, for this story, we can just assume it's stored against the request metadata.

Additional context

This is for the 'pre-MVP' version, to be used for R&D's end to end test. This will have to be re-jigged for MVP to allow a variable number of samples per pool, and potentially to add logic to achieve an even distribution of concentrations if possible.

Integration suite test PR is here

KatyTaylor commented 2 months ago

Have asked Lesley the following in Slack:

We've got a couple of assumptions we'd like to make regarding the pooling strategy - just for the 'pre-MVP' version for your end to end test:

The total number of samples will be exactly divisible by the requested number of samples per pool, e.g. 80 samples with 10 samples per pool (not 81/10 or 80/9)

There will be an even number of pools - this is because we have a known issue where, for an odd number of pools, we get an uneven distribution of samples across the pools. I think you are going to test a) 8 pools of 10, and b) 8 pools of 24.

There will be one study and one cost code per chip, and no 'duplicate' samples that have the same donor.

Is it OK to make all these assumptions for the 'bare bones' version?
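
For reference, the first two assumptions can be expressed as a small check. This is only a sketch: the method name `pooling_assumption_errors` is made up for illustration, and it does not cover the study, cost code, or duplicate-donor assumption.

```ruby
# Returns a list of problems with the proposed pooling; an empty list
# means the divisibility and even-pool-count assumptions both hold.
def pooling_assumption_errors(total_samples, samples_per_pool)
  unless samples_per_pool.is_a?(Integer) && samples_per_pool.positive?
    return ['samples per pool must be a positive integer']
  end

  errors = []
  if (total_samples % samples_per_pool).zero?
    number_of_pools = total_samples / samples_per_pool
    errors << "odd number of pools (#{number_of_pools})" if number_of_pools.odd?
  else
    errors << "#{total_samples} samples is not divisible by #{samples_per_pool}"
  end
  errors
end

p pooling_assumption_errors(80, 10)   # 8 pools of 10: no errors
p pooling_assumption_errors(192, 24)  # 8 pools of 24: no errors
p pooling_assumption_errors(81, 10)   # not exactly divisible
p pooling_assumption_errors(30, 10)   # 3 pools: odd number of pools
```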

KatyTaylor commented 2 months ago

Looked at this with Andrew. Storing the 'requested number of cells per pool' on request metadata is not straightforward because we'd have to add a new column. Thought of the following options:

New fields on Request Metadata
Retrieve directly from request_options hash on Order (think there is one Order per Study so it's ok at this level)
PolyMetadata against Order or Request
Somewhere else?

dasunpubudumal commented 2 months ago

Looked at this with Andrew. Storing the 'requested number of cells per pool' on request metadata is not straightforward because we'd have to add a new column. Thought of the following options:

New fields on Request Metadata

Retrieve directly from request_options hash on Order (think there is one Order per Study so it's ok at this level)

PolyMetadata against Order or Request

Somewhere else?

oh-oh :D I have done a bit of work on https://github.com/sanger/sequencescape/pull/4346/files. Is it inconsistent with what you guys have discussed? I thought we were going to add a new column.

sanger / limber

Y24-313 - PBMC donor pooling - use requested number of samples per pool #1909