biocore / LabControl

lab manager for plate maps and sequence flows
BSD 3-Clause "New" or "Revised" License

Modify generate_prep_information to correctly create prep sheets for sequencing runs based on library plate pools #444

Open charles-cowart opened 5 years ago

charles-cowart commented 5 years ago

"Latest mapping file downloaded is empty, however AGP mapping files in the past were downloaded with no trouble."

Review download buttons from existing plates, etc. and ensure content is downloadable. Where content is not downloadable, or is an invalid file, investigate as to the cause.

AmandaBirmingham commented 5 years ago

We may have more info about what happened here, as a side-effect of today's meeting re miniPCR; below are my notes of what I heard from MacKenzie:

They ran through a single plate (HNRC) that was part of a bigger sequencing run and uploaded only HNRC to LabControl, but didn't run Prepare Amplicon Sequencing Pool. Note: they were still offered options to download the sample sheet and prep sheet, but the downloaded files were corrupt.

So: They were trying to download sample sheets and prep sheets for a library plate pool, not for an amplicon sequencing pool. Betcha a dollar this is the ultimate cause of why the stuff that was downloaded was corrupt/empty.

AmandaBirmingham commented 5 years ago

Hoo doggies--I tracked this one down. The issue is reproducible: go to "Sequencing runs" > "Prepare sequencing run" and select any amplicon library plate pool (not an amplicon sequencing pool). Create the sequencing run (which works just fine) and then go to "Sequencing runs" > "List" and click on the "Download Preparation Sheets" button next to the new sequencing run you just created. A zip file will download but when you go to open it you will get an error like the one below:

[Screenshot of the error, 2019-04-19]

This is because the zip file is in fact empty, and that is because

https://github.com/jdereus/labman/blob/645c798e0094e9962973c2d111c4860b1e8e5b23/labcontrol/db/process.py#L3244

... returns an empty dictionary, and THAT is because the giant sql statement in that function expects the pool composition associated with the sequencing run to be made up of OTHER pool compositions (as amplicon pools are, where you make library plate pools and then pool THEM together to get the pool for your sequencing run):

https://github.com/jdereus/labman/blob/645c798e0094e9962973c2d111c4860b1e8e5b23/labcontrol/db/process.py#L3374-L3380
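To illustrate why a query that expects two layers of pool compositions comes back empty for a library plate pool, here is a minimal, self-contained sketch using an in-memory SQLite database. The tables (`composition`, `pool_component`), the `kind` values, and the query are hypothetical stand-ins invented for this example; they are not the actual LabControl schema or the SQL in process.py.

```python
import sqlite3

# Hypothetical two-layer query: it only follows components that are
# themselves pools, then returns what those inner pools contain.
TWO_LAYER_SQL = """
    SELECT inner_pc.component_id
    FROM pool_component AS outer_pc
    JOIN composition AS inner_pool
      ON inner_pool.id = outer_pc.component_id
     AND inner_pool.kind = 'pool'
    JOIN pool_component AS inner_pc
      ON inner_pc.pool_id = inner_pool.id
    WHERE outer_pc.pool_id = ?
    ORDER BY inner_pc.component_id
"""


def build_demo_db():
    """Create a toy database (assumed schema, not LabControl's)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE composition (
            id INTEGER PRIMARY KEY,
            kind TEXT NOT NULL
        );
        CREATE TABLE pool_component (
            pool_id INTEGER NOT NULL,
            component_id INTEGER NOT NULL
        );
        -- 1 and 2 are library preps; 10 is a library plate pool;
        -- 20 is an amplicon sequencing pool made from pool 10.
        INSERT INTO composition VALUES
            (1, 'library_prep'), (2, 'library_prep'),
            (10, 'pool'), (20, 'pool');
        INSERT INTO pool_component VALUES (10, 1), (10, 2), (20, 10);
    """)
    return conn


def library_preps_via_two_layers(conn, pool_id):
    """Return library prep ids reachable through exactly two pool layers."""
    return [row[0] for row in conn.execute(TWO_LAYER_SQL, (pool_id,))]


if __name__ == "__main__":
    conn = build_demo_db()
    print(library_preps_via_two_layers(conn, 20))  # amplicon sequencing pool
    print(library_preps_via_two_layers(conn, 10))  # library plate pool
```

In this toy setup the sequencing pool (20) yields its two library preps, while the library plate pool (10) yields no rows, because none of its direct components are pools. An empty result like that would be consistent with the empty dictionary and the empty zip file described above.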

And the reason the sql was written that way is that when we wrote it, we didn't know that the lab COULD use library plate pools for a sequencing run and just skip the amplicon sequencing pool step.

I believe I can fix this by extracting the logic to determine whether a pool is a library plate pool or an amplicon sequencing pool from

https://github.com/jdereus/labman/blob/645c798e0094e9962973c2d111c4860b1e8e5b23/labcontrol/db/util.py#L29-L35

and extending generate_prep_information to check whether all the pools in the SequencingProcess are library plate pools (one layer of pool_compositions) or amplicon sequencing pools (two layers of pool_compositions), and to choose the appropriate sql accordingly. I will have the code throw an error if there are heterogeneous types of pools in the SequencingProcess, because (a) I cannot think of any way that could actually happen, and (b) if it did I wouldn't know how to handle it :-/
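A rough sketch of that decision logic, in plain Python with no database access, might look like the following. Everything here is hypothetical: the `PoolKind` enum, the `classify_pool` and `pool_layers_for_run` helpers, and in particular the classification rule (a pool whose components are all pools counts as an amplicon sequencing pool) are assumptions for illustration, not the logic currently in util.py or the eventual fix.

```python
from enum import Enum


class PoolKind(Enum):
    LIBRARY_PLATE_POOL = "library plate pool"
    AMPLICON_SEQUENCING_POOL = "amplicon sequencing pool"


def classify_pool(component_kinds):
    """Classify one pool from the kinds of its direct components.

    Assumed rule (hypothetical): all components are library preps ->
    library plate pool; all components are pools -> amplicon sequencing pool.
    """
    kinds = set(component_kinds)
    if kinds == {"library_prep"}:
        return PoolKind.LIBRARY_PLATE_POOL
    if kinds == {"pool"}:
        return PoolKind.AMPLICON_SEQUENCING_POOL
    raise ValueError(f"Cannot classify pool with components: {sorted(kinds)}")


def pool_layers_for_run(pools):
    """Return 1 or 2: how many pool_composition layers to expect.

    `pools` is a list with one entry per pool in the sequencing run,
    each entry being the list of that pool's component kinds.
    """
    if not pools:
        raise ValueError("Sequencing run has no pools")
    kinds = {classify_pool(components) for components in pools}
    if len(kinds) > 1:
        # Mixed pool types are not expected, so fail loudly
        raise ValueError("Heterogeneous pool types in sequencing run")
    if kinds == {PoolKind.LIBRARY_PLATE_POOL}:
        return 1
    return 2
```

The caller would then pick the one-layer or two-layer sql based on the returned value; raising on a mixed run keeps the unexpected case visible instead of silently producing an empty prep sheet.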