bthuronyi / CloneCoordinate

CloneCoordinate issue tracking

Tighten up when a construct is shown as needing sequencing #111

Closed bthuronyi closed 2 months ago

bthuronyi commented 3 months ago

If one or more mIDs of a construct have been sequenced and show problems, don't automatically kick back every available mID from that construct as being ready to queue for sequencing. Instead, make a more informed choice about whether it's appropriate to sequence them. For example, if one mID runs apparently at size on the gel but sequences as bad, then it should probably be marked automatically as "check QC", or flagged as likely not appropriate to sequence, rather than marked as OK.

Relatedly, mIDs marked as having issues upon sequencing in Sequencing should automatically show as not appropriate to continue sequencing in Minipreps tab.

evelynqi commented 2 months ago

Code Review: This line AD1547<>"","done", is redundant. There's no case where this happens that isn't caught already by previous code.

Thoughts: Add code after Z1546<>"","mixed, waiting to submit", that checks whether the template name has previously been sequenced and whether there were problems with the previous sequencing. If there were, check QC to compare this sourceID with the previous sourceID. If they are similar, warn users that the sourceID is not appropriate to continue sequencing.

Actually, now that I write this all out and reread the description, I change my mind: "Relatedly, mIDs marked as having issues upon sequencing in Sequencing should automatically show as not appropriate to continue sequencing in Minipreps tab." This fix should be implemented in the minipreps tab, where similar minipreps should not display as ready to be sequenced when there was a sequence problem with one of them.

bthuronyi commented 2 months ago

> Code Review: This line AD1547<>"","done", is redundant. There's no case where this happens that isn't caught already by previous code.
>
> Thoughts: Add code after Z1546<>"","mixed, waiting to submit", that checks whether the template name has previously been sequenced and whether there were problems with the previous sequencing. If there were, check QC to compare this sourceID with the previous sourceID. If they are similar, warn users that the sourceID is not appropriate to continue sequencing.

It could still be a fair idea to show a warning in Sequencing Status if this same ID previously sequenced poorly. We would probably want to append the warning IFF the sample had not yet been mixed. @ethanjeon #64

bthuronyi commented 2 months ago

> Actually, now that I write this all out and reread the description, I change my mind: "Relatedly, mIDs marked as having issues upon sequencing in Sequencing should automatically show as not appropriate to continue sequencing in Minipreps tab." This fix should be implemented in the minipreps tab, where similar minipreps should not display as ready to be sequenced when there was a sequence problem with one of them.

Agreed, this is the primary place to catch a sequencing problem. We have to be careful about when to disqualify the whole set of mIDs based on one of them having sequencing problems, but sometimes it would be appropriate.

evelynqi commented 2 months ago

In my copy, I modified pAJP012 in minipreps and queued it for sequencing as a case to test.

evelynqi commented 2 months ago

TO DO: https://docs.google.com/spreadsheets/d/1YSqAdAVy6jYu_-Nbnop_aQOGMjjLDXzNKm5-Ur23Pcc/edit?usp=sharing

Make changes in "Is sequencing appropriate?" (Minipreps BC), which should in turn change the miniprep status. Use REGEXMATCH on the sequencing results: template contamination --> don't sequence any preps; primer dimer --> don't sequence preps if it looked OK for sequencing, but others may be valid for sequencing if it originally said Check QC. (I feel like the code right now doesn't support the latter, but I will see.)

evelynqi commented 2 months ago

Original Code: =if(A2882="","", lambda(x, ifs( join("",x)="OK","Likely OK to sequence", iferror(match("Problem",x,0)),"Do not sequence", TRUE,"Check QC data" )) (iferror(unique( {index(settings_GelCodingQualityFlags,match(AQ2882,settings_GelCodingQuality,0)), index(settings_GelCodingIntensityFlags,match(AR2882,settings_GelCodingIntensity,0)), index(settings_GelSizeOptionsFlags ,match(AU2882,settings_GelSizeOptions,0)), index(Settings_Analytical_PCRflags ,match(BA2882,Settings_analytical_PCR,0)) },TRUE ))))

This only looks at QC before deciding whether sequencing is appropriate. To bring in more information once one prep has been sequenced, I'm thinking we can add an IF that checks whether there are any sequencing results for the particular sourceID, which I have done below.

=if(A2881="","", IF(COUNTA(FILTER(Minipreps_m_Sequencing_results, $K$3:K=$K2881)) = 0, lambda(x, ifs( join("",x)="OK","Likely OK to sequence", iferror(match("Problem",x,0)),"Do not sequence", TRUE,"Check QC data" )) (iferror(unique( {index(settings_GelCodingQualityFlags,match(AQ2881,settings_GelCodingQuality,0)), index(settings_GelCodingIntensityFlags,match(AR2881,settings_GelCodingIntensity,0)), index(settings_GelSizeOptionsFlags ,match(AU2881,settings_GelSizeOptions,0)), index(Settings_Analytical_PCRflags ,match(BA2881,Settings_analytical_PCR,0)) },TRUE ))), "Not Empty")

Right now, it outputs "Not Empty", which we will change to our conditionals.
-Implement: Perhaps flags for the sequencing results (in settings, I have a draft)
-Look at previous results (which may be contradictory if multiple sequencing attempts were done on different preps or the same prep)
-Look at QC across results
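For readers less used to Sheets syntax, the QC-only branch shared by both formulas above can be sketched in Python. This is only an illustration: `qc_verdict` is a made-up name, and the only flag values taken from the formula are "OK" and "Problem"; the "Check" value below is a placeholder for any other flag.

```python
def qc_verdict(flags):
    """Classify a miniprep from its QC flags alone (no sequencing history).

    `flags` holds one flag per QC lookup (gel quality, gel intensity,
    gel size, analytical PCR), mirroring the ifs() in the formula.
    """
    unique_flags = set(flags)
    if unique_flags == {"OK"}:       # join("", unique(...)) = "OK"
        return "Likely OK to sequence"
    if "Problem" in unique_flags:    # match("Problem", x, 0) succeeds
        return "Do not sequence"
    return "Check QC data"           # anything else, including missing flags


print(qc_verdict(["OK", "OK", "OK", "OK"]))       # Likely OK to sequence
print(qc_verdict(["OK", "Problem", "OK", "OK"]))  # Do not sequence
print(qc_verdict(["OK", "Check", "OK", "OK"]))    # Check QC data ("Check" is a placeholder flag)
```

The remaining work is then to decide what replaces the "Not Empty" branch when sequencing results do exist.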

evelynqi commented 2 months ago

https://docs.google.com/spreadsheets/d/15I8TH8lMaxw6BRI62DHPu-mRaf_-bdS8T5j5Iewywgs/edit?usp=sharing

evelynqi commented 2 months ago

https://docs.google.com/spreadsheets/d/15I8TH8lMaxw6BRI62DHPu-mRaf_-bdS8T5j5Iewywgs/edit?usp=sharing

This is where you ended last time:

=ifs(A21="","", BH21<>"", if(regexmatch(join("",unique(byrow({settings_SequencingAlignmentOutcomeOptionsFlags,settings_SequencingAlignmentOutcomeOptions},lambda(x,if(iferror(find(index(x,2),BH21)),index(x,1),""))))),"Bad"), "some sequencing of this mID suggests similar things are bad to sequence", "all sequencing of this mID is good or ambiguous"), TRUE,lambda(x, ifs( AND(join("",x)="OK", $Y21="yes - color/phenotype CLEARLY match"), "Likely OK to sequence", OR(iferror(match("Problem",x,0)), $Y21="unexpected color/phenotype", $Y21 = "no"),"Do not sequence", TRUE,"Check QC data" )) (iferror(unique( {index(settings_GelCodingQualityFlags,match(AR21,settings_GelCodingQuality,0)), index(settings_GelCodingIntensityFlags,match(AS21,settings_GelCodingIntensity,0)), index(settings_GelSizeOptionsFlags ,match(AV21,settings_GelSizeOptions,0)), index(Settings_Analytical_PCRflags ,match(BB21,Settings_analytical_PCR,0)) },TRUE ))))
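The byrow/find/regexmatch part of this formula pairs each alignment-outcome option with its flag, keeps the flags whose option text appears in the sequencing-results cell, and then looks for "Bad" among them. A rough Python equivalent follows; the function names, option texts, and flag values other than "Bad" are all hypothetical, since the real settings ranges are not shown in this thread.

```python
def flags_in_result(result_text, outcome_options):
    """Return the flags whose option text occurs in a sequencing-results cell.

    `outcome_options` is a list of (flag, option_text) pairs, standing in for
    {settings_SequencingAlignmentOutcomeOptionsFlags,
     settings_SequencingAlignmentOutcomeOptions}.
    """
    return {
        flag
        for flag, option_text in outcome_options
        # Skipping blank settings rows is an assumption; `in` is a
        # case-sensitive substring test, like FIND in Sheets.
        if option_text and option_text in result_text
    }


def has_bad_flag(result_text, outcome_options):
    """True if any matched flag contains "Bad" (the regexmatch in the formula)."""
    return any("Bad" in flag for flag in flags_in_result(result_text, outcome_options))


# Hypothetical settings rows, for illustration only.
OPTIONS = [
    ("Good", "perfect match"),
    ("Bad", "template contamination"),
    ("Ambiguous", "low quality read"),
]

print(has_bad_flag("template contamination; low quality read", OPTIONS))  # True
print(has_bad_flag("perfect match", OPTIONS))                             # False
```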

With some thought, I have come up with:

=IF(A20="","",
 if(counta(filter(Minipreps_m_Sequencing_results, $K$3:K=$K20)) > 0,
  if(BH20<>"",
   if(regexmatch(join("",unique(byrow({settings_SequencingAlignmentOutcomeOptionsFlags,settings_SequencingAlignmentOutcomeOptions},lambda(x,if(iferror(find(index(x,2),BH20)),index(x,1),""))))),"Bad"),
    "do not sequence",
    if(regexmatch(join("", unique(byrow(filter(Minipreps_m_Sequencing_results, $K:K=$K20,Minipreps_m_gel_size=AV20), lambda(y, if(y="","",JOIN(" ",unique(byrow({settings_SequencingAlignmentOutcomeOptionsFlags,settings_SequencingAlignmentOutcomeOptions},lambda(x,if(iferror(find(index(x,2),y)),index(x,1),"")))))))))),"Bad"),
     "Do not sequence this mID because similar mIDs failed sequencing",
     lambda(x, ifs( AND(join("",x)="OK", $Y20="yes - color/phenotype CLEARLY match"), "Likely OK to sequence", OR(iferror(match("Problem",x,0)), $Y20="unexpected color/phenotype", $Y20 = "no"),"Do not sequence", TRUE,"Check QC data" )) (iferror(unique( {index(settings_GelCodingQualityFlags,match(AR20,settings_GelCodingQuality,0)), index(settings_GelCodingIntensityFlags,match(AS20,settings_GelCodingIntensity,0)), index(settings_GelSizeOptionsFlags ,match(AV20,settings_GelSizeOptions,0)), index(Settings_Analytical_PCRflags ,match(BB20,Settings_analytical_PCR,0)) },TRUE )))
)))))

which looks at whether there has been a sequencing result for the construct. If there is a sequencing result and the corresponding flag contains "Bad", the user is told not to sequence. Then the sheet checks whether there have been other sequencing attempts on similar-looking minipreps of the same construct (similar gel interpretation) and whether those failed. If any failed, that miniprep is flagged to not be sequenced. Otherwise, the sheet falls back on the other QC data already in place to decide whether sequencing is appropriate. This code works, except for a bug where it returns FALSE when there are no interpretation results.

evelynqi commented 2 months ago

IT'S DONE!

https://docs.google.com/spreadsheets/d/15I8TH8lMaxw6BRI62DHPu-mRaf_-bdS8T5j5Iewywgs/edit?usp=sharing

Miniprep Sheet (BD-Is sequencing appropriate?): Looks at whether there has been a previous sequencing result for that miniprep. If there is one and the corresponding flag contains "Bad", the user is told not to sequence, with the message "Do not sequence because it has already failed sequencing". Then the sheet checks whether there have been other sequencing attempts on similar-looking minipreps of the same construct (similar gel interpretation) and whether those failed. If any failed, that miniprep is flagged to not be sequenced, with the message "Do not sequence this mID because similar mIDs failed sequencing". Otherwise, the sheet falls back on the other QC data already in place, which returns one of "Likely OK to sequence", "Do not sequence", or "Check QC data".
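The order of checks described above can be sketched in Python. This is a simplified model, not the sheet's formula: the `Miniprep` fields, the mIDs, and the gel size values are invented for illustration, and `seq_failed` stands in for "a sequencing result whose flag contains Bad".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Miniprep:
    mid: str
    construct: str
    gel_size: str                       # gel size interpretation
    qc_verdict: str                     # result of the existing QC-only check
    seq_failed: Optional[bool] = None   # None = never sequenced


def is_sequencing_appropriate(prep: Miniprep, all_preps: List[Miniprep]) -> str:
    # 1. This miniprep itself has already sequenced badly.
    if prep.seq_failed:
        return "Do not sequence because it has already failed sequencing"
    # 2. A similar-looking miniprep of the same construct sequenced badly.
    for other in all_preps:
        if (other.mid != prep.mid and other.construct == prep.construct
                and other.gel_size == prep.gel_size and other.seq_failed):
            return "Do not sequence this mID because similar mIDs failed sequencing"
    # 3. Otherwise fall back on the QC data.
    return prep.qc_verdict


# Hypothetical preps of one construct, for illustration only.
preps = [
    Miniprep("m1", "pAJP012", "at size", "Likely OK to sequence", seq_failed=True),
    Miniprep("m2", "pAJP012", "at size", "Likely OK to sequence"),
    Miniprep("m3", "pAJP012", "too small", "Check QC data"),
]
for p in preps:
    print(p.mid, "->", is_sequencing_appropriate(p, preps))
```

Here m1 is rejected for its own failure, m2 is rejected because it looks like m1 on the gel, and m3 keeps its QC verdict because its gel interpretation differs.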

bthuronyi commented 2 months ago

Looks great!

My one tweak would be to change the wording from "similar mIDs" to "mIDs with similar gel size interpretation" for more transparency. You could also, just to make this A+, make the check more stringent by checking for the same gel size interpretation AND the same "gel nearest ladder band". That would catch subtle differences that might come up sometimes.

bthuronyi commented 2 months ago

Go ahead and implement in main either way.

evelynqi commented 2 months ago

Implemented in main. Tweaked the wording and made the check more stringent by checking for the same gel size interpretation AND the same "gel nearest ladder band".
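As a small illustration of what the stricter check changes, two minipreps can be compared on a similarity key. The field names and values below are hypothetical; the sheet itself does this comparison inside the FILTER conditions.

```python
def similarity_key(prep, strict=True):
    """Key used to decide whether two minipreps count as "similar".

    Loose: same construct and gel size interpretation.
    Strict: additionally the same nearest ladder band.
    """
    key = (prep["construct"], prep["gel_size"])
    if strict:
        key += (prep["nearest_ladder_band"],)
    return key


# Hypothetical preps: same gel size call, different nearest ladder band.
a = {"construct": "pAJP012", "gel_size": "at size", "nearest_ladder_band": "3 kb"}
b = {"construct": "pAJP012", "gel_size": "at size", "nearest_ladder_band": "4 kb"}

print(similarity_key(a, strict=False) == similarity_key(b, strict=False))  # True
print(similarity_key(a) == similarity_key(b))                              # False
```

Under the loose key a failed sequencing run on `a` would disqualify `b`; under the strict key it would not.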