cfe-lab / proviral

Contig without conseq #3

Closed donkirkby closed 3 years ago

donkirkby commented 3 years ago

Reported by @dmacmillan:

I found out why S16 is not showing up in my latest outcome summary report. It actually does have some (but very few) remapped reads, which means it isn't entirely "missing". What is happening is that my outcome summary class processes all of the conseqs first, and then all of the contigs. It notices that S16 has no conseq, yet it does have a contig. I don't have a logical path for these cases, because I had previously assumed that any sample with a contig would have a corresponding conseq. I've put in some logic to handle it, but left a comment for you, Don, because it may not be the desired logic for everyone.

donkirkby commented 3 years ago

I think we need to review how to combine the conseq results with the contig results. Another example is S39 from the 02-Feb-2021.M05995 run. It has two contigs:

  1. reverse primer not found in contig or conseq
  2. both primers failed validation, all coverage was below 100, so no conseq reported

There was only one conseq reported, so it should be summarized as a primer error. However, because there were two contigs, it got reported as multiple contigs.

Proposal for output summary:

  1. If conseqs passed, produce output summary from them.
  2. Else if contigs passed, produce output summary from them.
  3. Else if there are no conseqs, produce output summary from contigs.
  4. Else produce output summary from conseqs.

donkirkby commented 3 years ago

When analyzing contigs or conseqs, filter out all sequences that did not BLAST to HIV.

  1. If there are sequences, but none of them BLAST to HIV, generate error "non-HIV".
  2. If there are no sequences at all for a sample, generate error "no sequence".
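
A minimal sketch of that filtering step, for discussion only. The `Sequence` record, its `is_hiv` flag, and the `filter_hiv` function are hypothetical names, not the existing proviral code; the flag is assumed to be set by an earlier BLAST step. The empty case is checked first so that the two errors don't overlap.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Sequence:
    """Hypothetical record for one contig or conseq."""
    name: str
    is_hiv: bool  # assumed to be filled in by an earlier BLAST step


def filter_hiv(sequences: List[Sequence]) -> Tuple[List[Sequence], Optional[str]]:
    """Return the sequences that BLAST to HIV, plus an error name or None."""
    if not sequences:
        # Nothing at all was reported for this sample.
        return [], "no sequence"
    hiv_sequences = [seq for seq in sequences if seq.is_hiv]
    if not hiv_sequences:
        # Sequences exist, but none of them matched HIV.
        return [], "non-HIV"
    return hiv_sequences, None
```

The same function could be applied to the contigs and to the conseqs of a sample, so that both go through identical filtering before the summary is chosen.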

When generating output summary for a sample:

  1. If conseqs passed, produce output summary from them.
  2. Else if contigs passed, produce output summary from them.
  3. Else if there are no conseqs, and contig error is "non-HIV" or "no sequence", report that.
  4. Else if there are no conseqs, generate error "low coverage".
  5. Else produce output summary from conseqs.
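
The five rules above could be sketched roughly like this. The function name, its parameters, and the `(source, error)` return shape are all assumptions for illustration, not the current outcome summary class; what "passed" means is left to the caller.

```python
from typing import Optional, Tuple


def choose_summary_source(
        conseqs_passed: bool,
        contigs_passed: bool,
        has_conseqs: bool,
        contig_error: Optional[str]) -> Tuple[Optional[str], Optional[str]]:
    """Return (source, error): which results to summarize, or which error to report.

    Hypothetical helper; exactly one of the two items is not None.
    """
    if conseqs_passed:
        return "conseqs", None  # rule 1
    if contigs_passed:
        return "contigs", None  # rule 2
    if not has_conseqs:
        if contig_error in ("non-HIV", "no sequence"):
            return None, contig_error  # rule 3
        return None, "low coverage"  # rule 4
    return "conseqs", None  # rule 5: summarize the failed conseqs
```

For a sample like S39, where one failed conseq was reported alongside two failed contigs, the call `choose_summary_source(False, False, True, None)` falls through to rule 5, so the summary comes from the conseq instead of being reported as multiple contigs. A sample with a failed contig and no conseq, where the contig did BLAST to HIV, ends up at rule 4.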