broadinstitute / oncotator

ONP rendering breaks the OxoG filter (Part I -- re-annotation of TCGA MAF and annotation overwriting) #325

Closed LeeTL1220 closed 9 years ago

LeeTL1220 commented 9 years ago

The OxoG filter cannot handle values containing "|". Oncotator generates these values in the columns the filter needs (e.g. t_alt_count) when rendering ONPs, since each base has its own value. Oncotator's behavior is correct, but we need additional functionality so that an option exists to run a TCGA MAF through the OxoG filter with minimal changes to the OxoG filter code.
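To illustrate the failure mode, here is a minimal sketch that assumes the consuming code parses a count column as a single integer. The helper names (`parse_count`, `has_multiple_values`) are hypothetical and are not taken from Oncotator or the OxoG filter.

```python
def parse_count(value):
    """Parse a count field as a single integer (assumed filter behavior)."""
    return int(value)


def has_multiple_values(value, sep="|"):
    """Return True if the field holds one value per base, e.g. '12|9'."""
    return sep in value


snp_field = "12"      # single-base variant: one count
onp_field = "12|9"    # ONP rendering: one count per base

print(parse_count(snp_field))

try:
    parse_count(onp_field)
except ValueError as err:
    # A pipe-delimited value is not a valid integer literal.
    print("cannot parse ONP field:", err)
```

A check like `has_multiple_values` could be used to flag affected rows before they reach a downstream tool.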

There are several possible solutions for this issue:

If one of the latter two is chosen, this is not really an Oncotator issue, but we can track it here. Note that only the third choice handles all edge cases (e.g. a DNP where one of the two mutations is an artifact).

LeeTL1220 commented 9 years ago

Note that easy re-annotation of a TCGA MAF (second bullet point) could simply be a matter of cutting the columns inside Oncotator.

LeeTL1220 commented 9 years ago

FYI... @kcibul and @ldgauthier

I have already discussed with @kcibul and he seems to be on board.

After discussion with Chip (@chipstewart), the solution to support M2 in FH is as follows:

  1. M2 produces VCF
  2. Oncotator annotates the VCF and converts it to a TCGA MAF without --infer-onps (using the full datasource set, per @kcibul's request)
  3. OxoG filter runs
  4. Re-annotate with --infer-onps and NEW functionality that collapses xNP fields.

How to collapse fields for xNPs: Assume that a field comes in as "x|y|...". For counts:

For other fields: just take the mean
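The mean rule for non-count fields can be sketched as follows. This is a minimal sketch, assuming the field is a string of numeric values separated by "|"; the function name `collapse_by_mean` is hypothetical and is not part of Oncotator. The rule for count fields is not covered here.

```python
def collapse_by_mean(value, sep="|"):
    """Collapse a field of the form 'x|y|...' into the mean of its parts.

    Assumes every part is numeric; a non-numeric part raises ValueError.
    A field with a single value is returned as that value (as a float).
    """
    parts = [float(part) for part in value.split(sep)]
    return sum(parts) / len(parts)


print(collapse_by_mean("0.5|0.25"))
print(collapse_by_mean("7"))
```

Returning a float keeps the sketch simple; a renderer would still need to decide how to format the collapsed value when writing it back to the MAF.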

ldgauthier commented 9 years ago

Sounds good.

LeeTL1220 commented 9 years ago

For my own reference (unit tests still to be created), and only for the TCGA MAF to TCGA MAF component: the backing code is mostly built as of this writing (not shown here), and all regression tests pass so far.

Still to be built:

Additional unit/automated tests to write:

MutationDataFactoryTest

RunSpecFactoryTest (if time permits)

TcgaMafOutputRendererTest

AnnotatorTest

LeeTL1220 commented 9 years ago

When ready, I am going to cut a release with only this functionality for GDAC (@dheiman).

The rest of the M2 requirements can be found in issue #326.