rmadupuri closed 2 years ago
<-----------------------------------------------SV CLASS------------------------------------------------>
HEADER:
Required columns: Sample_ID, Event_Info, else error
Q: Is a specific column order required?
Need Site1_Hugo_Symbol or Site1_Entrez_Gene_Id, else error
Need Site2_Hugo_Symbol or Site2_Entrez_Gene_Id, else error
Need Site1_Exon & Site2_Exon, else error
Need Site1_Ensembl_Transcript_Id & Site2_Ensembl_Transcript_Id, else error
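The header rules above can be sketched as a small check. This is only an illustration, not the validator's actual code: the function name `check_sv_header` is hypothetical, and it assumes column order does not matter, which is still an open question in these notes.

```python
def check_sv_header(columns):
    """Return a list of error messages for an SV data file header.

    Hypothetical helper; assumes column order is irrelevant.
    """
    cols = set(columns)
    errors = []

    # Columns that must always be present.
    for required in ("Sample_ID", "Event_Info"):
        if required not in cols:
            errors.append(f"missing required column: {required}")

    # Each site needs at least one gene identifier column.
    for site in ("Site1", "Site2"):
        if not ({f"{site}_Hugo_Symbol", f"{site}_Entrez_Gene_Id"} & cols):
            errors.append(f"need {site}_Hugo_Symbol or {site}_Entrez_Gene_Id")

    # Exon and transcript columns are needed for both sites.
    for suffix in ("Exon", "Ensembl_Transcript_Id"):
        for site in ("Site1", "Site2"):
            name = f"{site}_{suffix}"
            if name not in cols:
                errors.append(f"missing required column: {name}")

    return errors
```

A header with only Sample_ID would report seven problems here: Event_Info, one gene identifier per site, and the four exon and transcript columns.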
DATA:
If NCBI_Build
Gene identification for Site1 and Site2: currently treated as in a normal genomic profile. Should SV be a special case? Current error: "Entrez gene id and gene symbol are both missing (None)"
Needs a valid Site1_Hugo_Symbol or Site1_Entrez_Gene_Id, else error
Needs a valid Site2_Hugo_Symbol or Site2_Entrez_Gene_Id, else error
Q: What about intragenic or deletion cases where there are no valid symbols? Even if gene identification for Site1 or Site2 is invalid, the validator should still pass.
TODO: If the profile is SV, the symbol-Entrez pair can resolve to None. Can both Site1 & Site2 resolve to None, or is at least one needed?
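One way to keep the open question configurable is to count how many sites resolve and compare that to a threshold. The sketch below is an assumption-heavy illustration: `resolve_gene`, `check_sv_genes`, `symbol_to_entrez` and `min_resolved_sites` are hypothetical names, and the lookup is a plain dict rather than the portal's gene tables.

```python
def resolve_gene(symbol, entrez_id, symbol_to_entrez):
    """Resolve a symbol/Entrez pair to an Entrez id, or None (hypothetical)."""
    known_ids = set(symbol_to_entrez.values())
    if entrez_id is not None and entrez_id in known_ids:
        return entrez_id
    if symbol:
        return symbol_to_entrez.get(symbol)
    return None


def check_sv_genes(row, symbol_to_entrez, min_resolved_sites=1):
    """Return errors if fewer than min_resolved_sites sites resolve.

    min_resolved_sites=1 means "at least one is needed";
    min_resolved_sites=0 would let both sites resolve to None.
    """
    resolved = [
        resolve_gene(
            row.get(f"{site}_Hugo_Symbol"),
            row.get(f"{site}_Entrez_Gene_Id"),
            symbol_to_entrez,
        )
        for site in ("Site1", "Site2")
    ]
    n_resolved = sum(gene is not None for gene in resolved)
    if n_resolved < min_resolved_sites:
        return [
            f"only {n_resolved} of 2 sites resolve to a known gene; "
            f"need at least {min_resolved_sites}"
        ]
    return []
```

With this shape, answering the TODO only changes the default value of `min_resolved_sites`, not the logic of the check.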
If Event_Info == Fusion:
The values for Site1, Site1 transcripts & exons are needed (for breakpoint visualization), else error.
Q: Event_Info is free text, and it is not always exactly equal to Fusion.
TODO: Check for the substring 'fusion' in Event_Info and apply the same conditions? Or is this test even needed?
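If the substring approach is chosen, it could look roughly like the sketch below. The helper names are hypothetical, the match is case-insensitive by assumption, and "the values for Site1" is read here as "a Site1 symbol or Entrez id", which is an interpretation of the note rather than a confirmed rule.

```python
def is_fusion_event(event_info):
    """True if the free-text Event_Info mentions 'fusion' (case-insensitive)."""
    return "fusion" in (event_info or "").lower()


def check_fusion_row(row):
    """Return errors for a fusion row lacking Site1 gene, transcript or exon."""
    if not is_fusion_event(row.get("Event_Info")):
        return []  # the extra requirements only apply to fusion events

    errors = []
    if not (row.get("Site1_Hugo_Symbol") or row.get("Site1_Entrez_Gene_Id")):
        errors.append("fusion event without a Site1 gene")
    for col in ("Site1_Ensembl_Transcript_Id", "Site1_Exon"):
        if not row.get(col):
            errors.append(f"fusion event without {col}")
    return errors
```

A substring match would also fire on values such as "Protein fusion: in frame", which is the point of the TODO, but it would equally fire on free text like "not a fusion", so it is a heuristic at best.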
Check transcripts and exons against Genome Nexus: the values in the data file should correspond to what is in Genome Nexus, else error.
Each transcript has a known set of exons in Genome Nexus. This checks that each transcript-exon pair is correct.
TODO: Is this functionality needed?
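To make the transcript-exon check concrete, here is a sketch that works against a local lookup table instead of calling Genome Nexus. Everything in it is an assumption: `exon_counts` is a hypothetical dict from transcript id to number of exons, the transcript ids are placeholders, and the exon value is assumed to be a 1-based exon rank.

```python
def check_transcript_exon(transcript_id, exon, exon_counts):
    """Return an error message, or None if the transcript-exon pair is valid.

    exon_counts is a hypothetical {transcript_id: number_of_exons} table that
    would have to be filled from Genome Nexus beforehand.
    """
    if transcript_id not in exon_counts:
        return f"unknown transcript: {transcript_id}"
    try:
        rank = int(exon)
    except (TypeError, ValueError):
        return f"exon is not an integer: {exon!r}"
    if not 1 <= rank <= exon_counts[transcript_id]:
        return f"exon {rank} is out of range for {transcript_id}"
    return None
```

Fetching the table once per validation run and then checking rows locally would avoid one lookup per data line, which may matter if the functionality is kept.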
NEW TESTS:
<-----------------------------------------------FUSION CLASS------------------------------------------------>
HEADER:
DATA:
Gene identification: requires a valid symbol-Entrez pair.
Validates uniqueness based on the Hugo_Symbol, Entrez_Gene_Id, Sample_ID and Fusion columns.
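The uniqueness rule amounts to tracking a composite key per row. A minimal sketch, assuming rows are dicts keyed by the column names listed above (`find_duplicate_fusions` and `KEY_COLUMNS` are hypothetical names):

```python
# Columns that together identify a fusion record, as listed in the notes.
KEY_COLUMNS = ("Hugo_Symbol", "Entrez_Gene_Id", "Sample_ID", "Fusion")


def find_duplicate_fusions(rows):
    """Return (first_line, duplicate_line) pairs for rows sharing a key.

    Line numbers are 1-based positions within the data rows.
    """
    seen = {}
    duplicates = []
    for line_no, row in enumerate(rows, start=1):
        key = tuple(row.get(col) for col in KEY_COLUMNS)
        if key in seen:
            duplicates.append((seen[key], line_no))
        else:
            seen[key] = line_no
    return duplicates
```

Reporting the line of the first occurrence alongside the duplicate makes the resulting error message easier to act on.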
<------------------------------------------GENE PANEL MATRIX CLASS-------------------------------------------->
WGS, WXS
https://docs.google.com/document/d/17hiqcLGmZCb1wLladmmj6ayUGVnoK6cD8eQvC6Fgqrc/edit
Sample_ID Site1_Hugo_Symbol Site1_Entrez_Gene_ID Site1_Chromosome Site1_Position Site1_Region_Number Site1_Ensembl_Transcript_ID Site2_Hugo_Symbol Site2_Entrez_Gene_ID Site2_Chromosome Site2_Position Site2_Region_Number Site2_Ensembl_Transcript_ID