getzlab / deTiN

DeTiN is designed to measure tumor-in-normal contamination and improve somatic variant detection sensitivity when using a contaminated matched control.
BSD 3-Clause "New" or "Revised" License

Fails without INDEL data #25

Closed lDesiree closed 4 years ago

lDesiree commented 4 years ago

Hi Amaro,

Thanks a lot for sharing this useful tool with the community! I want to run the tool without INDEL data but get the following error:

    pre-processing SSNV data
    initialized TiN to 0
    TiN inference after 1 iterations = 0.0
    SSNV based TiN estimate converged: TiN = 0.0 based on 8320 sites
    calculating aSCNA based TiN estimate using data from chromosomes: [ 8 10 19]
    aSCNA based TiN estimate: TiN = 0.0
    Traceback (most recent call last):
      File "deTin/deTiN.py", line 607, in <module>
        main()
      File "deTin/deTiN.py", line 565, in main
        do = output(di, ssnv_based_model, ascna_based_model)
      File "deTin/deTiN.py", line 260, in __init__
        if self.input.indel_table.isnull().values.sum() == 0:
    AttributeError: 'list' object has no attribute 'isnull'

I also tried to use an empty file with the headers from the example indel data but got the following error:

    Traceback (most recent call last):
      File "deTin/deTiN.py", line 607, in <module>
        main()
      File "deTin/deTiN.py", line 537, in main
        di.read_and_preprocess_data()
      File "deTin/deTiN.py", line 223, in read_and_preprocess_data
        self.read_and_preprocess_SSNVs()
      File "deTin/deTiN.py", line 208, in read_and_preprocess_SSNVs
        self.indel_table = du.read_indel_vcf(self.indel_file, self.seg_table, self.indel_type)
      File "/Users/schnidd/Downloads/deTiN-master/deTiN/deTiN_utilities.py", line 562, in read_indel_vcf
        counts_format = indel_table['format'][0].split(':')
    AttributeError: 'float' object has no attribute 'split'

The tool works when I use the example data, and also when I use my own data together with the full INDEL data from the example files. Could you help me understand the cause of this issue?

amarotaylor commented 4 years ago

Oh weird! I can no longer edit the code since I switched institutes. However, I don't think this should be an issue. From that error message, it looks like the table is skipping the header or something? indel_table['format'][0] should contain a string that describes how the VCF is formatted. The method should also run without indels. What do you get when you run without the indel flag?
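To illustrate the second traceback: one plausible reading (an assumption, not confirmed in this thread) is that the first `format` cell was read as a missing value, which pandas represents as a float NaN, so `.split(':')` fails. The sketch below uses a hypothetical helper, `parse_format_field`, which is not part of deTiN, to show the same failure mode with a clearer error message.

```python
def parse_format_field(value):
    """Split a VCF FORMAT string such as 'GT:AD:DP' into its keys.

    Hypothetical helper for illustration only; not part of deTiN.
    """
    if not isinstance(value, str):
        # A missing cell typically shows up as float NaN, which has no .split()
        raise ValueError(
            f"expected a FORMAT string, got {type(value).__name__}: {value!r}"
        )
    return value.split(':')


# A well-formed FORMAT field splits into its colon-separated keys.
keys = parse_format_field('GT:AD:DP')

# A missing value (float NaN) is rejected up front instead of failing on .split().
try:
    parse_format_field(float('nan'))
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

If the first row of the indel table really is missing its FORMAT value, a check like this would point at the input file rather than at the parsing code.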

lDesiree commented 4 years ago

Thank you for your reply. If I try to run it without the indel data I get the following error:

    pre-processing SSNV data
    initialized TiN to 0
    TiN inference after 1 iterations = 0.0
    SSNV based TiN estimate converged: TiN = 0.0 based on 8320 sites
    calculating aSCNA based TiN estimate using data from chromosomes: [ 8 10 19]
    aSCNA based TiN estimate: TiN = 0.0
    Traceback (most recent call last):
      File "deTin/deTiN.py", line 607, in <module>
        main()
      File "deTin/deTiN.py", line 565, in main
        do = output(di, ssnv_based_model, ascna_based_model)
      File "deTin/deTiN.py", line 260, in __init__
        if self.input.indel_table.isnull().values.sum() == 0:
    AttributeError: 'list' object has no attribute 'isnull'

amarotaylor commented 4 years ago

Hey

Sorry about this. The fix is to add the following line before line 260: `if self.input.indel_file != 'None':`

See code below.

```python
def __init__(self, input, ssnv_based_model, ascna_based_model):
    # previous results
    self.input = input
    self.ssnv_based_model = ssnv_based_model
    self.ascna_based_model = ascna_based_model
    # useful outputs
    self.SSNVs = input.candidates
    self.joint_log_likelihood = np.zeros([self.input.resolution, 1])
    self.joint_posterior = np.zeros([self.input.resolution, 1])
    self.CI_tin_high = []
    self.CI_tin_low = []
    self.TiN = []
    self.p_null = 1
    # variables
    self.TiN_range = np.linspace(0, 1, num=self.input.resolution)
    self.TiN_int = 0
    # threshold for accepting variants based on the predicted somatic assignment
    # if p(S|TiN) exceeds threshold we keep the variant.
    self.threshold = 0.5
    # defines whether to remove events based on predicted exceeding predicted allele fractions
    # if Beta_cdf(predicted_normal_af;n_alt_count+1,n_ref_count+1) <= 0.01 we remove the variant
    self.use_outlier_threshold = input.use_outlier_removal
    # new guard: only inspect the indel table when an indel file was supplied
    if self.input.indel_file != 'None':
        if self.input.indel_table.isnull().values.sum() == 0:
            self.indels = self.input.indel_table
```

Sorry for the bug. I'm not able to edit the main repo, but if you make this edit locally you shouldn't hit the error. In the meantime I'll communicate with the lab and see who is maintaining the code so the fix can be merged.
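A minimal sketch of why the guard helps. It assumes, based only on the traceback, that `indel_table` is a plain list when no indel file is given. `FakeInput`, `pick_indels`, and `unguarded_error` are hypothetical names for illustration, not deTiN code.

```python
class FakeInput:
    """Hypothetical stand-in for deTiN's input object when no indel file is given."""
    indel_file = 'None'  # the fix compares against the literal string 'None'
    indel_table = []     # assumed default; the traceback only shows that it is a list


def pick_indels(inp):
    """Return the indel table if a file was supplied and it has no nulls, else None."""
    if inp.indel_file != 'None':
        # Only reached when a table was actually loaded from an indel file
        if inp.indel_table.isnull().values.sum() == 0:
            return inp.indel_table
    return None


def unguarded_error(inp):
    """Call .isnull() without the guard and return the error text, if any."""
    try:
        inp.indel_table.isnull()
    except AttributeError as exc:
        return str(exc)
    return None


unguarded = unguarded_error(FakeInput())  # reproduces the AttributeError from the issue
guarded = pick_indels(FakeInput())        # the guard skips the .isnull() call entirely
```

In the snippet posted above, `self.indels` is simply left unset when no indel file is given, so any later code that reads it would presumably need the same check.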

lDesiree commented 4 years ago

This worked, thank you very much for looking into the issue!