parklab / xTea

Comprehensive TE insertion identification with WGS/WES data from multiple sequencing technics

Mosaic calling - missing parameter? #78

Closed mukamel-lab closed 1 year ago

mukamel-lab commented 1 year ago

When I run xtea with the --mosaic option, I get the following error. It looks like an entry might be missing from the cutoff table returned by the function get_AF_cutoff?

Thanks for your help. Eran

Traceback (most recent call last):
  File "/home/emukamel/WGS_Line1/xtea_github/xTea/xtea/x_TEA_main.py", line 892, in
    xpf_mosic.run_call_mosaic(sf_xtea_rslt, sf_rmsk, i_min_copy_len, i_rep_type, sf_black_list, sf_new_out)
  File "/tuba/datasets/CZI_human_diversity/WGS_Line1/xtea_github/xTea/xtea/x_mosaic_calling.py", line 44, in run_call_mosaic
    sf_new_out_bf_black_list)
  File "/tuba/datasets/CZI_human_diversity/WGS_Line1/xtea_github/xTea/xtea/x_mosaic_calling.py", line 66, in call_mosaic_L1_from_bulk
    if af_filter.is_qualified_mosaic_rcd(rcd[-1], m_cutoff) == False:
  File "/tuba/datasets/CZI_human_diversity/WGS_Line1/xtea_github/xTea/xtea/x_post_filter.py", line 1164, in is_qualified_mosaic_rcd
    b_pass = self.is_ins_pass_mosaic_cutoff(m_cutoff, s_type_ins, f_ef_clip, f_ef_disc, f_clip_full_map, f_disc_concod)
  File "/tuba/datasets/CZI_human_diversity/WGS_Line1/xtea_github/xTea/xtea/x_post_filter.py", line 1176, in is_ins_pass_mosaic_cutoff
    (f_upper_af, f_lower_af) = m_cutoff[s_type]
KeyError: 'orphan_or_sibling_transduction'
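For illustration only, the last frame of the traceback can be reproduced in isolation: a dictionary lookup fails when the insertion type has no entry in the cutoff table. The sketch below is not xTea code. The table contents, the key `example_type`, and the helper `lookup_af_cutoff` are hypothetical; only the names `m_cutoff`, `f_upper_af`, `f_lower_af`, and the missing key come from the traceback.

```python
# Hypothetical cutoff table. The real keys and values that xTea's
# get_AF_cutoff returns are not shown in this thread.
m_cutoff = {"example_type": (0.5, 0.0)}

s_type = "orphan_or_sibling_transduction"

# Direct indexing, as in the failing line, raises KeyError for an unknown type.
try:
    f_upper_af, f_lower_af = m_cutoff[s_type]
    missing = None
except KeyError as err:
    missing = err.args[0]

print("missing insertion type:", missing)


def lookup_af_cutoff(m_cutoff, s_type, default=None):
    """Hypothetical helper: return the (upper, lower) AF cutoff pair for
    s_type, or `default` when the type has no entry in the table."""
    return m_cutoff.get(s_type, default)
```

A lookup with an explicit default makes the "no cutoff defined for this type" case visible to the caller instead of aborting the run. Whether such records should then be skipped or filtered is a decision for the tool's author.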

simoncchu commented 1 year ago

Hi Eran,

The mosaic mode is not officially released in the current version. It was originally designed to call mosaic L1 insertions in high-depth WGS data (e.g. 200-300X coverage). The main difference from germline calling is that it sets very low cutoffs instead of adjusting the parameters automatically based on the depth. When I tested this module, it reported many false positives, and because mosaic insertions are rare I don't have a good benchmark, so I didn't release the module. I can add an option in the next release.

If you want to call mosaic insertions with the current release, you can use the --user option (without --mosaic) and set user-defined cutoffs, e.g. --nclip 2, --cr 0, and --nd 2. Note that it will take much longer to run and will report many false positives.
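As a rough sketch of the workaround described above, the snippet below rewrites an existing argument list by dropping `--mosaic` and appending the user-defined cutoff flags. The flags `--user`, `--nclip`, `--cr`, and `--nd` and their example values come from this reply. The helper name `to_user_cutoff_args` and the placeholder base command are assumptions; a real invocation needs further arguments that are not shown in this thread.

```python
def to_user_cutoff_args(args, nclip=2, cr=0, nd=2):
    """Hypothetical helper: remove --mosaic from an argument list and
    append the user-defined cutoff flags named in the reply."""
    kept = [a for a in args if a != "--mosaic"]
    return kept + [
        "--user",
        "--nclip", str(nclip),
        "--cr", str(cr),
        "--nd", str(nd),
    ]


# Placeholder command line; the other required arguments are omitted
# because they do not appear in this thread.
base_args = ["xtea", "--mosaic"]

new_args = to_user_cutoff_args(base_args)
print(" ".join(new_args))
```

The default values mirror the examples given in the reply. They are starting points for very permissive calling, not validated settings.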

mukamel-lab commented 1 year ago

Thanks for the clarification!