parklab / xTea

Comprehensive TE insertion identification with WGS/WES data from multiple sequencing technics

Failed to retrieve block: unexpected end of file #68

Closed mhguo1 closed 1 year ago

mhguo1 commented 1 year ago

When I try to run xTea on LINE1s for short-read WGS, the program runs for a while and then fails with the errors below. I get a similar error when I run other TE types on my samples. It does appear that /working_dir/str_test/fxs/te/test/135/L1/tmp/cns/candidate_sites_all_disc.fa does not exist, but I'm not sure how to fix this.

[main] Version: 0.7.17-r1198-dirty
[main] CMD: bwa mem -t 1 -o /working_dir/str_test/fxs/te/test/135/L1/tmp/cns/temp_disc.sam /working_dir/software/xTea/annot/ref/consensus/LINE1.fa /working_dir/str_test/fxs/te/test/135/L1/tmp/cns/candidate_sites_all_disc.fa
[main] Real time: 24.528 sec; CPU: 24.439 sec
[E::fai_retrieve] Failed to retrieve block: unexpected end of file
Traceback (most recent call last):
  File "/working_dir/software/xTea/xTea-0.1.9/xtea/x_TEA_main.py", line 599, in
    xtransduction.call_candidate_transduction_v3(sf_tmp_slct, sf_candidate_list, x_cd_filter,
  File "/working_dir/software/xTea/xTea-0.1.9/xtea/x_transduction.py", line 189, in call_candidate_transduction_v3
    self.construct_novel_flanks(l_fl_polymorpic, sf_flank, i_flank_length, sf_flank_with_poly)
  File "/working_dir/software/xTea/xTea-0.1.9/xtea/x_transduction.py", line 662, in construct_novel_flanks
    xref.gnrt_flank_regions_of_polymerphic_insertions(l_fl_polymorphic, i_flank_lenth, self.sf_reference,
  File "/working_dir/software/xTea/xTea-0.1.9/xtea/x_reference.py", line 272, in gnrt_flank_regions_of_polymerphic_insertions
    s_left_region = f_fa.fetch(ref_chrm, istart, ins_pos)
  File "pysam/libcfaidx.pyx", line 317, in pysam.libcfaidx.FastaFile.fetch
OSError: [Errno 25] b'Inappropriate ioctl for device'
Traceback (most recent call last):
  File "/working_dir/software/xTea/xTea-0.1.9/xtea/x_TEA_main.py", line 695, in
    xorphan.re_slct_with_clip_raw_disc_sites(sf_raw_disc, sf_output_tmp, n_disc_cutoff, xannotation,
  File "/working_dir/software/xTea/xTea-0.1.9/xtea/x_transduction.py", line 54, in re_slct_with_clip_raw_disc_sites
    m_te_candidates=xintmdt.load_in_candidate_list_str_version(sf_candidates) #TE candidates
  File "/working_dir/software/xTea/xTea-0.1.9/xtea/x_intermediate_sites.py", line 184, in load_in_candidate_list_str_version
    with open(sf_candidate_list) as fin_candidate_sites:
FileNotFoundError: [Errno 2] No such file or directory: '/working_dir/str_test/fxs/te/test/135/L1/candidate_disc_filtered_cns2.txt.all_non_sibling_td.txt'
Traceback (most recent call last):
  File "/working_dir/software/xTea/xTea-0.1.9/xtea/x_TEA_main.py", line 896, in
    xpost_filter.run_post_filtering(sf_xtea_rslt, sf_rmsk, i_min_copy_len, i_rep_type, f_cov, sf_black_list,
  File "/working_dir/software/xTea/xTea-0.1.9/xtea/x_post_filter.py", line 416, in run_post_filtering
    l_old_rcd=xtea_parser.load_in_xTEA_rslt(sf_xtea_rslt)
  File "/working_dir/software/xTea/xTea-0.1.9/xtea/x_post_filter.py", line 495, in load_in_xTEA_rslt
    with open(sf_rslt) as fin_in:
FileNotFoundError: [Errno 2] No such file or directory: '/working_dir/str_test/fxs/te/test/135/L1/candidate_disc_filtered_cns2.txt'
Traceback (most recent call last):
  File "/working_dir/software/xTea/xTea-0.1.9/xtea/x_TEA_main.py", line 896, in
    xpost_filter.run_post_filtering(sf_xtea_rslt, sf_rmsk, i_min_copy_len, i_rep_type, f_cov, sf_black_list,
  File "/working_dir/software/xTea/xTea-0.1.9/xtea/x_post_filter.py", line 416, in run_post_filtering
    l_old_rcd=xtea_parser.load_in_xTEA_rslt(sf_xtea_rslt)
  File "/working_dir/software/xTea/xTea-0.1.9/xtea/x_post_filter.py", line 495, in load_in_xTEA_rslt
    with open(sf_rslt) as fin_in:
FileNotFoundError: [Errno 2] No such file or directory: '/working_dir/str_test/fxs/te/test/135/L1/candidate_disc_filtered_cns2.txt.high_confident'
Traceback (most recent call last):
  File "/working_dir/software/xTea/xTea-0.1.9/xtea/x_TEA_main.py", line 1033, in
    gff.annotate_results(sf_input, sf_output)
  File "/working_dir/software/xTea/xTea-0.1.9/xtea/x_gene_annotation.py", line 238, in annotate_results
    with open(sf_ori_rslt) as fin_rslt, open(sf_out, "w") as fout_rslt:
FileNotFoundError: [Errno 2] No such file or directory: '/working_dir/str_test/fxs/te/test/135/L1/candidate_disc_filtered_cns.txt.high_confident.post_filtering.txt'
/working_dir/software/xTea/xtea_venv/lib/python3.9/site-packages/sklearn/base.py:329: UserWarning: Trying to unpickle estimator LabelEncoder from version 1.0.1 when using version 1.1.3. This might lead to breaking code or invalid results. Use at your own risk. For more info please refer to: https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations
  warnings.warn(
Traceback (most recent call last):
  File "/working_dir/software/xTea/xTea-0.1.9/xtea/x_TEA_main.py", line 932, in
    gc.predict_for_site(sf_model, sf_xTEA, sf_new)
  File "/working_dir/software/xTea/xTea-0.1.9/xtea/x_genotype_classify.py", line 146, in predict_for_site
    with open(sf_xTEA) as fin_xTEA, open(sf_new, "w") as fout_new:
FileNotFoundError: [Errno 2] No such file or directory: '/working_dir/str_test/fxs/te/test/135/L1/candidate_disc_filtered_cns.txt.high_confident.post_filtering_with_gene.txt'
sort: cannot read: /working_dir/str_test/fxs/te/test/135/L1/candidate_disc_filtered_cns.txt.high_confident.post_filtering_with_gene_gntp.txt: No such file or directory
Traceback (most recent call last):
  File "/working_dir/software/xTea/xTea-0.1.9/xtea/x_TEA_main.py", line 964, in
    gvcf.cvt_raw_rslt_to_gvcf(s_sample_id, sf_bam, sf_raw_rslt, i_rep_type, sf_ref, sf_vcf)
  File "/working_dir/software/xTea/xTea-0.1.9/xtea/x_gvcf.py", line 199, in cvt_raw_rslt_to_gvcf
    with open(sf_raw_rslt_sorted) as fin_rslt:
FileNotFoundError: [Errno 2] No such file or directory: '/working_dir/str_test/fxs/te/test/135/L1/candidate_disc_filtered_cns.txt.high_confident.post_filtering_with_gene_gntp.txt.sorted'
Usage: x_TEA_main.py [options]

x_TEA_main.py: error: no such option: --bamsnap
Usage: x_TEA_main.py [options]

x_TEA_main.py: error: no such option: --bamsnap
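
In a log like this one, the later FileNotFoundError tracebacks may simply be knock-on effects of the first failure, since each step reads a file the previous step was supposed to write. A minimal sketch for pulling out the earliest exception line follows. It is not part of xTea; the function name `first_exception` is hypothetical, and it assumes the log has one entry per line, as Python normally prints it.

```python
import re

# Matches the last line of a Python traceback, e.g.
# "OSError: [Errno 25] ..." or "FileNotFoundError: [Errno 2] ...".
EXCEPTION_LINE = re.compile(r"^(\w+(?:Error|Exception)): (.*)$", re.MULTILINE)


def first_exception(log_text):
    """Return (exception_name, message) for the earliest exception line.

    Hypothetical helper for triage only; returns None when no line in
    the log looks like the final line of a traceback.
    """
    match = EXCEPTION_LINE.search(log_text)
    if match is None:
        return None
    return match.group(1), match.group(2)
```

Run against a job's .err file (for example, `first_exception(open("job.err").read())`, where the file name is a placeholder), this shows which exception appeared first, which is usually the place to start reading.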

simoncchu commented 1 year ago

Could you post the command you ran, along with the memory and core settings?

mhguo1 commented 1 year ago

Here's the command I ran:

/working_dir/software/xTea/xTea-0.1.9/bin/xtea --xtea /working_dir/software/xTea/xTea-0.1.9/xtea -l /working_dir/software/xTea/annot/ref/ -g /working_dir/software/xTea/annot/gencode.v33.primary_assembly.annotation.gff3 -i sample_id.txt -b bam.list -x null -r /working_dir/ref/genomes/hg38.fa.gz -p /working_dir/str_test/fxs/te/test/ -o run_test.sh -f 5907 -y 15 --lsf

And here are the memory and cores:

#BSUB -n 1
#BSUB -M 20G
#BSUB -q None
#BSUB -o 135_%J.out
#BSUB -e 135_%J.err
#BSUB -R "rusage[mem=20.0G]"

Thank you!

simoncchu commented 1 year ago

For -r, try an unzipped version of the reference rather than hg38.fa.gz. You can also add the -n option to run in parallel.
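
A minimal sketch of preparing an unzipped reference before passing it to -r, using only the Python standard library. This is not part of xTea; the helper names `is_gzip_compressed` and `ensure_uncompressed` are hypothetical, and the output naming (dropping a trailing .gz) is just one reasonable choice.

```python
import gzip
import shutil
from pathlib import Path

# Every gzip stream starts with these two bytes. bgzip output starts with
# them too, so this check cannot tell plain gzip and bgzip apart.
GZIP_MAGIC = b"\x1f\x8b"


def is_gzip_compressed(path):
    """Return True if the file begins with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def ensure_uncompressed(reference, output=None):
    """Return a path to an uncompressed copy of `reference`.

    If the file is not gzip-compressed, its own path is returned.
    Otherwise it is decompressed to `output`; by default (an assumption
    of this sketch) that is the same name without the ".gz" suffix.
    """
    reference = Path(reference)
    if not is_gzip_compressed(reference):
        return reference
    if output is None:
        if reference.suffix == ".gz":
            output = reference.with_suffix("")
        else:
            output = reference.with_name(reference.name + ".unzipped")
    output = Path(output)
    # Stream the data so a multi-gigabyte genome is never held in memory.
    with gzip.open(reference, "rb") as src, open(output, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return output
```

For the command in this thread, `ensure_uncompressed("/working_dir/ref/genomes/hg38.fa.gz")` would write /working_dir/ref/genomes/hg38.fa, and that path would then go to -r. An uncompressed human genome takes roughly 3 GB of disk space.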

mhguo1 commented 1 year ago

Unzipping the reference fasta file has resolved this error!

Thanks, Michael