Closed TingQi2020 closed 6 months ago
Hi Ting, it seems like there might be an encoding issue with your input files. Could you check the encoding of the .tab file mentioned in the error? If the issue persists, could you provide a sample of the file causing the error? This will help diagnose the problem more effectively.
Thanks for your prompt response, Tianyuan. I've checked the .tab file and it looks OK. From the log pasted above, it seems that SQANTI3 failed while reading the .bam file, which was generated by STAR. I have attached the .tab file for your test. Thank you in advance. JXBJ23BD-0199-chr21_SJ.out.tab.zip
Hi @TingQi2020
The problem is in reading the BAM file, not the SJ.out.tab. If the BAM file you want to use is the only one in your ${BASE_DIRT}/SR_bam/ directory, you can provide the directory path without the file name and it will work. Otherwise, you can create a fofn (file of file names) containing the full path to the BAM file that you want to use, and provide this fofn to the --SR_bam option.
Sorry for the inconvenience and hope this fixes your problem, Alejandro
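For anyone landing here with the same error, here is a minimal Python sketch of the fofn workaround described above. It assumes the fofn is a plain text file with one full BAM path per line; that layout is an assumption based on the reply, and `write_fofn` is a hypothetical helper name, not part of SQANTI3.

```python
from pathlib import Path


def write_fofn(bam_paths, fofn_path):
    """Write one absolute BAM path per line to fofn_path.

    Assumption: the --SR_bam option accepts a plain text file listing
    one full path per line (hypothetical helper, not SQANTI3 code).
    """
    lines = [str(Path(p).resolve()) for p in bam_paths]
    out = Path(fofn_path)
    out.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return out
```

The resulting file would then be passed to --SR_bam in place of the .bam path, for example `--SR_bam ${BASE_DIRT}/SR_bam/bams.fofn` (file name chosen here only for illustration).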
Is there an existing issue for this?
Have you loaded the SQANTI3.env conda environment?
Problem description
No response
Code sample
/storage/yangjianLab/qiting/software/SQANTI3/sqanti3_qc.py \
    ${GTF_NOVEL} ${GTF_REF} ${fasta_file} \
    --force_id_ignore \
    -o ONT_221samples_stringtie \
    -d ${BASE_DIRT}/QC_output \
    -c ${BASE_DIRT}/SR_Junction/JXBJ23BD-0199-chr21_SJ.out.tab \
    --SR_bam ${BASE_DIRT}/SR_bam/JXBJ23BD-0199-chr21_Aligned.sortedByCoord.out.md.bam \
    --skipORF \
    --CAGE_peak ${SQANTI_FOLDER}/data/ref_TSS_annotation/human.refTSS_v3.1.hg38.bed \
    --polyA_motif_list ${SQANTI_FOLDER}/data/polyA_motifs/mouse_and_human.polyA_motif.txt \
    --cpus 4 \
    --report both
Error
Input pattern: /storage/yangjianLab/qiting/bulkRNA_LR/01.QC_and_Quantification/1.4.quantification/SQANTI/SR_Junction/JXBJ23BD-0199-chr21_SJ.out.tab.
The following files found and to be read as junctions:
/storage/yangjianLab/qiting/bulkRNA_LR/01.QC_and_Quantification/1.4.quantification/SQANTI/SR_Junction/JXBJ23BD-0199-chr21_SJ.out.tab
3547 junctions read. 3 junctions added to both strands because no strand information from STAR.
Using provided BAM files for calculating TSS ratio
Traceback (most recent call last):
  File "/storage/yangjianLab/qiting/software/SQANTI3/sqanti3_qc.py", line 2572, in <module>
    main()
  File "/storage/yangjianLab/qiting/software/SQANTI3/sqanti3_qc.py", line 2555, in main
    run(args)
  File "/storage/yangjianLab/qiting/software/SQANTI3/sqanti3_qc.py", line 1875, in run
    isoforms_info, ratio_TSS_dict = isoformClassification(args, isoforms_by_chr, refs_1exon_by_chr, refs_exons_by_chr, junctions_by_chr, junctions_by_gene, start_ends_by_gene, genome_dict, indelsJunc, orfDict, corrGTF)
  File "/storage/yangjianLab/qiting/software/SQANTI3/sqanti3_qc.py", line 1533, in isoformClassification
    for file in b:
  File "/home/yangjianLab/qiting/miniconda3/envs/SQANTI3.env/lib/python3.10/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
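A note on reading this error: BAM files are BGZF-compressed, and every gzip-compatible stream starts with the bytes 0x1f 0x8b. The traceback ends in `for file in b:`, which suggests the path was iterated as a text file, and 0x8b at position 1 is not a valid UTF-8 start byte. The sketch below reproduces that symptom on any gzip-compressed file. The helper names are hypothetical and this is not SQANTI3's own code.

```python
GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip/BGZF stream


def looks_gzip_compressed(path):
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def text_read_error(path):
    """Iterate the file as UTF-8 text; return the decode error message, or None."""
    try:
        with open(path, "r", encoding="utf-8") as handle:
            for _ in handle:
                pass
    except UnicodeDecodeError as exc:
        return str(exc)
    return None
```

Running `looks_gzip_compressed` on the path given to --SR_bam is a quick way to tell whether a binary BAM was passed where a text list of paths was expected.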
Anything else?
No response