WangHYLab / fcirc

A Python pipeline for exploring linear and circular RNAs of known fusions
MIT License

KeyError: 'AFDN' when I run fcirc with my own fusion gene pairs #1

Closed nongbaoting closed 2 years ago

nongbaoting commented 3 years ago

I successfully ran the commands below to test whether I can use my own fusion gene pairs.

cd test_data
head ../reference_fusion_info/fusion_table.tsv|tail -5 > fusion_table.tsv
sed  -i -e 's/-/--/g' fusion_table.tsv
python3 ../build_graph.py --genome $hg38 --gtf $hg38_gtf --tab `pwd`/fusion_table.tsv
cd ../fusion_total_index
hisat2-build fusiongenes_ref_U.fa fusiongenes_ref_U
hisat2-build fusiongenes_ref_V.fa fusiongenes_ref_V

Then I tried to run fcirc.py, but I encountered an error:

python ../fcirc.py  -x /dat1/dat/ref/hg38/hg38+gencode.v32/hisat2_index/hg38  -f ../fusion_total_index -c ../fusion_genes_coordinate.txt  -1 ./test.fastq
2020-08-08 11:09:24,807 fcirc.py[line:452] INFO [2020-08-08 11:09:24] Start running # ../fcirc.py -x /dat1/dat/ref/hg38/hg38+gencode.v32/hisat2_index/hg38 -f ../fusion_total_index -c ../fusion_genes_coordinate.txt -1 ./test.fastq
[2020-08-08 11:09:29] Finish mapping reads to transcription!
[2020-08-08 11:09:29] Finish mapping reads to fusion references U!
[2020-08-08 11:09:29] Finish mapping reads to fusion references V!
[2020-08-08 11:09:30] Finish dropping unmapped read in fusion references U and V!
Traceback (most recent call last):
  File "../fcirc.py", line 453, in <module>
    main(sys.argv[1:])
  File "../fcirc.py", line 239, in main
    mapped_filtered_samU_path, mapped_filtered_samV_path = getpairedreads(mapped_samU_path, mapped_samV_path, pattern, fusion_idx_dir_path)
  File "/home/nbt2/proj/pipetest/fcirc-2/fcirc/getpairedreads.py", line 67, in getpairedreads
    partners = fusion_dic[read.reference_name]
KeyError: 'AFDN'
nongbaoting commented 3 years ago

I have figured out the bug.

Just change the fusion_table.tsv in the final reference directory:

sed -i -e 's/--/-/g' fusion_total_index/fusion_table.tsv
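
The KeyError in the traceback comes from a plain dictionary lookup, `fusion_dic[read.reference_name]`, so the message names only the missing gene. A minimal sketch of a more informative lookup is below. The helper `lookup_partners` is hypothetical and not part of fcirc, and the assumption that `fusion_dic` maps a gene name to its partner genes is inferred from the traceback, not confirmed against the source.

```python
def lookup_partners(fusion_dic, reference_name):
    """Return the partners of reference_name, or raise a KeyError that hints at the cause.

    Assumption: fusion_dic maps a gene name to its fusion partners.
    """
    try:
        return fusion_dic[reference_name]
    except KeyError:
        # Show a few existing keys so a separator mismatch is easy to spot.
        sample = ", ".join(sorted(fusion_dic)[:5])
        raise KeyError(
            f"{reference_name!r} is not a key of the fusion table; "
            f"check the pair separator in fusion_table.tsv (first keys: {sample})"
        ) from None
```

A message like this points at the fusion table and its separator, which is where the fix above was applied.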

gnilihzeux commented 3 years ago

Eh, I ran into this with python3 build_graph.py too. So gene names in the GTF that contain "-" should be changed to use another character, such as "_".
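
To illustrate why a single "-" is ambiguous as a pair separator when gene names may contain hyphens, here is a rough sketch that tries every hyphen position and keeps the splits where both halves are known genes. The function `split_single_hyphen_pair` and the `known` set are hypothetical examples, not part of fcirc.

```python
def split_single_hyphen_pair(name, known_genes):
    """Return every (left, right) split of name at a '-' where both halves are known genes."""
    candidates = []
    for i, ch in enumerate(name):
        if ch == "-":
            left, right = name[:i], name[i + 1:]
            if left in known_genes and right in known_genes:
                candidates.append((left, right))
    return candidates


# Hypothetical gene set, for illustration only.
known = {"AFDN", "AFDN-DT", "KMT2A", "HLA-A", "HLA-B"}
pairs = split_single_hyphen_pair("HLA-A-HLA-B", known)
```

If the returned list has more than one entry, the pair name cannot be resolved from the gene set alone, which is the case a dedicated separator avoids.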

zhixue commented 3 years ago

I have figured out the bug.

Just change the fusion_table.tsv in the final reference directory:

sed -i -e 's/--/-/g' fusion_total_index/fusion_table.tsv

Thank you for using fcirc.

We found that '-' occurs in some human gene names, such as 'AFDN-DT', 'HLA-A', and 'HLA-B'. So we used '--' to join the two genes of a fusion pair, the same as STAR-Fusion does.
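
The '--' convention described above can be sketched as follows. This is an illustration only, not the fcirc implementation: the names `FUSION_SEP` and `split_fusion_name` are made up for the example.

```python
FUSION_SEP = "--"  # pair separator described in this thread


def split_fusion_name(name):
    """Split 'GENE_A--GENE_B' into (GENE_A, GENE_B), keeping hyphens inside gene names."""
    parts = name.split(FUSION_SEP)
    # Exactly two non-empty gene names are expected.
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"expected 'GENE_A{FUSION_SEP}GENE_B', got {name!r}")
    return parts[0], parts[1]


gene_a, gene_b = split_fusion_name("AFDN-DT--HLA-A")
```

Because the split uses only the double hyphen, the single hyphens inside 'AFDN-DT' and 'HLA-A' are left intact, while a name joined with a single '-' is rejected instead of being silently misparsed.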