cafelton / FLAIR-fusion

Key Error:'GL000250.2' #4

Open RDorney opened 1 year ago

RDorney commented 1 year ago

Hello, I ran FLAIR-Fusion on my data and received the following error message:

identifying breakpoint sequence - making sam file
b''
identifying breakpoint sequence - searching sam file
checking multi-mapping distance - searching sam file
num pre final filter 232
Traceback (most recent call last):
  File "/data2/ryleyd/FLAIR-fusion/19-03-2021-fasta-to-fusions-pipe.py", line 681, in <module>
    leftSS1 = binarySearch(junctions[loc.split('-')[-1]], temp[0][int(len(temp[0])/2)])
KeyError: 'GL000250.2'

This indicates that GL000250.2 is not a key in the junctions dictionary, even though the contig is present in the GTF file I'm using. I am not sure how to resolve this error.

 grep GL000250.2 ~/reference_files/FLAIR-Fusion_extra_reference_files/gencode.v43.chr_patch_hapl_scaff.annotation-short.gtf |head
GL000250.2      HAVANA  gene    49218   49589   .       -       .       gene_id "ENSG00000272603.1"; gene_type "processed_pseudogene"; gene_name "NOP56P1"; level 2; hgnc_id "HGNC:13962"; havana_gene "OTTHUMG00000148686.1";
GL000250.2      HAVANA  gene    103454  104591  .       -       .       gene_id "ENSG00000272793.1"; gene_type "lncRNA"; gene_name "ENSG00000272793"; level 2; havana_gene "OTTHUMG00000148688.1";
GL000250.2      HAVANA  gene    125209  128934  .       -       .       gene_id "ENSG00000272857.1"; gene_type "lncRNA"; gene_name "LINC01623"; level 2; hgnc_id "HGNC:52050"; havana_gene "OTTHUMG00000148687.1";
GL000250.2      HAVANA  gene    127000  127631  .       +       .       gene_id "ENSG00000272613.1"; gene_type "processed_pseudogene"; gene_name "RPL13P"; level 2; hgnc_id "HGNC:13978"; havana_gene "OTTHUMG00000148690.1";
GL000250.2      HAVANA  gene    154416  155490  .       -       .       gene_id "ENSG00000236614.1"; gene_type "processed_pseudogene"; gene_name "ZNF90P2"; level 2; hgnc_id "HGNC:21687"; havana_gene "OTTHUMG00000148689.1";
GL000250.2      HAVANA  gene    162114  162906  .       +       .       gene_id "ENSG00000234411.1"; gene_type "lncRNA"; gene_name "HCG14"; level 2; hgnc_id "HGNC:18323"; havana_gene "OTTHUMG00000148692.1";
GL000250.2      HAVANA  gene    168586  189574  .       -       .       gene_id "ENSG00000234495.7"; gene_type "protein_coding"; gene_name "TRIM27"; level 2; hgnc_id "HGNC:9975"; havana_gene "OTTHUMG00000148849.1";
GL000250.2      ENSEMBL gene    181230  181329  .       +       .       gene_id "ENSG00000264025.1"; gene_type "snRNA"; gene_name "U6"; level 3;
GL000250.2      HAVANA  gene    251788  253069  .       +       .       gene_id "ENSG00000224642.2"; gene_type "lncRNA"; gene_name "HCG15"; level 2; hgnc_id "HGNC:18361"; tag "overlapping_locus"; havana_gene "OTTHUMG00000148693.1";
GL000250.2      HAVANA  gene    252385  254160  .       +       .       gene_id "ENSG00000241408.1"; gene_type "lncRNA"; gene_name "ENSG00000241408"; level 2; tag "overlapping_locus"; havana_gene "OTTHUMG00000148691.1";
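
One way to narrow this down is to check which feature types the GTF contains for each contig. The traceback does not show how `junctions` is populated, so the following is only a sketch under an assumption: if the script builds `junctions` from exon (or transcript) records, a contig that has only gene-level records would appear in the GTF but never become a key. The helper names and the example records are illustrative, not part of FLAIR-fusion.

```python
from collections import Counter, defaultdict

def features_by_contig(gtf_lines):
    """Count GTF feature types (column 3) for each contig (column 1)."""
    counts = defaultdict(Counter)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a well-formed GTF record
        counts[fields[0]][fields[2]] += 1
    return counts

def contigs_missing_feature(counts, feature="exon"):
    """Contigs that have records in the GTF but none of the given feature type."""
    return sorted(c for c, feats in counts.items() if feats[feature] == 0)

# Illustrative records only; the chr1 lines are made up for the example.
example = [
    'GL000250.2\tHAVANA\tgene\t49218\t49589\t.\t-\t.\tgene_id "ENSG00000272603.1";',
    'chr1\tHAVANA\tgene\t100\t900\t.\t+\t.\tgene_id "hypothetical_gene";',
    'chr1\tHAVANA\texon\t100\t300\t.\t+\t.\tgene_id "hypothetical_gene";',
]
counts = features_by_contig(example)
print(contigs_missing_feature(counts, "exon"))
```

To run it on a real annotation, pass an open file handle instead of `example`. If GL000250.2 shows up in the result, the missing key is likely explained by the GTF contents rather than by the lookup itself.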
RDorney commented 1 year ago

I added the following check so that FLAIR-Fusion keeps running past this point, but I am still not sure how to resolve the underlying error:

if loc.split('-')[-1] in junctions:
    leftSS1 = binarySearch(junctions[loc.split('-')[-1]], temp[0][int(len(temp[0])/2)])
else:
    print(f"Key {loc.split('-')[-1]} not found in junctions dictionary")
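
Note that in the workaround above, `leftSS1` is not assigned when the key is missing, so any later code that reads it may fail or reuse a value from an earlier iteration. A minimal sketch of a stricter variant follows. It assumes the lookup can be wrapped in a helper and the candidate skipped when the contig is absent. `binary_search` here is a hypothetical stand-in (nearest value in a sorted list) and may not match what the script's `binarySearch` does. The `loc` strings and positions are made up.

```python
from bisect import bisect_left

def binary_search(sorted_positions, target):
    """Hypothetical stand-in for binarySearch: nearest value in a sorted, non-empty list."""
    i = bisect_left(sorted_positions, target)
    candidates = sorted_positions[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda p: abs(p - target))

def nearest_junction(junctions, loc, position):
    """Return the nearest junction on loc's contig, or None if the contig has no entry."""
    contig = loc.split('-')[-1]      # same key derivation as in the traceback
    sites = junctions.get(contig)    # avoids KeyError for unknown contigs
    if not sites:
        return None
    return binary_search(sites, position)

# Made-up data: chr1 has junctions, GL000250.2 does not.
junctions = {"chr1": [100, 200, 300]}
for loc, pos in [("x-chr1", 190), ("x-GL000250.2", 49300)]:
    left_ss1 = nearest_junction(junctions, loc, pos)
    if left_ss1 is None:
        print(f"skipping {loc}: contig not found in junctions dictionary")
        continue  # never carry a stale value into later steps
    print(loc, left_ss1)
```

Skipping silences the error but also drops any fusion candidate on that contig, so it does not address why the contig is missing from `junctions` in the first place.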
cafelton commented 1 year ago

Hi,

I'm sorry that I haven't been responding quickly to these issues. I'm currently working on FLAIR-fusion V2, which is a fairly complete redesign, and that makes it hard to spend time on the old version. I'll have time to look into your issues in early May, though, and I do appreciate the feedback.

Best,
Colette Felton
Brooks Lab
Biomolecular Engineering and Bioinformatics
UCSC
