BrooksLabUCSC / flair

Full-Length Alternative Isoform analysis of RNA

annotation_reliant index out of range #205

Closed mintstella0419 closed 2 years ago

mintstella0419 commented 2 years ago

I ran the following command:

python FLAIR/flair.py collapse \
    -t 12 \
    -r all_pacbio_samples_chr22.fastq \
    -q all_pacbio_samples_chr22.psl \
    -g GRCh38_no_alt_analysis_chr22.fa \
    -p gencode_v39_promoters_adjusted.bed \
    -f gencode.v39.annotation_chr22.gtf \
    --stringent \
    --annotation_reliant generate \
    -o collapse_chr22_annotation

and got the following error:

Filtering out reads without promoter-supported TSS
Making transcript fasta using annotated gtf and genome sequence
Aligning reads to reference transcripts
Counting supporting reads for annotated transcripts
Setting up unassigned reads for flair-collapse novel isoform detection
Annotated ends extracted from GTF
Read data extracted
Single-exon genes grouped, collapsing
Renaming isoforms using gtf
Aligning reads to first-pass isoform reference
Filtering isoforms by read coverage
Traceback (most recent call last):
  File "FLAIR/bin/filter_collapsed_isoforms_from_annotation.py", line 167, in <module>
    chrom, name, sizes, starts = get_info(line, isbed)
  File "FLAIR/bin/filter_collapsed_isoforms_from_annotation.py", line 66, in get_info
    chrom, name = line[13], line[9]
IndexError: list index out of range

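The issue appears to be a file-extension mismatch. The file args.o+'annotated_transcripts.supported'+ext is given a psl extension because my query file (-q) has a psl extension, but my annotated_bed file has a bed extension. Since match_counts.py uses the annotated_bed file to generate args.o+'annotated_transcripts.supported'+ext, the output should be treated as a bed file, not a psl file. As it stands, filter_collapsed_isoforms_from_annotation.py seems to parse the BED-formatted lines as PSL, and line[13] is out of range because a BED line has too few columns.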
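To illustrate the mismatch, here is a minimal sketch. It is not FLAIR's actual code: the helper `chrom_and_name` and the example BED line are hypothetical, and it assumes standard column layouts (PSL: 21 columns, with qName at index 9 and tName at index 13; BED: chrom at index 0 and name at index 3). It reproduces the IndexError by applying PSL-style indexing to a 12-column BED line, then shows one possible way to choose the indices from the column count rather than from the file extension.

```python
def chrom_and_name(fields):
    """Return (chrom, name) from a tab-split BED or PSL line.

    Hypothetical helper: the format is inferred from the number of
    columns instead of from a file extension.
    """
    if len(fields) >= 21:
        # PSL layout: tName is column 13, qName is column 9 (0-based)
        return fields[13], fields[9]
    if len(fields) >= 4:
        # BED layout: chrom is column 0, name is column 3 (0-based)
        return fields[0], fields[3]
    raise ValueError("expected a BED or PSL line, got %d columns" % len(fields))


# Made-up BED12 line, for illustration only
bed_line = "chr22\t1000\t5000\tENST_demo\t0\t+\t1000\t5000\t0\t2\t100,200\t0,3800"
fields = bed_line.split("\t")

try:
    # PSL-style indexing on a BED line, as in the traceback above
    chrom, name = fields[13], fields[9]
except IndexError as err:
    print("PSL-style indexing failed:", err)

# Column-count-based lookup handles the BED line
print(chrom_and_name(fields))
```

A check like this only guards against the symptom; the underlying fix would presumably be to write the supported-transcripts file with the extension that matches its actual format.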

Is this a known issue?