BrooksLabUCSC / flair

Full-Length Alternative Isoform analysis of RNA

Error while running collapse #267

Closed kundariyahardik closed 10 months ago

kundariyahardik commented 12 months ago

**Copy and paste the exact command you tried to run**

flair-master/bin/flair collapse -g TAIR10_chr_all.fa -q Chr1_all_corrected.bed \
  -r vegetable.flw_1.subreads.fastq.gz,vegetable.flw_2.subreads.fastq.gz \
  --output flw --gtf Arabidopsis_thaliana.TAIR10.38.gtf \
  --threads 6 --generate_map --annotation_reliant generate

How did you install Flair? (We'd prefer it if you used one of the top two because they are the least likely to have package compatibility problems.)

  1. downloaded latest release from github

What happened? I am getting the following error while running collapse:

Starting collapse...
Writing temporary files to /tmp/tmpxuyylh47/
Making transcript fasta using annotated gtf and genome sequence
Traceback (most recent call last):
  File "/data5/hsk13/IsoSeq/flair-master/src/flair/gtf_to_bed.py", line 108, in <module>
    if blockcount > 1 and blockstarts[0] > blockstarts[1]:  # need to reverse exons
IndexError: list index out of range
Traceback (most recent call last):
  File "/data5/hsk13/IsoSeq/flair-master/bin/flair", line 1126, in <module>
    main()
  File "/data5/hsk13/IsoSeq/flair-master/bin/flair", line 1047, in main
    status = collapse()
  File "/data5/hsk13/IsoSeq/flair-master/bin/flair", line 523, in collapse
    subprocess.check_call([sys.executable, path+'gtf_to_bed.py', args.f, args.annotated_bed, '--include_gene'])
  File "/data5/hsk13/anaconda3/envs/py3.10/lib/python3.10/subprocess.py", line 369, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/data5/hsk13/anaconda3/envs/py3.10/bin/python3', '/data5/hsk13/IsoSeq/flair-master/src/flair/gtf_to_bed.py', '/data5/hsk13/IsoSeq/Arabidopsis_thaliana.TAIR10.38.gtf', 'flw.annotated_transcripts.bed', '--include_gene']' returned non-zero exit status 1.

What else do we need to know? I tried running gtf_to_bed.py directly with the following command:

/data5/hsk13/IsoSeq/flair-master/src/flair/gtf_to_bed.py /data5/hsk13/IsoSeq/Arabidopsis_thaliana.TAIR10.38.gtf TAIR10_38_gtf.bed

It gives the same error:

Traceback (most recent call last):
  File "/data5/hsk13/IsoSeq/flair-master/src/flair/gtf_to_bed.py", line 108, in <module>
    if blockcount > 1 and blockstarts[0] > blockstarts[1]:  # need to reverse exons
IndexError: list index out of range

Jeltje commented 10 months ago

Could your gtf file be corrupted? Do the last lines look OK? This error comes from the final gene.

It looks like your gtf was downloaded from Ensembl. I can't find version 38 anymore, but I downloaded the current file (57) and do not get your error.
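To check whether the last lines look OK, something like the following could help. It is only a rough sketch, not part of FLAIR: the names `open_text` and `last_gene_summary` are made up for illustration, and it assumes a standard 9-column, tab-separated GTF with quoted `gene_id` and `transcript_id` attributes. It stops at the first truncated line and otherwise reports which feature types the final gene carries per transcript.

```python
import gzip
import re
import sys
from collections import Counter

# Matches quoted GTF attributes such as: gene_id "AT1G01010";
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def open_text(path):
    """Open a plain or gzipped text file for reading (hypothetical helper)."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


def last_gene_summary(lines):
    """Return (gene_id, Counter) for the final run of records sharing a gene_id.

    The Counter maps (transcript_id, feature) to a record count.
    Raises ValueError on a line that does not have exactly 9 columns,
    which is what a truncated download usually looks like.
    """
    last_gene = None
    counts = Counter()
    for lineno, line in enumerate(lines, start=1):
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            raise ValueError(
                f"line {lineno}: expected 9 tab-separated columns, found {len(cols)}"
            )
        attrs = dict(ATTR_RE.findall(cols[8]))
        gene = attrs.get("gene_id")
        if gene != last_gene:
            # A new gene starts: forget the previous gene's counts.
            last_gene, counts = gene, Counter()
        counts[(attrs.get("transcript_id"), cols[2])] += 1
    return last_gene, counts


if __name__ == "__main__" and len(sys.argv) > 1:
    with open_text(sys.argv[1]) as handle:
        gene, counts = last_gene_summary(handle)
    print("last gene:", gene)
    for (transcript, feature), n in sorted(counts.items(), key=str):
        print(transcript, feature, n)
```

Saved under any file name, it would be run with the GTF path as its only argument. If the final gene's records are not contiguous in the file, only the last contiguous run is summarized.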

Jeltje commented 10 months ago

Closing because no response; please reopen if your problem has not been resolved.

byee4 commented 1 month ago

I hit this error too and could use some help. To process my dataset, I split the data per chromosome as recommended, but it seems to dislike the last gene in mm39 chr12 (Gencode V35). Below is my command; I've also attached the gzipped GTF file:

python /home/bay001/software/miniconda_tscc2/envs/flair-2.0.0/lib/python3.8/site-packages/flair/gtf_to_bed.py reference.chr12.gtf expt.isoforms.chr12.annotated_transcripts.bed --include_gene

reference.chr12.gtf.gz
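One thing that may be worth ruling out in a per-chromosome GTF is a transcript that is left with no `exon` records. Whether that is what triggers the `IndexError` here is an assumption, not something confirmed in this thread. The sketch below is independent of FLAIR; `transcripts_without_exons` is a hypothetical helper name, and it assumes quoted `transcript_id` attributes in column 9.

```python
import re
from collections import defaultdict

# Assumes attributes of the form: transcript_id "ENSMUST00000000001";
TRANSCRIPT_RE = re.compile(r'transcript_id "([^"]+)"')


def transcripts_without_exons(lines):
    """Return sorted transcript_ids that appear in the GTF but have no exon record."""
    exon_counts = defaultdict(int)
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # malformed line; not handled in this sketch
        match = TRANSCRIPT_RE.search(cols[8])
        if match is None:
            continue  # gene-level records carry no transcript_id
        # Register the transcript even when this record is not an exon.
        exon_counts[match.group(1)] += 1 if cols[2] == "exon" else 0
    return sorted(tid for tid, n in exon_counts.items() if n == 0)
```

For example, `transcripts_without_exons(open("reference.chr12.gtf"))` would list any such transcripts in the split file; an empty list means every transcript has at least one exon record and the cause lies elsewhere.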