BrooksLabUCSC / flair

Full-Length Alternative Isoform analysis of RNA

Issues implementing FLAIR2 #304

Closed Max-Tomlinson closed 6 months ago

Max-Tomlinson commented 9 months ago

I'm trying to run FLAIR for a benchmarking project and would love to make use of the performance increases described for FLAIR2. However, when running with --annotation_reliant, FLAIR collapse fails to match read names between the FASTQ file and the BED file (it runs fine without this argument). I aligned the data with minimap2 rather than FLAIR align. Could this be the issue, or is it related to something else?

Thanks, Max

flair collapse \
-g "/scratch/prj/dtr/Groups_WorkSpace/KerrinSmall/Max/Nanopore/references/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna" \
-q "/scratch/prj/dtr/Groups_WorkSpace/KerrinSmall/Max/Nanopore/FLAIR/all/all_samples_chrY.bed" \
-r "/scratch/prj/dtr/Groups_WorkSpace/KerrinSmall/Max/Nanopore/FLAIR/all/all_samples_chrY.fastq" \
-o "/scratch/prj/dtr/Groups_WorkSpace/KerrinSmall/Max/Nanopore/FLAIR/collapse/all_chrY" \
-t 16 \
-f "/scratch/prj/dtr/Groups_WorkSpace/KerrinSmall/Max/Nanopore/references/gencode.v44.annotation.gtf" \
--annotation_reliant generate \
-s 10 \
--check_splice \
--trust_ends
Writing temporary files to /tmp/tmp5ywz0mdh/    
Making transcript fasta using annotated gtf and genome sequence
Aligning reads to reference transcripts
Counting supporting reads for annotated transcripts
Setting up unassigned reads for flair-collapse novel isoform detection
49240 names do not match any names in fastq file(s)e.g. 115:541|681f5198-5538-4168-98c2-a3b3acc3727c in bed but not in fastq
Traceback (most recent call last):
  File "/users/k19043774/miniconda3/envs/flair/bin/flair", line 10, in <module>
    sys.exit(main())
  File "/users/k19043774/miniconda3/envs/flair/lib/python3.10/site-packages/flair/flair.py", line 1035, in main
    status = collapse()
  File "/users/k19043774/miniconda3/envs/flair/lib/python3.10/site-packages/flair/flair.py", line 560, in collapse
    subprocess.check_call([sys.executable, path+'subset_unassigned_reads.py', args.o+'annotated_transcripts.isoform.read.map.txt',
  File "/users/k19043774/miniconda3/envs/flair/lib/python3.10/subprocess.py", line 369, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/users/k19043774/miniconda3/envs/flair/bin/python', '/users/k19043774/miniconda3/envs/flair/lib/python3.10/site-packages/flair/subset_unassigned_reads.py', '/scratch/prj/dtr/Groups_WorkSpace/KerrinSmall/Max/Nanopore/FLAIR/collapse/all_chrY.annotated_transcripts.isoform.read.map.txt', '/scratch/prj/dtr/Groups_WorkSpace/KerrinSmall/Max/Nanopore/FLAIR/all/all_samples_chrY.bed', '10.0', '/scratch/prj/dtr/Groups_WorkSpace/KerrinSmall/Max/Nanopore/FLAIR/collapse/all_chrY.unassigned.bed', '/scratch/prj/dtr/Groups_WorkSpace/KerrinSmall/Max/Nanopore/FLAIR/all/all_samples_chrY.fastq']' returned non-zero exit status 1.

Jeltje commented 7 months ago

We're still reworking FLAIR so that it gives better error messages, because this one doesn't tell us what we need to know. Sorry about that. Could you try running the command from the final line of the traceback on its own:

/users/k19043774/miniconda3/envs/flair/lib/python3.10/site-packages/flair/subset_unassigned_reads.py \
/scratch/prj/dtr/Groups_WorkSpace/KerrinSmall/Max/Nanopore/FLAIR/collapse/all_chrY.annotated_transcripts.isoform.read.map.txt  \
/scratch/prj/dtr/Groups_WorkSpace/KerrinSmall/Max/Nanopore/FLAIR/all/all_samples_chrY.bed \
10.0 \
/scratch/prj/dtr/Groups_WorkSpace/KerrinSmall/Max/Nanopore/FLAIR/collapse/all_chrY.unassigned.bed \
/scratch/prj/dtr/Groups_WorkSpace/KerrinSmall/Max/Nanopore/FLAIR/all/all_samples_chrY.fastq

What error do you get then?
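
Since the log above reports names that are "in bed but not in fastq" (for example `115:541|681f5198-5538-4168-98c2-a3b3acc3727c`), it may also help to compare the two name sets directly. The following is a minimal sketch, not part of FLAIR: the function names are hypothetical, it assumes a plain 4-line-per-record FASTQ and a tab-separated BED with the read name in column 4, and it assumes (unconfirmed) that the text after the last `|` in a BED name is the original read ID.

```python
import gzip


def _open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    path = str(path)
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


def fastq_names(path):
    """Collect read IDs: header text after '@' up to the first whitespace.

    Assumes exactly 4 lines per FASTQ record (no wrapped sequences).
    """
    names = set()
    with _open_text(path) as fh:
        for i, line in enumerate(fh):
            if i % 4 == 0:  # header line of each record
                fields = line[1:].split()
                if fields:
                    names.add(fields[0])
    return names


def bed_names(path):
    """Collect the name field (column 4) of a tab-separated BED file."""
    names = set()
    with _open_text(path) as fh:
        for line in fh:
            cols = line.rstrip("\n").split("\t")
            if len(cols) >= 4:
                names.add(cols[3])
    return names


def compare_names(bed_path, fastq_path):
    """Summarise how many BED names are absent from the FASTQ."""
    bed = bed_names(bed_path)
    fastq = fastq_names(fastq_path)
    missing = bed - fastq
    # Assumption: a prefix such as '115:541|' was prepended to the read ID,
    # so the part after the last '|' should match a FASTQ name.
    rescued = {n for n in missing if n.rsplit("|", 1)[-1] in fastq}
    return {
        "bed": len(bed),
        "fastq": len(fastq),
        "missing": len(missing),
        "rescued_by_stripping_prefix": len(rescued),
    }
```

Calling `compare_names("all_samples_chrY.bed", "all_samples_chrY.fastq")` returns the four counts as a dictionary. If `rescued_by_stripping_prefix` is close to `missing`, the mismatch is probably a naming prefix rather than reads that are genuinely absent from the FASTQ.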