BrooksLabUCSC / flair

Full-Length Alternative Isoform analysis of RNA

Flair collapse --annotation_reliant doesn't run without --generate_map #222

Closed cafelton closed 1 year ago

cafelton commented 1 year ago

Installed flair with pip3 and ran it from the pip install directory:

python3 /private/home/cafelton/.local/lib/python3.6/site-packages/flair/flair.py collapse -g /private/groups/brookslab/reference_sequence/GRCh38.primary_assembly.genome.fa -f /private/groups/brookslab/reference_annotations/gencode.v38.annotation.gtf --annotation_reliant /private/groups/brookslab/reference_sequence/gencode.v38.transcripts.fa --stringent --check_splice -r DRR059313.fastq -q test-new-flair-correct_all_corrected.bed -o test-new-flair-collapse

Writing temporary files to /scratch/tmp/tmprd0pp63r/
Making transcript fasta using annotated gtf and genome sequence
Aligning reads to reference transcripts
Counting supporting reads for annotated transcripts
Setting up unassigned reads for flair-collapse novel isoform detection
Annotated ends extracted from GTF
Read data extracted
Single-exon genes grouped, collapsing
Renaming isoforms using gtf
Aligning reads to first-pass isoform reference
[M::mm_idx_gen::0.009*1.29] collected minimizers
[M::mm_idx_gen::0.017*2.37] sorted minimizers
[M::main::0.017*2.36] loaded/built the index for 218 target sequence(s)
[M::mm_mapopt_update::0.018*2.25] mid_occ = 29
[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 218
[M::mm_idx_stat::0.020*2.17] distinct minimizers: 17186 (81.10% are singletons); average occurrences: 2.443; average spacing: 5.387
[M::worker_pipeline::9.298*3.86] mapped 10010 sequences
[M::main] Version: 2.14-r894-dirty
[M::main] CMD: minimap2 -a -t 4 -N 4 test-new-flair-collapse.firstpass.fa test-new-flair-collapse.unassigned.fasta
[M::main] Real time: 9.305 sec; CPU: 35.901 sec; Peak RSS: 0.149 GB
Filtering isoforms by read coverage
Traceback (most recent call last):
  File "/private/home/cafelton/.local/lib/python3.6/site-packages/flair/match_counts.py", line 35, in <module>
    for line in open(args.generate_map):
FileNotFoundError: [Errno 2] No such file or directory: 'test-new-flair-collapse.isoform.read.map.txt'
Traceback (most recent call last):
  File "/private/home/cafelton/.local/lib/python3.6/site-packages/flair/flair.py", line 1248, in <module>
    main()
  File "/private/home/cafelton/.local/lib/python3.6/site-packages/flair/flair.py", line 1186, in main
    status = collapse()
  File "/private/home/cafelton/.local/lib/python3.6/site-packages/flair/flair.py", line 753, in collapse
    subprocess.check_call(match_count_cmd)
  File "/usr/lib64/python3.6/subprocess.py", line 311, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', '/private/home/cafelton/.local/lib/python3.6/site-packages/flair/match_counts.py', 'test-new-flair-collapse.firstpass.q.counts', 'test-new-flair-collapse.firstpass.bed', '3.0', 'test-new-flair-collapse.isoforms.bed', '--generate_map', 'test-new-flair-collapse.isoform.read.map.txt']' returned non-zero exit status 1.

Basically, it looks like match_counts.py fails because it tries to open a file (test-new-flair-collapse.isoform.read.map.txt) that was never created, since the --generate_map option wasn't selected. When I add --generate_map to the command, it runs fine. Might want to either update the documentation or automatically set generate_map to true whenever --annotation_reliant is used.
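To illustrate the second suggestion, here is a minimal sketch of how an argument parser could switch the map output on whenever annotation-reliant mode is requested. This is not FLAIR's actual code or the fix that was eventually applied: the function names (build_parser, parse_collapse_args, map_path) are hypothetical, and the option types (a path for --annotation_reliant, a boolean switch for --generate_map) are assumptions based only on the command and error message above.

```python
import argparse


def build_parser():
    # Hypothetical, stripped-down parser with only the two options from this issue.
    parser = argparse.ArgumentParser(prog="collapse-sketch")
    parser.add_argument("--annotation_reliant", default="",
                        help="transcript FASTA; turns on annotation-reliant mode")
    parser.add_argument("--generate_map", action="store_true",
                        help="write the isoform-to-read map file")
    return parser


def parse_collapse_args(argv):
    args = build_parser().parse_args(argv)
    # Annotation-reliant mode reads the map file in a later step,
    # so make sure the file gets written even if the user did not ask for it.
    if args.annotation_reliant and not args.generate_map:
        args.generate_map = True
    return args


def map_path(output_prefix):
    # File name pattern copied from the FileNotFoundError in this issue.
    return output_prefix + ".isoform.read.map.txt"
```

With this in place, parse_collapse_args(["--annotation_reliant", "gencode.v38.transcripts.fa"]) would come back with generate_map already set, so the later step that opens the map file would find it.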

Jeltje commented 1 year ago

Fixed and updated documentation. This will be available in releases after 1.6.4