BrooksLabUCSC / flair

Full-Length Alternative Isoform analysis of RNA

gtf_to_bed and bed_to_sequence commands not in the conda environment #232

Shruti-BioCode closed 1 year ago

Shruti-BioCode commented 1 year ago

Command

flair quantify --quality $MAPQ --reads_manifest $manifest_file --isoforms $collapsed_fasta_file -o ${quantification_outfile_prefix}_09_flair_stringent_count_matrix -t $NCPUS --tpm --trust_ends --stringent --check_splice --isoform_bed ${collapsed_bed_file}

How did you install Flair?

  1. bioconda: conda create --prefix /hpcnfs/data/cgb/conda_envs/nanopore_analysis_py3tools flair -c conda-forge -c bioconda flair

What happened?

I had run the flair collapse command with a previous version (v1.5.0). I wanted to use the new quantification feature in v1.7, but flair quantify fails with the error below. I cannot find either gtf_to_bed or bed_to_sequence in the environment.

ERROR, the transcript names in the annotation fasta do not appear to match the ones
in the isoforms file. You may be able to fix this by using gtf_to_bed and bed_to_sequence on your annotation gtf
and using the resulting file as your annotation fasta input to this program

Jeltje commented 1 year ago

Thanks for reporting this, I will fix this in the next release. You should be able to find the programs in your conda path, something like this:

.conda/envs/flair/lib/python3.10/site-packages/flair/gtf_to_bed.py
.conda/envs/flair/lib/python3.10/site-packages/flair/bed_to_sequence.py
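
The exact location depends on the environment prefix and the Python version, so the paths above may differ on another system. A minimal sketch for locating both scripts, assuming a standard conda layout; the function name `find_flair_scripts` is hypothetical, and `CONDA_PREFIX` is the variable conda sets when an environment is activated:

```shell
# Hypothetical helper: list the two FLAIR helper scripts under a conda prefix.
find_flair_scripts() {
    # $1: environment prefix; falls back to the active environment's CONDA_PREFIX
    prefix=${1:-${CONDA_PREFIX:-}}
    if [ -z "$prefix" ]; then
        echo "usage: find_flair_scripts <conda-env-prefix>" >&2
        return 1
    fi
    # Match only the two script names mentioned in this thread
    find "$prefix" -type f \( -name 'gtf_to_bed.py' -o -name 'bed_to_sequence.py' \)
}
```

For the environment in this report, it could be called as `find_flair_scripts /hpcnfs/data/cgb/conda_envs/nanopore_analysis_py3tools`; each printed path can then be run with the environment's own Python interpreter.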

However, this error typically occurs when you use an external annotation fasta file such as ENSEMBL transcripts. It seems in this case you are using the collapsed isoforms generated by Flair itself.

If the suggested fix doesn't solve the problem please let us know the value of $MAPQ and we'll run some tests.
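
Before rerunning, it may help to confirm that the names really differ. The sketch below is not part of FLAIR; it assumes the isoform name is in column 4 of a tab-separated BED file and is the first whitespace-delimited token of each FASTA header. The function name `fasta_names_not_in_bed` is hypothetical.

```shell
# Hypothetical helper: print FASTA record names that have no match in BED column 4.
fasta_names_not_in_bed() {
    # $1: isoform BED file, $2: isoform FASTA file
    awk '
        FILENAME == ARGV[1] {            # BED pass: remember each name (column 4)
            split($0, col, "\t")
            in_bed[col[4]] = 1
            next
        }
        /^>/ {                           # FASTA pass: header lines only
            split(substr($0, 2), part, /[ \t]/)
            if (!(part[1] in in_bed)) print part[1]
        }
    ' "$1" "$2"
}
```

Called as `fasta_names_not_in_bed "$collapsed_bed_file" "$collapsed_fasta_file"`, it prints nothing when every FASTA name has a BED counterpart. Any names it does print show how the two files differ.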

Shruti-BioCode commented 1 year ago

Hi, thanks for the reply. The FASTA and BED files were created with the old FLAIR version. I was able to run gtf_to_bed.py from that path. For the FASTA, I realised that the names had an additional _chr** suffix, which I removed with an awk command.
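
The exact awk command was not posted. One possible version is sketched here, assuming the suffix is `_chr` followed by a chromosome label at the very end of the header line (for example `_chr1` or `_chrX`); the function name `strip_chr_suffix` is hypothetical.

```shell
# Hypothetical helper: drop a trailing _chr<label> from FASTA header lines only.
strip_chr_suffix() {
    # Reads the files given as arguments, or stdin when none are given.
    # Sequence lines and headers without the suffix pass through unchanged.
    awk '/^>/ { sub(/_chr[^ \t]*$/, "") } { print }' "$@"
}
```

Usage would look like `strip_chr_suffix old_isoforms.fa > renamed_isoforms.fa`, where both file names are placeholders. Headers that carry extra text after the name would need a different pattern.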

Jeltje commented 1 year ago

Great, thanks for letting us know. Closing this ticket.