PacificBiosciences / pbbioconda

PacBio Secondary Analysis Tools on Bioconda. Contains a list of PacBio packages available via conda.
BSD 3-Clause Clear License

Iso-Seq Pigeon classify GTF error #698

Closed: akulan1 closed this issue 3 months ago

akulan1 commented 3 months ago

Hi,

I am using the GTF provided by PacBio (under reference files) to classify the transcripts, but I am getting the following error. The command and the error are below. Any suggestions are greatly appreciated.

pigeon classify sample.hifi_reads_collapsed.sorted.gff /ensembleGrch38/REF-pigeon_ref_sets/Human_hg38_Genecode_v39/gencode.v39.annotation.sorted.gtf /ensembleGrch38/Homo_sapiens.GRCh38.dna.primary_assembly.fa --fl sample.hifi_reads_collapsed.flnc_count.txt

| 20240625 13:01:15.675 | FATAL | pigeon classify ERROR: error loading reference annotations for reference: 1

Thank you very much. Regards, Nirmala

jmattick commented 3 months ago

Hi @akulan1, you can use pigeon prepare to sort, index, and validate any input files. Otherwise, make sure all the index files pigeon requires are present next to their inputs (.pgi for GFF/GTF, .fai for FASTA), as in this example:

# download references
$ wget https://downloads.pacbcloud.com/public/dataset/MAS-Seq/REF-pigeon_ref_sets/Human_hg38_Gencode_v39/gencode.v39.annotation.sorted.gtf
$ wget https://downloads.pacbcloud.com/public/dataset/MAS-Seq/REF-pigeon_ref_sets/Human_hg38_Gencode_v39/gencode.v39.annotation.sorted.gtf.pgi
$ wget https://downloads.pacbcloud.com/public/dataset/MAS-Seq/REF-pigeon_ref_sets/Human_hg38_Gencode_v39/human_GRCh38_no_alt_analysis_set.fasta
$ wget https://downloads.pacbcloud.com/public/dataset/MAS-Seq/REF-pigeon_ref_sets/Human_hg38_Gencode_v39/human_GRCh38_no_alt_analysis_set.fasta.fai

# download collapse gff
$ wget https://downloads.pacbcloud.com/public/dataset/Kinnex-full-length-RNA/DATA-SQ2-UHRR-Monomer/4-Collapse/collapse_isoforms.gff

# prepare input file
$ pigeon prepare collapse_isoforms.gff
$ ls
collapse_isoforms.gff         collapse_isoforms.sorted.gff.pgi   gencode.v39.annotation.sorted.gtf.pgi   human_GRCh38_no_alt_analysis_set.fasta.fai
collapse_isoforms.sorted.gff  gencode.v39.annotation.sorted.gtf  human_GRCh38_no_alt_analysis_set.fasta

# run pigeon classify
$ pigeon classify collapse_isoforms.sorted.gff gencode.v39.annotation.sorted.gtf human_GRCh38_no_alt_analysis_set.fasta -o out
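
The index check described above could be scripted along these lines. This is only a minimal sketch, not part of pigeon: `check_pigeon_inputs` is a hypothetical helper name, and it assumes each index sits next to its input as `<file>.pgi` (GFF/GTF) or `<file>.fai` (FASTA), as in the `ls` listing above. It tests only that the files exist and are non-empty; it does not validate their contents.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a pigeon command): report missing inputs or indexes.
# Assumption: indexes are named <input>.pgi for GFF/GTF and <input>.fai for FASTA.
check_pigeon_inputs() {
    local gff="$1" gtf="$2" fasta="$3"
    local missing=0
    local f
    for f in "$gff" "$gff.pgi" "$gtf" "$gtf.pgi" "$fasta" "$fasta.fai"; do
        # -s is true only if the file exists and is larger than zero bytes
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Run the check only when three paths are given on the command line.
if [ "$#" -eq 3 ]; then
    check_pigeon_inputs "$@"
fi
```

If the function returns non-zero, running pigeon prepare on the reported input (or downloading the matching index) before pigeon classify is the likely next step.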