CCBR / RENEE

A comprehensive quality-control and quantification RNA-seq pipeline
https://CCBR.github.io/RENEE/
MIT License

Error running `renee build` for marmoset reference genome #161

Closed nmkuehn closed 3 weeks ago

nmkuehn commented 1 month ago

I am attempting to build a marmoset reference genome from NCBI files using renee build.

The qualimapinfo step fails three times, after which the job shuts down. The qualimap_error.log reads:

Traceback (most recent call last):
  File "/data/$USER/MarmosetGenome/workflow/scripts/builder/generate_qualimap_ref.py", line 98, in <module>
    biotype = exons[0].attr["gene_type"]
KeyError: 'gene_type'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/$USER/MarmosetGenome/workflow/scripts/builder/generate_qualimap_ref.py", line 100, in <module>
    biotype = exons[0].attr["gene_biotype"]
KeyError: 'gene_biotype'

The FASTA and GTF files were downloaded from NCBI. The GTF does have a "gene_biotype" attribute, for example:

NC_071442.1 Gnomon gene 399914 559170 . - . gene_id "LOC118152095"; transcript_id ""; db_xref "GeneID:118152095"; description "glucose-6-phosphate 1-dehydrogenase-like"; gbkey "Gene"; gene "LOC118152095"; gene_biotype "transcribed_pseudogene"; pseudo "true";
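
The example row above is a gene feature, while the traceback reads the attribute from exons[0]. It may therefore be worth checking whether the exon rows carry "gene_biotype" at all. The following is a minimal diagnostic sketch, not part of RENEE: it assumes a standard nine-column, tab-separated GTF, and the names parse_attributes and biotype_keys_by_feature are hypothetical helpers introduced only for illustration.

```python
import re
from collections import Counter, defaultdict

# Matches GTF attribute pairs such as: gene_biotype "protein_coding"
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def parse_attributes(field):
    """Return a dict of key -> value parsed from the 9th GTF column."""
    return dict(ATTR_RE.findall(field))


def biotype_keys_by_feature(lines, keys=("gene_type", "gene_biotype")):
    """Count, per feature type, how many rows carry each biotype key."""
    counts = defaultdict(Counter)
    for line in lines:
        # Skip header/comment lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a well-formed GTF row
        feature = cols[2]
        attrs = parse_attributes(cols[8])
        counts[feature]["rows"] += 1
        for key in keys:
            if key in attrs:
                counts[feature][key] += 1
    return counts
```

Passing an open file handle for the GTF to biotype_keys_by_feature and printing the result shows, for each feature type (gene, transcript, exon, and so on), how many rows exist and how many of them include each key. If the exon count for both keys is zero, that would be consistent with the two KeyErrors in the log.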

kopardev commented 3 weeks ago

We have only tested this with human and mouse genome annotations from Ensembl. This error could be caused by formatting differences or errors in the GTF.
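
As an illustration of how a lookup could tolerate such formatting differences, here is a small sketch. It is not the code in generate_qualimap_ref.py; it only assumes that attr behaves like a dict, as the traceback suggests. The helper name first_present and the "unknown" default are assumptions made for this example.

```python
def first_present(attr, keys=("gene_type", "gene_biotype"), default="unknown"):
    """Return the value of the first key found in attr, else the default."""
    for key in keys:
        if key in attr:
            return attr[key]
    # None of the candidate keys exist on this feature.
    return default


# An Ensembl-style record exposing gene_biotype:
print(first_present({"gene_id": "G1", "gene_biotype": "lncRNA"}))
# A record with neither key falls back to the default instead of raising:
print(first_present({"gene_id": "G2", "transcript_id": "T2"}))
```

Whether a default biotype is acceptable downstream is a separate question; the sketch only shows that a missing attribute does not have to end in a KeyError.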