samuelruizperez closed this issue 1 year ago
Hello Samuel,
Thanks for sharing your experience with this. I am not sure why the rtracklayer import function does not append a ".1" in your case. I just imported the RNAcentral mus_musculus.GRCm39.gff3.gz annotations and I do get the ".1" suffix. In any case, I believe the way you solved the problem, by renaming the columns, is totally fine.
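For reference, a rename of this kind can be sketched in base R with `make.unique()`, which appends `.1` to the second occurrence of a repeated name. The vector of column names below is illustrative, not the actual RNAcentral metadata. The commented-out line that applies it to `gff.rcent` assumes the metadata columns are reachable through `mcols()`, and it is not necessarily the line used in this thread.

```r
# Illustrative metadata column names with a duplicated "type"
# (hypothetical; not copied from the real RNAcentral file).
col_names <- c("source", "type", "score", "phase", "ID", "type")

# make.unique() leaves the first occurrence alone and suffixes later ones.
unique_names <- make.unique(col_names)
print(unique_names)

# Assumed application to the imported object (needs GenomicRanges loaded):
# names(mcols(gff.rcent)) <- make.unique(names(mcols(gff.rcent)))
```

Once the duplicate is renamed, a subset on `type.1` has a column to match against.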
Indeed, the script for integrating annotations depends on "igraph". The library was not loaded at the beginning because the required function is called with the "::" namespace syntax in the script. However, igraph still needs to be installed. I will add the library import in the next update, as this will be clearer. Thanks for pointing it out.
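Until that update, one possible way to surface a missing dependency early is a check at the top of the script. This is only a sketch: `missing_packages()` is a hypothetical helper, not part of the script, and the package list is an assumption based on the packages mentioned in this thread.

```r
# Hypothetical helper: return the names of packages that cannot be loaded.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Assumed dependency list; packages called via "::" still have to be installed.
required <- c("rtracklayer", "igraph")
not_found <- missing_packages(required)

if (length(not_found) > 0) {
  message("Please install: ", paste(not_found, collapse = ", "))
}
```

`requireNamespace()` returns `FALSE` instead of raising an error when a package is absent, so the check can report every missing package at once.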
Hello!
Thank you for developing this tool. 😀
I have been trying to use the integrate_gtf_annotations.R script to unify the most recent annotations for the mouse genome:
https://ftp.ensembl.org/pub/current_gtf/mus_musculus/Mus_musculus.GRCm39.108.gtf.gz
https://mirbase.org/ftp/CURRENT/genomes/mmu.gff3
https://ftp.ebi.ac.uk/pub/databases/RNAcentral/current_release/genome_coordinates/gff3/mus_musculus.GRCm39.gff3.gz
But I got this error:
I believe the issue comes from the following lines of integrate_gtf_annotations.R, because `transcripts@ranges@width` is empty and `round(median(transcripts@ranges@width)/2)` is therefore not a valid integer:

The reason `transcripts@ranges@width` (which comes from `gff.pi`) ends up empty is that `gff.rcent` does not have a column named `"type.1"` when it is subset: `gff.pi <- gff.rcent[gff.rcent@elementMetadata$type.1 == "piRNA",]`. `gff.rcent` instead has two columns with the duplicated name `"type"`.

`print(gff.rcent)` output:

So I guess the root of the problem lies in the way the RNAcentral database is imported:
gff.rcent <- rtracklayer::import(paste0(db.dir,"rnacentral_mus_musculus.GRCm39.gff3.gz"))
It should be making the column names of the `IRanges` object unique by appending `.1` to the second instance, but it is not doing so. I don't know much about how these functions work, so I fixed it for my purposes by adding this line after the import:

There is probably a better or more elegant solution, though. It could also be an issue with my `R` or `R` package versions, but I am not sure.

Finally, after adding that line to the script, I encountered another error:

So, maybe `igraph` should be added to the requirements at the beginning of the script.

Hope this helps.
Thank you, Samuel
Session details: