amitjavilaventura closed 2 years ago
Hi Adria,
Thank you for your interest in the software.
It is possible to use genomes other than those we generated, though it would require some setup.
You will need the following annotation files (stored in the `annotation` folder):
- Transposable elements (named `TE.bed`), typically generated from RepeatMasker or another repetitive sequence finder
- miRNA (named `miRNA.bed`) and premiRNA (named `hairpin.bed`), typically generated from miRBase
- Exons (named `exon.bed`) and introns (named `intron.bed`), typically generated from RefSeq or other genic annotations
- Structural RNAs (named `structural.bed`), such as tRNA, snoRNA, snRNA, typically generated from RepeatMasker
- piRNA clusters (named `piRNA_cluster.bed`)

These files can be empty (which would mean that nothing is annotated to those categories), but they need to exist.
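Since every annotation file has to exist even when it is empty, it may help to script the placeholders. The following is only a minimal sketch in Python, not part of TEsmall: the function name `ensure_annotation_files` is hypothetical, and the file and folder names are simply the ones listed above.

```python
from pathlib import Path

# File names as listed in this reply; the "annotation" folder name is also from the reply.
ANNOTATION_FILES = [
    "TE.bed",
    "miRNA.bed",
    "hairpin.bed",
    "exon.bed",
    "intron.bed",
    "structural.bed",
    "piRNA_cluster.bed",
]


def ensure_annotation_files(reference_dir):
    """Create empty placeholders for any missing annotation BED file.

    Existing files are left untouched. Returns the names that were created.
    (Hypothetical helper for illustration; not a TEsmall function.)
    """
    annotation_dir = Path(reference_dir) / "annotation"
    annotation_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for name in ANNOTATION_FILES:
        path = annotation_dir / name
        if not path.exists():
            path.touch()  # an empty file means nothing is annotated to that category
            created.append(name)
    return created
```

Calling `ensure_annotation_files("/path/to/my_genome")` (a placeholder path) returns the list of files it had to create, which is a quick way to see which categories will end up unannotated.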
Other files that are required (stored in the `sequence` folder):
- Genomic sequence (`genome.fa`) and its `.fai` index (generated with `samtools`). Please ensure that the chromosome names match the nomenclature in the annotation (e.g. `chr1` in both, not `chr1` and `1`)
- rDNA sequences (`rDNA.fa`) and the `.fai` index, which you can get from SILVA or extract from the genome if you have a full list of ribosomal DNA locations
- `bowtie_index`
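The chromosome naming requirement is easy to get wrong, so a quick consistency check before running may be worthwhile. The sketch below is an illustration only, not part of TEsmall: the function names are hypothetical, and it assumes the index is a standard samtools `.fai` file (sequence name in the first tab-separated column) and that the annotation files are plain BED files whose first column is the chromosome.

```python
from pathlib import Path


def fai_chromosomes(fai_path):
    """Return the sequence names in a samtools .fai index (first column)."""
    names = set()
    with open(fai_path) as handle:
        for line in handle:
            if line.strip():
                names.add(line.split("\t", 1)[0].strip())
    return names


def bed_chromosomes(bed_path):
    """Return the chromosome names used in a BED file (first column)."""
    names = set()
    with open(bed_path) as handle:
        for line in handle:
            line = line.strip()
            # Skip blank lines and the optional BED header lines.
            if not line or line.startswith(("#", "track", "browser")):
                continue
            names.add(line.split()[0])
    return names


def unmatched_chromosomes(fai_path, annotation_dir):
    """Map each BED file name to the chromosome names missing from the index.

    BED files whose names all match (or that are empty) are left out of the result.
    """
    genome_names = fai_chromosomes(fai_path)
    report = {}
    for bed in sorted(Path(annotation_dir).glob("*.bed")):
        missing = bed_chromosomes(bed) - genome_names
        if missing:
            report[bed.name] = sorted(missing)
    return report
```

An empty result from `unmatched_chromosomes` suggests the names agree; an entry such as `1` reported for a file when the index uses `chr1` points to the mismatch described above.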
An example of the folder structure, using dm6, can be seen here.
Once generated, the folder (e.g. dm6 in our example) should either be placed in the folder where other references are stored (default: `$HOME/TEsmall_db/`), or provided at run-time using the `--dbfolder` parameter (e.g. `--dbfolder /path/to/dm6`).
Please feel free to contact us if you encounter any issues, and we can try to help.
Thanks.
Hello,
I am working with small RNAs in species that are not in the list of supported organisms. I was wondering whether it is possible to use genomes other than those specified in the download site (https://labshare.cshl.edu/shares/mhammelllab/www-data/TEsmall/).
If so, are all the GTF files (i.e., GTFs for hairpins, miRNAs...) required?
Thank you very much.
Best regards, Adrià.