qbilius closed 2 months ago
Hi Jonas
No, it is not possible to link a repeat from repeats.fa to an assembly. The curated repeatTyper model was built from Makarova et al. (2020) and Pinilla-Redondo et al. (2019), but both the model and repeats.fa have been updated continuously since then. repeats.fa has been supplemented with repeats from GTDB genomes, Ensembl genomes, and data uploaded to the cctyper webserver, as stated in the paper. Data uploaded to the webserver is not saved (only the repeat, the subtype, and some information to avoid duplicates are saved), so the repeat could be from a non-public source.
/ Jakob
Hi,
Thanks for your great work!
I've been struggling to identify how the repeats.fa file was created. Say I wanted to identify the source assembly for a given repeat. But running a blastn search online with default parameters fails to return any exact matches.
The article states that the sources for repeats.fa are Makarova et al. (2020) and Pinilla-Redondo et al. (2019). Since the latter focuses on Type IV systems, I looked up Makarova's data source, which seems to be solely NCBI, so blastn should find matching repeats, but it doesn't.
Could you perhaps clarify where these repeat sequences came from? Perhaps there is some index file pointing to the organism / assembly that, say, V-A_862 came from?
Thanks for your kind help, Jonas
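For anyone who wants to check a candidate assembly locally instead of relying on an online blastn search, here is a minimal sketch of an exact-match lookup on both strands. It is not part of cctyper: the helper names (read_fasta, reverse_complement, find_exact) and the toy sequences are made up for illustration, and it assumes plain A/C/G/T/N sequences. As the reply above explains, finding no match is an expected outcome for repeats that came from non-public uploads.

```python
from typing import Dict, Iterable, List, Tuple


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA text into {record id: uppercase sequence}."""
    chunks: Dict[str, List[str]] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the id.
            name = line[1:].split()[0]
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line.upper())
    return {key: "".join(parts) for key, parts in chunks.items()}


_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (assumes A/C/G/T/N only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_exact(repeat: str, contigs: Dict[str, str]) -> List[Tuple[str, int, str]]:
    """Return (contig id, 0-based start, strand) for every exact occurrence."""
    repeat = repeat.upper()
    queries = [("+", repeat)]
    rc = reverse_complement(repeat)
    if rc != repeat:  # skip the minus strand for palindromic repeats
        queries.append(("-", rc))
    hits = []
    for strand, query in queries:
        for contig, seq in contigs.items():
            start = seq.find(query)
            while start != -1:
                hits.append((contig, start, strand))
                start = seq.find(query, start + 1)  # allow overlapping hits
    return hits


# Toy data, invented for this example (not taken from repeats.fa).
repeats = read_fasta([">toy_repeat", "GTCTAAGA"])
contigs = read_fasta([">contig1", "AAGTCTAAGATT", ">contig2", "CCTCTTAGACGG"])
print(find_exact(repeats["toy_repeat"], contigs))
```

With real files, the same functions can be fed open file handles, for example read_fasta(open("repeats.fa")) for the repeats and a locally downloaded assembly FASTA for the contigs (the assembly file is whatever you choose to test against).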