Closed by KatherineAr 2 years ago
Hi Katherine,
I'm not sure I fully understand this question. What kind of data or file format do you currently have (e.g. sequences or sequence coordinates) for the genotypes, and what kind of annotation or masking are you trying to perform?
Thanks for answering!
My data consists of SNPs with coordinates, approximately 500,000 SNPs. I'm trying to identify whether there are LINEs or SINEs. I don't think it can be done, because RM doesn't work with coordinates.
I hope you can help me. Thank you so much :)
Since you have coordinates, you may be able to compare the locations of the SNPs to the locations of repeats annotated by RepeatMasker, as long as the locations are all relative to the same reference sequence or you can convert them. Some tools that can help with this include util/rmOutToGFF3.pl and bedtools intersect. Does that approach sound appropriate for your data?
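If it helps to see the overlap idea spelled out, here is a minimal Python sketch of the same comparison done directly on RepeatMasker's .out table rather than through GFF3 and bedtools. It assumes the SNP positions are 1-based and on the same reference sequence names as the RepeatMasker run. The function names (parse_rm_out, build_index, repeats_at) and the three annotation rows are made up for illustration only; they are not part of RepeatMasker or bedtools.

```python
import bisect
from collections import defaultdict

# Toy stand-in for a RepeatMasker .out file (illustrative rows, not real results).
RM_OUT = """\
   SW  perc perc perc  query  position in query  matching repeat  position in repeat
score  div. del. ins.  sequence  begin  end (left)  repeat class/family begin end (left) ID

 1504  1.3  0.4  1.3  chr1  10001  10468 (248945954) +  (TAACCC)n  Simple_repeat  1  463 (0)  1
 3612 11.4 21.5  1.3  chr1  11505  11675 (248944747) C  L1MC5a  LINE/L1  (2382) 395 199 2
  484 25.1 13.2  0.0  chr1  26791  27053 (248929369) +  AluSp  SINE/Alu  1 289 (24) 3
"""

def parse_rm_out(lines):
    """Yield (sequence, begin, end, class/family) for each annotation row."""
    for line in lines:
        fields = line.split()
        # Annotation rows start with an integer score; header and blank lines do not.
        if len(fields) < 11 or not fields[0].isdigit():
            continue
        yield fields[4], int(fields[5]), int(fields[6]), fields[10]

def build_index(repeats, wanted=("LINE", "SINE")):
    """Group the wanted repeat classes by sequence, sorted by start position."""
    by_seq = defaultdict(list)
    for seq, start, end, rclass in repeats:
        if rclass.split("/")[0] in wanted:
            by_seq[seq].append((start, end, rclass))
    index = {}
    for seq, intervals in by_seq.items():
        intervals.sort()
        starts = [iv[0] for iv in intervals]
        # Running maximum of interval ends, so the backward scan knows when to stop.
        max_end, running = [], 0
        for _, end, _ in intervals:
            running = max(running, end)
            max_end.append(running)
        index[seq] = (starts, max_end, intervals)
    return index

def repeats_at(index, seq, pos):
    """Return the class/family of every indexed repeat covering a 1-based position."""
    if seq not in index:
        return []
    starts, max_end, intervals = index[seq]
    hits = []
    j = bisect.bisect_right(starts, pos) - 1  # last interval starting at or before pos
    while j >= 0 and max_end[j] >= pos:
        _, end, rclass = intervals[j]
        if end >= pos:
            hits.append(rclass)
        j -= 1
    return hits

index = build_index(parse_rm_out(RM_OUT.splitlines()))
snps = [("chr1", 11600), ("chr1", 27000), ("chr1", 20000), ("chr1", 10100)]
for seq, pos in snps:
    print(seq, pos, repeats_at(index, seq, pos) or "no LINE/SINE")
```

With a real file you would pass the open file handle to parse_rm_out instead of the toy string. For around 500,000 SNPs the bedtools route is likely to be more convenient, but the logic is the same: keep only the LINE and SINE rows, then ask which intervals contain each SNP position.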
That's what I needed. Thank you so much for your help! :)
Glad to hear it!