Open Jokendo-collab opened 4 months ago
The template sequence is used by StringDecomposer to split the satellite sequence. You need to make sure the species' centromere sequence is satellite-type; this approach is not suitable for other types, such as TE-rich or point centromeres. If it is satellite-type, you can try TRF and the pipeline we recently developed for building a satellite library, which may help you infer the centromere satellite. The exact centromere sequence, however, should be identified from CENH3 ChIP-seq data.
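As a rough illustration of the TRF step, here is a minimal Python sketch that ranks repeat period sizes by the number of bases they cover in a TRF `.dat` file, which can hint at a candidate monomer length. It is not part of the pipeline mentioned above. The function name `rank_periods` and the `min_copies` threshold are hypothetical, and the sketch assumes the usual `.dat` layout in which each repeat row has 15 whitespace-separated fields beginning with start, end, period size, and copy number.

```python
from collections import defaultdict


def rank_periods(dat_text, min_copies=10.0):
    """Rank TRF period sizes by total bases covered (largest first).

    Assumes each repeat row has 15 whitespace-separated fields:
    start, end, period, copies, ..., consensus, sequence.
    Overlapping TRF hits are not merged, so totals can be inflated.
    """
    covered = defaultdict(int)
    for line in dat_text.splitlines():
        fields = line.split()
        # Skip headers ("Sequence:", "Parameters:") and blank lines.
        if len(fields) < 15:
            continue
        if not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        start, end, period = int(fields[0]), int(fields[1]), int(fields[2])
        copies = float(fields[3])
        # Keep only long arrays; short arrays are unlikely to be
        # the dominant centromeric satellite.
        if copies >= min_copies:
            covered[period] += end - start + 1
    return sorted(covered.items(), key=lambda kv: kv[1], reverse=True)
```

The top-ranked period is only a candidate: an abundant satellite is not necessarily the centromeric one, so the result still needs to be checked against CENH3 ChIP-seq enrichment.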
In situations where you do not have a monomer template, how do you run this kind of analysis? How confident is the result?