Open jonahcullen opened 3 months ago
dna-brnn was run with its default settings. From the HPRC paper (https://www.nature.com/articles/s41586-023-05896-x#Sec120):
SD annotation
SDs were annotated using sedef (ref. 85) after masking repeats in each assembly. Repeats annotated with more than 20 copies corresponded to unannotated mobile elements and were excluded from the analysis. The pipeline for annotating SDs is available at GitHub (https://github.com/ChaissonLab/SegDupAnnotation/releases/tag/vHPRC).
I think the segdupe data may live here
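In case it helps to see the "more than 20 copies" exclusion concretely, here is a rough Python sketch. It is not taken from sedef or the SegDupAnnotation pipeline. It assumes the standard six leading BEDPE columns (chrom1, start1, end1, chrom2, start2, end2) and, as a simplifying assumption, treats the "copy count" of an interval as the number of pair records it appears in. The names `parse_bedpe` and `drop_high_copy` are hypothetical, and the real pipeline may define copies differently.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple

MAX_COPIES = 20  # threshold quoted from the paper's SD annotation paragraph


class Pair(NamedTuple):
    chrom1: str
    start1: int
    end1: int
    chrom2: str
    start2: int
    end2: int


def parse_bedpe(lines: Iterable[str]) -> List[Pair]:
    """Read the first six tab-separated BEDPE columns, skipping blanks and comments."""
    pairs = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        pairs.append(Pair(f[0], int(f[1]), int(f[2]), f[3], int(f[4]), int(f[5])))
    return pairs


def drop_high_copy(pairs: List[Pair], max_copies: int = MAX_COPIES) -> List[Pair]:
    """Drop pairs where either interval occurs in more than max_copies records.

    Assumption: copy count = number of pair records an interval takes part in.
    """
    counts: Counter = Counter()
    for p in pairs:
        counts[(p.chrom1, p.start1, p.end1)] += 1
        counts[(p.chrom2, p.start2, p.end2)] += 1
    return [
        p
        for p in pairs
        if counts[(p.chrom1, p.start1, p.end1)] <= max_copies
        and counts[(p.chrom2, p.start2, p.end2)] <= max_copies
    ]
```

A call such as `drop_high_copy(parse_bedpe(open("HG00438.maternal.sedef.bedpe")))` would then keep only the pairs whose intervals stay at or below the threshold, under the assumptions above.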
Hello, I am trying to replicate the generation of the Year1 centromeric satellite and segmental duplication annotations as described for usage with contig.inclusion.stats.R. If I understand correctly, the centromeric annotations are produced with dna-brnn as part of cactus-preprocess (--maskMode brnn). What mask action was chosen? I am guessing I am just missing it due to my unfamiliarity, but where/how are the segmental duplications marked in the sedef.bedpe files? Is that with sedef or now biser? And what, if anything, was done following sedef/biser (?) to generate, for example, HG00438.maternal.sedef.bedpe?

Thanks for your time, Jonah.