Open ptrebert opened 2 years ago
@pilleh Regarding the file GRCh38_chrY-seq-classes_coord_plus_repeats.bed: if you add more coordinate (BED) files in the future, please try to make sure that they are sorted.
Is it worthwhile to bug the T2T folks about similar annotation (SD and repeat) for the T2T Y?
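For reference, the sorting request could be checked with a small script along these lines. This is only a sketch: it assumes plain tab-separated BED records with chromosome, start and end in the first three columns, and it orders chromosomes lexicographically (the same order `sort -k1,1 -k2,2n` would give). The function names are made up for illustration and are not part of any existing pipeline.

```python
HEADER_PREFIXES = ("#", "track", "browser")


def bed_key(line):
    """Return the (chrom, start, end) sort key of one BED record."""
    fields = line.rstrip("\n").split("\t")
    return fields[0], int(fields[1]), int(fields[2])


def bed_records(lines):
    """Drop blank lines and header/comment lines, keep the records."""
    return [ln for ln in lines if ln.strip() and not ln.startswith(HEADER_PREFIXES)]


def is_sorted(lines):
    """True if the records are ordered by chrom, then start, then end."""
    keys = [bed_key(ln) for ln in bed_records(lines)]
    return keys == sorted(keys)


def sort_bed_lines(lines):
    """Return the records sorted by chrom, then start, then end."""
    return sorted(bed_records(lines), key=bed_key)
```

Reading a file with `open(path).readlines()` and passing the result to `is_sorted` would be enough to flag an unsorted file before it is added to the reference folder.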
@ptrebert Sure thing, I'll sort them in the future, thanks. In principle we need to do this annotation anyway, also for T2T, and I have kind of been doing it until now. In most regions it's quite straightforward, but in some of the ampliconic regions it's a bit of a pain as they are quite rearranged. Might be good to discuss this in one of the coming Y meetings.
This has now been implemented for the GRCh38 file.
@ptrebert I'm sorry, but I noticed that one more of these sequence class end/start points (specifically for XDR2/AMPL2) didn't make sense, so I modified it slightly. This moved the boundary by ~18 kb towards PAR1. This should not affect chrY contig identification, but it would be good to re-run the 'hg38 and T2T Y seq. classes to assembly' steps in case you have those implemented. I've added new versions to the HHU reference folder: T2T.chrY-seq-classes-NEW.bed and GRCh38_chrY-seq-classes_coord_plus_repeats_NEW.bed. I'm sorry about this. These coordinates have been merged from a few previous publications, which does not simplify matters much. I hope I won't have to mess with them again.
> This should not affect chrY contig identification
The problem with these types of assumptions is that they might be wrong for certain samples and we won't notice until much later :-) Shifting boundaries may affect contig renaming, though, which implies that everything has to be rerun anyway. I will hold off on that until HMMER has been updated (they implemented a fix, but the fix still needs to be merged into their code). Can you "quantify" this in today's call so that we can get a feeling for how likely it is that we need more of these sequence class updates in the future?
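One way to put numbers on such an update is to diff the old and new BED files per sequence class. The following is a rough sketch, not something that exists in the pipeline: it assumes the class name is in the fourth column and is unique within a file, which may not hold for the actual files, and the function names are hypothetical.

```python
def read_classes(lines):
    """Map class name -> (chrom, start, end); assumes unique names in column 4."""
    regions = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        chrom, start, end, name = line.rstrip("\n").split("\t")[:4]
        regions[name] = (chrom, int(start), int(end))
    return regions


def boundary_shifts(old_lines, new_lines):
    """Return {name: (start_delta, end_delta)} for classes whose coordinates changed.

    Only classes present in both files are compared; deltas are new minus old.
    """
    old = read_classes(old_lines)
    new = read_classes(new_lines)
    shifts = {}
    for name in sorted(old.keys() & new.keys()):
        _, old_start, old_end = old[name]
        _, new_start, new_end = new[name]
        deltas = (new_start - old_start, new_end - old_end)
        if deltas != (0, 0):
            shifts[name] = deltas
    return shifts
```

Running this on each pair of old and new files would give a per-class list of how far each boundary moved, which could serve as a starting point for the discussion in the call.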
> one more of these sequence class end/start points didn't make sense
Pille: