Open lsoldini opened 2 years ago
So you have the following data sets:
Are the chromosomes of (b) and (c) the same? Did you take a look at https://pggb.readthedocs.io/en/latest/rst/tutorials/divergence_estimation.html in order to measure the actual sequence divergence?
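As a rough illustration of what a Mash-style divergence estimate measures, here is a minimal sketch, not the tutorial's procedure. It computes the exact Jaccard index over canonical k-mers and converts it to a Mash distance, whereas real tools estimate the Jaccard index from MinHash sketches. The function names and the default k=21 are assumptions chosen for the example.

```python
import math

# Translation table for reverse-complementing DNA (other characters are left as-is).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical_kmers(seq, k):
    """Set of canonical k-mers: the lexicographic minimum of each k-mer and its reverse complement."""
    seq = seq.upper()
    kmers = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        kmers.add(min(kmer, reverse_complement(kmer)))
    return kmers


def mash_distance(seq_a, seq_b, k=21):
    """Mash distance D = -ln(2j / (1 + j)) / k, with j the exact k-mer Jaccard index."""
    kmers_a = canonical_kmers(seq_a, k)
    kmers_b = canonical_kmers(seq_b, k)
    if not kmers_a or not kmers_b:
        raise ValueError("both sequences must be at least k bases long")
    j = len(kmers_a & kmers_b) / len(kmers_a | kmers_b)
    if j == 0:
        return 1.0  # no shared k-mers: report the maximum distance
    return max(0.0, -math.log(2 * j / (1 + j)) / k)
```

Computing this per chromosome pair (for example AX against BX, and one of the near-identical chromosomes against its counterpart) would show how far apart the two divergence regimes are. Exact k-mer sets are memory-hungry on whole chromosomes, so this is only practical for small test regions.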
Sorry, that was not clear. There are two populations, A and B, with basically no divergence between them except on one chromosome (say population A carries version AX and population B carries version BX of chromosome X). I have:
Chromosomes AX and BX have diverged considerably because of loss of recombination and large inversions.
I want to build a graph over their whole genome for further use in the vg toolkit. To do so, I'd like to use two assemblies in which all chromosomes are the same except one. Would this be a problem for pggb? The sequence divergence setting would be tuned for the one chromosome that differs.
Hello,
I am planning to do some transcriptomic analysis using vg mpmap and rpvg. To get there, I first want to build a .gfa graph with pggb.
The particularity is that I have one chromosome-level, haplotype-resolved assembly, as well as another assembly of the same quality that covers only one of the chromosomes, i.e., one assembly of X chromosomes and another of only 1 chromosome (that chromosome has large inversions of several Mb, whereas all other chromosomes are very similar or identical).
How would you build a graph with such data?
I was thinking I could complete the partial assembly with the chromosomes from the other one, and then run pggb on the two 'full' assemblies, on the assumption that all identical regions would be recognized and collapsed. Alternatively, should I do separate runs of pggb (one with everything except the divergent chromosome, and another with only that chromosome) and merge the .gfa files afterwards?
Edit: the title is because I am wondering whether having most regions at 100% identity and one chromosome at a considerably lower identity would be an issue.
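For the first option, completing the partial assembly could be sketched as below. This is a minimal illustration under assumptions, not a tested pipeline: it assumes chromosome names match between the two assemblies, and the sample names (popA, popB), the haplotype number, and the helper names are all hypothetical. Sequences are renamed to the PanSN pattern sample#haplotype#contig that pggb recommends, so the combined FASTA keeps the two assemblies apart.

```python
def parse_fasta(text):
    """Parse FASTA text into {name: sequence}, using the first word of each header as the name."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def complete_assembly(full, partial):
    """Copy the full assembly, then overwrite any chromosome that the partial assembly provides."""
    completed = dict(full)
    completed.update(partial)
    return completed


def to_pansn(records, sample, haplotype):
    """Rename sequences to the PanSN pattern sample#haplotype#contig."""
    return {f"{sample}#{haplotype}#{name}": seq for name, seq in records.items()}


def format_fasta(records, width=60):
    """Serialize {name: sequence} as FASTA text with wrapped sequence lines."""
    lines = []
    for name, seq in records.items():
        lines.append(f">{name}")
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


def build_pggb_input(full_text, partial_text, full_sample="popA", partial_sample="popB"):
    """Return one combined FASTA holding both 'full' assemblies (sample names are hypothetical)."""
    full = parse_fasta(full_text)
    partial = parse_fasta(partial_text)
    combined = to_pansn(full, full_sample, 1)
    combined.update(to_pansn(complete_assembly(full, partial), partial_sample, 1))
    return format_fasta(combined)
```

The combined FASTA would then be compressed and indexed before being passed to pggb. Whether the identical chromosomes collapse cleanly in the resulting graph is exactly the open question here, so this only prepares the input.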