emmanuelparadis / pegas

Population and Evolutionary Genetics Analysis System
GNU General Public License v2.0

Positioning of haplotypes in network different every time the plot is re-done #82

Closed naurasd closed 9 months ago

naurasd commented 11 months ago

Hi

Every time I render a plot of a haplotype network, the network looks different and the haplotypes are positioned in different spots after saving to file. I created a plot with the following command, offsetting the labels:

> xl <- c(-25, 25)
> yl <- c(-25, 25)
> png("motu24_all_frac_haplotype_network.png",type="cairo",units="in",width=10,height=10,res=400)
> xy<-plot(nt, size = sz,pie=R,bg=pal,labels=F,xlim=xl,ylim=yl,show.mutation = 2)
> # Generate new xx values to offset haplotype labels
> x_info<-as.data.frame(cbind(xy$xx,xy$size))
> colnames(x_info)<-c("xx","radius")
> Xoffset <- ifelse(x_info$xx < 0,x_info$xx-(.5*x_info$radius+2), ifelse(x_info$xx>0,x_info$xx+(.5*x_info$radius+2),0))
> text(Xoffset,xy$yy, attr(nt, "labels"))
> dev.off()
RStudioGD 
        2 
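
The nested ifelse() in the offset line can be written more compactly with sign(), which returns -1, 0 or 1. A minimal sketch in base R, using made-up coordinates and radii (the values are illustrative only and are not taken from the network above):

```r
# Hypothetical node x-coordinates and symbol sizes, for illustration only
xx     <- c(-10, 0, 4)
radius <- c(2, 1, 6)

# Push each label away from x = 0 by half the radius plus 2 units.
# sign(0) is 0, so a node at x = 0 gets 0, as in the ifelse() version.
offset_sign <- xx + sign(xx) * (0.5 * radius + 2)

# The original nested ifelse() formulation, kept for comparison
offset_ifelse <- ifelse(xx < 0, xx - (0.5 * radius + 2),
                 ifelse(xx > 0, xx + (0.5 * radius + 2), 0))

# Both formulations give the same label positions
stopifnot(isTRUE(all.equal(offset_sign, offset_ifelse)))
```

Either form only moves the labels. The node positions themselves still come from the plot call.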

[Image: haplotype network plot from the first run]

When I re-run this again a minute later with the same code and the same graphics device being used, the plot looks like this:

[Image: haplotype network plot from the second run]

This is pretty annoying, as I am trying to set offsets for the haplotype labels but I can never just try it out, look at the result and improve it, because the plot looks different every time it is saved, even though the graphics device is the same. My graphics device usually changes a few times during the same session; I am not sure how it is chosen. But I made sure that this problem exists when the same device is being used. I understand that this is probably an issue with my graphics device and not with pegas? I am on Windows, using R via RStudio.

I wanted to attach the objects R, pal, sz and nt so that you can reproduce the issue, but these file types are not permitted. If you want, I can send them to you via e-mail.

Thanks, Nauras

emmanuelparadis commented 11 months ago

Have you tried with pdf() instead of png()?

naurasd commented 11 months ago

pdf() doesn't change anything, unfortunately. The networks are in a completely different orientation, at least when I restart the R session and plot again.

emmanuelparadis commented 11 months ago

I suggest two things. First, try your code outside RStudio (maybe you already did). Second, if this doesn't (didn't) work, you can send me your files (my address is in the DESCRIPTION file).

naurasd commented 11 months ago

Yes, I have tried running the script from the command line too, with png() and pdf(). Every time I re-run the script, the network's nodes are placed in completely different locations. I will send you my files to reproduce the problem.

emmanuelparadis commented 10 months ago

Your trouble comes from one feature of rmst(). This function randomizes the order of the haplotypes before computing a minimum spanning tree (MST), and this step is replicated many times. The last MST is taken as a "backbone" for the network with n - 1 links, and additional (alternative) links are defined "on top" of it. So even if you set B = 1 in rmst(), the results will not be the same if you run it twice. Here's an example you can run to see the issue clearly:

R> h <- haplotype(motu6,labels=labels(motu6))
R> d <- dist.dna(h, "N")
R> nt <- rmst(d)
Iteration: 5   Number of links: 10
R> nt2 <- rmst(d)
Iteration: 5   Number of links: 10
R> layout(matrix(1:2, 1))
R> plot(nt)
R> plot(nt2)

And the two networks are the same:

R> all.equal(nt, nt2)
[1] TRUE

I can see that it's annoying because the same network will have different layouts: the observations are renumbered at each randomization, so attr(, "labels") are reordered between nt and nt2.

I'll change the code so that the labels are always numbered as in the input data; this will make it possible to get the same layout using plot(, xy = xy).
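
The idea of carrying a layout over to a network whose nodes have been renumbered can be sketched in base R by matching on the haplotype labels. This sketch does not call pegas: the labels and coordinates are made up, and how the coordinates are read from a plot or handed back to plot() is deliberately left out.

```r
# Saved layout from a first plot: one (x, y) pair per haplotype label
# (hypothetical values)
labels1 <- c("I", "II", "III", "IV")
xy1 <- data.frame(x = c(0, 5, -3, 2), y = c(0, 1, 4, -6),
                  row.names = labels1)

# The same haplotypes after a second run, numbered in a different order
labels2 <- c("III", "I", "IV", "II")

# match() gives, for each label in the new order, its row in the saved layout
idx <- match(labels2, labels1)
xy2 <- xy1[idx, ]

# Each label keeps the coordinates it had in the first layout
stopifnot(identical(rownames(xy2), labels2))
xy2
```

Once the labels always follow the input order, this reordering step is no longer needed and the saved coordinates can be reused directly.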

emmanuelparadis commented 9 months ago

rmst() has been fixed: the returned network now has a "backbone MST" built by examining all possible links between two haplotypes in the order they appear in the input distance matrix: if a link was observed during the iterations of the RMST procedure, it is included in the backbone. This is repeated until an MST is built. The other links are output as alternative links.

This generally gives similar-looking networks, although there could be some small variation if the number of repetitions is small.
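
The backbone construction can be sketched roughly as follows. This is an illustrative base R version and not the pegas source: the function name backbone_from_links, the logical matrix observed used as input, and the cycle check through component labels are all assumptions made for the example.

```r
# observed: symmetric logical matrix, TRUE where a link between haplotypes
#           i and j was seen in at least one iteration (hypothetical format)
backbone_from_links <- function(observed) {
  n <- nrow(observed)
  comp <- seq_len(n)                      # component label of each haplotype
  backbone <- matrix(integer(0), ncol = 2)
  # Examine pairs (i < j) in the order they appear in the input
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      if (nrow(backbone) == n - 1) return(backbone)  # spanning tree complete
      # Assumption: keep an observed link only if it joins two components,
      # so that the backbone stays a tree
      if (observed[i, j] && comp[i] != comp[j]) {
        backbone <- rbind(backbone, c(i, j))
        comp[comp == comp[j]] <- comp[i]
      }
    }
  }
  backbone
}

# Four haplotypes; links 1-2, 1-3, 2-3 and 3-4 were observed
observed <- matrix(FALSE, 4, 4)
links <- rbind(c(1, 2), c(1, 3), c(2, 3), c(3, 4))
observed[links] <- TRUE
observed[links[, 2:1]] <- TRUE

bb <- backbone_from_links(observed)
bb  # link 2-3 is left out of the backbone and would be an alternative link
```

Because the pairs are scanned in input order, the result depends only on which links were observed and not on the order in which the iterations found them.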

The new version has just been pushed here.