millanek / Dsuite

Fast calculation of Patterson's D (ABBA-BABA) and the f4-ratio statistics across many populations/species

Fbranch all 0 entries using v0.4 #32

Closed by jessicarick 3 years ago

jessicarick commented 3 years ago

First of all, thanks for this useful and easy-to-use package!

I'm trying to calculate Fbranch statistics (using version 0.4, downloaded today) on a small tree with many significant D-statistics from Dtrios. When I run Fbranch, all of the entries are either 0 or nan. I know that the nan entries are comparisons that aren't applicable given my tree, but I am surprised that all of the remaining entries are 0.

When I previously ran Fbranch using version 0.2 (I believe), several of the entries were non-zero, even with the same *_tree.txt input file and the same tree.nwk tree. I am unsure why the difference in version would change the results, even from the same Dtrios output, and why my results from the new Fbranch are all zeroes. I also tried changing the p-value threshold, but all of the entries remain 0 no matter what I try.

My command line for Fbranch is:

    Dsuite Fbranch tree.nwk sets_021021_tree.txt > sets_021021_fbranch.txt

My tree is:

    (Outgroup,(Lnil,((Lmar,Lang),(Lmic,Lsta))));

My sets_021021_tree.txt from Dtrios is:

    P1    P2    P3    Dstatistic  p-value      f_G
    Lang  Lmar  Lmic  0.135053    0            0.19006
    Lmar  Lang  Lnil  0.0984446   0            0.0139214
    Lmar  Lang  Lsta  0.233125    0            0.094233
    Lmic  Lang  Lnil  0.109105    0            0.0159413
    Lmic  Lsta  Lang  0.10392     0            0.13288
    Lsta  Lang  Lnil  0.0922383   0            0.0136548
    Lmic  Lmar  Lnil  0.0138281   7.77156e-16  0.00204724
    Lsta  Lmic  Lmar  0.227573    0            0.248475
    Lmar  Lsta  Lnil  0.00165132  0.312065     0.000271264
    Lmic  Lsta  Lnil  0.0141116   1.49433e-08  0.00231811
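For readers who want to sanity-check a table like this before running Fbranch, here is a minimal Python sketch that parses the rows and counts the trios below a p-value cutoff. It is not part of Dsuite: the function names `parse_trios` and `significant_trios` are made up for illustration, and the 0.01 cutoff is an assumption that should be set to whatever threshold you pass to Fbranch.

```python
# Rows copied from the _tree.txt table in this post (whitespace-separated).
TREE_TXT = """\
P1 P2 P3 Dstatistic p-value f_G
Lang Lmar Lmic 0.135053 0 0.19006
Lmar Lang Lnil 0.0984446 0 0.0139214
Lmar Lang Lsta 0.233125 0 0.094233
Lmic Lang Lnil 0.109105 0 0.0159413
Lmic Lsta Lang 0.10392 0 0.13288
Lsta Lang Lnil 0.0922383 0 0.0136548
Lmic Lmar Lnil 0.0138281 7.77156e-16 0.00204724
Lsta Lmic Lmar 0.227573 0 0.248475
Lmar Lsta Lnil 0.00165132 0.312065 0.000271264
Lmic Lsta Lnil 0.0141116 1.49433e-08 0.00231811
"""


def parse_trios(text):
    """Return one dict per trio, keyed by the header names (values stay strings)."""
    lines = [ln.split() for ln in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, row)) for row in rows]


def significant_trios(trios, p_cutoff=0.01):
    """Keep trios whose p-value is below p_cutoff (assumed cutoff, adjust as needed)."""
    return [t for t in trios if float(t["p-value"]) < p_cutoff]


trios = parse_trios(TREE_TXT)
sig = significant_trios(trios)
print(f"{len(sig)} of {len(trios)} trios have p < 0.01")
for t in sig:
    print(t["P1"], t["P2"], t["P3"], "f_G =", t["f_G"])
```

On the table above, 9 of the 10 trios fall below the 0.01 cutoff; only Lmar/Lsta/Lnil (p = 0.312065) does not. That suggests the all-zero Fbranch matrix is unlikely to be explained by the p-value filter alone.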

And then the output from Fbranch is:

    branch  branch_descendants   Outgroup  Lnil  Lmar  Lang  Lmic  Lsta
    b2      Lnil                 nan       nan   nan   nan   nan   nan
    b3      Lmar,Lang,Lmic,Lsta  nan       nan   nan   nan   nan   nan
    b4      Lmar,Lang            nan       0     nan   nan   nan   nan
    b5      Lmic,Lsta            nan       0     nan   nan   nan   nan
    b6      Lmar                 nan       0     nan   nan   0     0
    b7      Lang                 nan       0     nan   nan   0     0
    b8      Lmic                 nan       0     0     0     nan   nan
    b9      Lsta                 nan       0     0     0     nan   nan
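The symptom can be detected automatically. Below is a minimal Python sketch, not a Dsuite tool, that reads an Fbranch-style matrix, separates the nan cells (comparisons that cannot be made on the tree) from the finite ones, and reports whether any finite cell is non-zero. The helper names `parse_fbranch` and `finite_cells` are hypothetical, and the parser assumes whitespace-separated columns with comma-joined descendant lists, as in the output pasted here.

```python
import math

# Matrix copied from the Fbranch output in this post.
FBRANCH_TXT = """\
branch branch_descendants Outgroup Lnil Lmar Lang Lmic Lsta
b2 Lnil nan nan nan nan nan nan
b3 Lmar,Lang,Lmic,Lsta nan nan nan nan nan nan
b4 Lmar,Lang nan 0 nan nan nan nan
b5 Lmic,Lsta nan 0 nan nan nan nan
b6 Lmar nan 0 nan nan 0 0
b7 Lang nan 0 nan nan 0 0
b8 Lmic nan 0 0 0 nan nan
b9 Lsta nan 0 0 0 nan nan
"""


def parse_fbranch(text):
    """Map each branch label to a {taxon: value} dict; 'nan' parses to float nan."""
    lines = [ln.split() for ln in text.strip().splitlines()]
    taxa = lines[0][2:]  # skip the 'branch' and 'branch_descendants' columns
    matrix = {}
    for fields in lines[1:]:
        values = [float(v) for v in fields[2:]]
        matrix[fields[0]] = dict(zip(taxa, values))
    return matrix


def finite_cells(matrix):
    """List (branch, taxon, value) for every cell that is not nan."""
    return [
        (branch, taxon, value)
        for branch, row in matrix.items()
        for taxon, value in row.items()
        if not math.isnan(value)
    ]


cells = finite_cells(parse_fbranch(FBRANCH_TXT))
nonzero = [c for c in cells if c[2] != 0.0]
print(f"{len(cells)} finite cells, {len(nonzero)} non-zero")
if cells and not nonzero:
    print("All finite entries are 0: check Dsuite versions and the tree used for Dtrios.")
```

For the matrix above this finds 14 finite cells, none of them non-zero, which is the pattern described in this issue.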

Thanks in advance for help with this.

Best, Jessica

JimWhiting91 commented 3 years ago

Hi Jessica, I don't know if this is still helpful, but I thought I'd reply as I was having the same problem and came across this issue, and this may be useful for others. I didn't have all 0s in my Fbranch results, but they definitely didn't match up with my _tree.txt output, and only one or two cells had very small non-zero values.

I've since run through it again and my results look a lot better now, so for me the solution can only have been one of a few things. My original _tree.txt came from an earlier version of Dsuite (I'm not sure which, but one from before Dquartets was added), so it may have been a version conflict between the version that produced my _tree.txt file and the 0.4 version I ran Fbranch with.

Alternatively, I also updated my newick tree to include a rooted Outgroup to make the original _tree.txt file this time around, whereas last time I did not have an outgroup in my tree until I came to try and run Fbranch. Apart from that, I've used the same tree structure, same VCF, same code.

Cheers, Jim
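One way to check the outgroup point before re-running is to confirm that the Newick tree contains an outgroup leaf and that every taxon in the _tree.txt file appears in the tree. The Python sketch below is an illustration under stated assumptions, not Dsuite's own validation: `newick_leaves` is a hypothetical helper that handles only simple, unquoted leaf names, and the label `Outgroup` is taken from the tree posted in this thread.

```python
import re

# Tree and trio taxa taken from the first post in this thread.
NEWICK = "(Outgroup,(Lnil,((Lmar,Lang),(Lmic,Lsta))));"
TRIO_TAXA = {"Lang", "Lmar", "Lmic", "Lnil", "Lsta"}


def newick_leaves(newick):
    """Return leaf names in order of appearance.

    A leaf name is a label that directly follows '(' or ','. Labels after ')'
    (internal node names) and branch lengths after ':' are ignored.
    Quoted names and comments in brackets are not supported.
    """
    return re.findall(r"[(,]\s*([^(),:;\s]+)", newick)


leaves = newick_leaves(NEWICK)
missing = TRIO_TAXA - set(leaves)

print("Leaves:", leaves)
print("Outgroup present:", "Outgroup" in leaves)
print("Taxa in _tree.txt but not in the tree:", sorted(missing))
```

If the outgroup is absent or any taxon is missing from the tree, it is probably worth fixing the tree and regenerating the _tree.txt file with Dtrios before running Fbranch, as described in this comment.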

jessicarick commented 3 years ago

Hi Jim,

Thanks for the input. I think I also have it solved based on your recommendations, although it's unclear what the problem/solution was. I think that including the rooted outgroup in my tree, and then re-running everything from the beginning using 0.4 was the combination of things that worked, just like you suggested.

Thanks, Jessica

millanek commented 3 years ago

Hi Jessica, Jim

Indeed, it makes sense to ensure you use the same version of Dsuite and the same input files throughout the analysis. Otherwise the behaviour is unpredictable ;).

Best wishes Milan