heche-psb / wgd

wgd v2: a suite of tools to uncover and date ancient polyploidy and whole-genome duplication
https://wgdv2.readthedocs.io/en/latest/
GNU General Public License v3.0

ksd not finishing #39

Open EmilyPhelps opened 1 week ago

EmilyPhelps commented 1 week ago

Hello,

I've been trying to use wgd v2 to calculate a Ks peak.

I've taken a subset of my CDS (~3K sequences).

When I run:

 wgd dmd subset.fasta 

I get:

Screenshot 2024-06-25 at 11 23 09

But it finishes. So if I run:

wgd ksd wgd_dmd/subset.fasta.tsv subset.fasta

It starts off okay and then just slows to a stop. I also get this error for pretty much every gene family:

Screenshot 2024-06-25 at 11 29 45

heche-psb commented 1 week ago

Hi, thanks for your interest in using wgd v2. Your usage of wgd dmd and wgd ksd is correct, and the log information looks normal.

The warning about a family with a stripped alignment length of 0 is not an error. It is common for large families (for instance, more than 200 gene members): after removing all the gap-containing columns (any column with a '-'), no column is left, so it is impossible to calculate Ks. If you want to calculate Ks values for these large families anyway, you can add the --pairwise flag to the wgd ksd command. Ks is then calculated from a local pairwise alignment of each gene pair rather than from the alignment of the whole family, and the pairwise alignment is expected to be less gappy.

If your wgd ksd run stopped unexpectedly, the first thing I suggest you check is whether you gave sufficient memory to each thread you assigned.
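The stripped-alignment warning described above can be illustrated with a small sketch. This is not wgd's actual code: the function name `strip_gap_columns` and the toy sequences are hypothetical, and the sketch only assumes that "stripping" means dropping every alignment column in which at least one sequence has a '-'.

```python
def strip_gap_columns(alignment):
    """Keep only the columns in which no sequence has a gap ('-').

    `alignment` is a list of equal-length aligned sequences (hypothetical input
    format for this sketch; wgd itself works on alignment files).
    """
    if not alignment:
        return []
    # zip(*alignment) iterates over columns; keep the gap-free ones.
    kept = [col for col in zip(*alignment) if "-" not in col]
    if not kept:
        # Every column contained a gap: each sequence is stripped to length 0.
        return ["" for _ in alignment]
    # Transpose the kept columns back into sequences.
    return ["".join(chars) for chars in zip(*kept)]


# Two members: only one column has a gap, so most of the alignment survives.
small_family = ["ATGGCC", "ATG-CC"]
# Three members with gaps in different places: no column is gap-free.
large_family = ["ATG---", "---GCC", "AT--CC"]

print(strip_gap_columns(small_family))
print(strip_gap_columns(large_family))
```

The more members a family has, the more likely it is that every column contains a gap in at least one of them, which is why the stripped length can drop to 0 for large families while any single pair of sequences still shares plenty of aligned columns.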

EmilyPhelps commented 1 day ago

Thanks for the response. How would I go about checking whether memory is limiting the process? It stops even when I use very few threads. I have 126G of memory and 64 cores; the most threads I've used for this is 40 and the fewest is 4, and it stops each time.
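One generic way to approach this question, independent of wgd, is to run the command under a small wrapper that reports its exit code and the peak resident memory of its largest child process. The sketch below is only an assumption-laden illustration: the helper name `peak_child_rss_mb` and the script name `peak_mem.py` are hypothetical, it relies on Python's Unix-only `resource` module, and it reports the peak of the single largest waited-for process rather than the sum across all worker processes.

```python
import resource
import subprocess
import sys


def peak_child_rss_mb(cmd):
    """Run `cmd` to completion; return (exit code, peak RSS in MB of the largest child)."""
    proc = subprocess.run(cmd)
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return proc.returncode, peak / divisor


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage:
    #   python peak_mem.py wgd ksd wgd_dmd/subset.fasta.tsv subset.fasta
    code, mb = peak_child_rss_mb(sys.argv[1:])
    print(f"exit code: {code}, peak RSS of largest child: {mb:.1f} MB")
```

A negative exit code means the top-level process was terminated by a signal (for example, -9 for SIGKILL). If only a worker process were killed, the exit code might not show it, so the system log would still be worth checking.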

EmilyPhelps commented 4 hours ago

Hi, just to provide more information here: I've been trying to run a subset of ~3K sequences on 4 nodes for about 16 hours now, and it seems to get stuck.

Screenshot 2024-07-04 at 09 26 37

It looks like it's still running, but when I check htop there is nothing running (apart from angsd, which I just started for another project). This time I tried using the pairwise flag.

Screenshot 2024-07-04 at 09 26 21