xiaoming-liu / stairway-plot-v2

The stairway plot is a method for inferring detailed population demographic history using the site frequency spectrum (SFS) from DNA sequence data.

impossible to introduce principale Stairway_fold_training_testing7 #10

Open xuefenfei712 opened 2 years ago

xuefenfei712 commented 2 years ago

Dear Xiaoming, thanks for your useful software. I am studying it, but I got the error reported below.

My workflow is as follows. First, I ran angsd with -fold 1 to get the SFS. Second, I prepared the test.blueprint file as follows (154 diploid individuals):

popid: test # id of the population (no white space) nseq: 308 # number of sequences L: 42060 whether_folded: true # whethr the SFS is folded (true or false) SFS: 15676.631533 4.233353 3.800625 6.325086 0.000000 3.797180 2.047039 1.602854 3.445803 1.112864 0.000000 0.000000 1.552678 0.756042 2.694897 0.000000 0.000000 0.000000 0.960382 0.000000 1.005029 0.000000 1.020156 0.000000 0.000000 0.000000 1.014360 0.000000 0.000000 0.000000 0.947147 0.000000 0.000000 0.000000 1.515205 0.000026 2.451939 0.000000 0.066351 2.440779 2.578674 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 1.076093 0.000001 0.000000 0.000000 2.922012 0.001728 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 3.812723 0.260496 0.926945 0.000003 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 smallest_size_of_SFS_bin_used_for_estimation: 1 # default is 1; to ignore singletons, uncomment this line and change this number to 2 largest_size_of_SFS_bin_used_for_estimation: 154 # default is nseq/2 for folded SFS pct_training: 0.67 # percentage of sites for training nrand: 76 153 180 306 # number of random break points for each try (separated by white space) project_dir: test stairway_plot_dir: test_out ninput: 200 # number of input files 
to be created for each estimation

random_seed: 6

output setting

mu: 7.242e-9 # assumed mutation rate per site per generation year_per_generation: 8 # assumed generation time (in years)

plot setting

plot_title: TEST xrange: 0.1,10000 # Time (1k year) range; format: xmin,xmax; "0,0" for default yrange: 0,0 # Ne (1k individual) range; format: xmin,xmax; "0,0" for default xspacing: 2 # X axis spacing yspacing: 2 # Y axis spacing fontsize: 12 # Font size

But why does it report "impossible" for Stairway_fold_training_testing7 and test/input/test-189.306_0.67.addTheta?

I would appreciate your kind reply.

Best, Xue

AudeCaizergues commented 2 years ago

Hi Xue,

Were you able to find a solution to this problem?

Thanks,

Aude

xuefenfei712 commented 2 years ago

Sorry, no answer.

Mvwestbury commented 2 years ago

Your input SFS looks quite strange, with a lot of zeros. What was the exact ANGSD command you used to generate it?

xuefenfei712 commented 1 year ago

Thank you for your reply. However, I changed my SFS and it still shows the same error:

Error: Could not find or load main class Stairway_unfold_training_testing7

popid: Yak # id of the population (no white space)
nseq: 20 # number of sequences
L: 157236
whether_folded: false # whethr the SFS is folded (true or false)
SFS :125008 5854 3360 2380 1693 1289 1076 845 762 922 431 387 314 265 262 201 203 156 168
smallest_size_of_SFS_bin_used_for_estimation: 1 # default is 1; to ignore singletons, uncomment this line and change this number to 2
largest_size_of_SFS_bin_used_for_estimation: 9 # default is nseq/2 for folded SFS
pct_training: 0.67 # percentage of sites for training
nrand: 78617 157234 235851 # number of random break points for each try (separated by white space)
project_dir: Yak
stairway_plot_dir: Yak_out
ninput: 200 # number of input files to be created for each estimation
#random_seed: 6
#output setting
mu: 7.242e-9 # assumed mutation rate per site per generation
year_per_generation: 8 # assumed generation time (in years)
#plot setting
plot_title: Yak
xrange: 0.1,10000 # Time (1k year) range; format: xmin,xmax; "0,0" for default
yrange: 0,0 # Ne (1k individual) range; format: xmin,xmax; "0,0" for default
xspacing: 2 # X axis spacing
yspacing: 2 # Y axis spacing
fontsize: 12 # Font size

Thank you! Best, Xue

Mvwestbury commented 1 year ago

Could it be because of nrand: 78617 157234 235851?

Those seem like very large numbers.

The readme suggests trying four numbers: (nseq-2)/4, (nseq-2)/2, (nseq-2)*3/4 and nseq-2. In your case, with nseq 20, that would be 5 9 14 18.
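The arithmetic behind those four numbers can be sketched in a few lines of Python. This is only an illustration: the function name `suggested_nrand` is made up here, and rounding halves up is an assumption chosen because it reproduces 5 9 14 18 for nseq 20.

```python
import math
from typing import List


def suggested_nrand(nseq: int) -> List[int]:
    """Return (nseq-2)/4, (nseq-2)/2, (nseq-2)*3/4 and nseq-2 as integers."""
    base = nseq - 2
    fractions = (0.25, 0.5, 0.75, 1.0)
    # Assumption: halves are rounded up, so 4.5 -> 5 and 13.5 -> 14.
    return [math.floor(base * f + 0.5) for f in fractions]


# Format the result the way the blueprint expects (separated by white space).
print("nrand: " + " ".join(str(n) for n in suggested_nrand(20)))
```

The printed line can be pasted into the blueprint in place of the existing nrand entry.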

L should also be the sum of all sites, including invariant sites (so including the first and last columns from the realSFS output). It seems strange that you have such a low number of invariant sites. Can you send the entire ANGSD command?
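As a rough sketch of that bookkeeping, the snippet below takes one line of unfolded realSFS output and derives the two blueprint values from it. The function name is hypothetical, the input numbers are toy values rather than real data, and it assumes the blueprint SFS line holds only the polymorphic classes (everything between the first and last columns); a folded spectrum is laid out differently and is not handled here.

```python
from typing import List, Tuple


def blueprint_values_from_realsfs(realsfs_line: str) -> Tuple[float, List[float]]:
    """Split one unfolded realSFS line into L and the polymorphic SFS bins."""
    counts = [float(x) for x in realsfs_line.split()]
    # L counts every site, including the invariant first and last columns.
    total_sites = sum(counts)
    # Assumption: the SFS entry omits the first and last (invariant) columns.
    polymorphic = counts[1:-1]
    return total_sites, polymorphic


# Toy example with made-up counts for 4 sequences (5 columns).
L, sfs = blueprint_values_from_realsfs("100 5 3 2 10")
print("L:", int(round(L)))
print("SFS:", " ".join(str(x) for x in sfs))
```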

Fengyaa commented 1 year ago

I encountered the same error. It turns out that the 'stairway_plot_dir' parameter must point to the folder containing all the .jar and .class files of Stairway Plot. In the example blueprint file it is set to 'stairway_plot_es' by default, so keep that value if you didn't change the path.
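A quick way to check this before running anything is sketched below. It is not part of Stairway Plot; the helper names are invented for illustration, and it only tests whether the folder named by stairway_plot_dir exists and holds any .jar or .class files.

```python
from pathlib import Path
from typing import Optional


def read_blueprint_value(blueprint_text: str, key: str) -> Optional[str]:
    """Return the value of `key` from blueprint text, ignoring # comments."""
    for line in blueprint_text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line.startswith(key + ":"):
            return line.split(":", 1)[1].strip()
    return None


def has_java_files(directory: Path) -> bool:
    """True if the directory exists and contains at least one .jar or .class file."""
    return directory.is_dir() and any(
        p.suffix in {".jar", ".class"} for p in directory.iterdir()
    )


# Example with the value from the blueprint posted above.
blueprint = "project_dir: test\nstairway_plot_dir: test_out # hypothetical comment\n"
plot_dir = Path(read_blueprint_value(blueprint, "stairway_plot_dir"))
if not has_java_files(plot_dir):
    print(f"{plot_dir} has no .jar or .class files, so Java cannot load the main class")
```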

abcdefghijklmn97 commented 1 year ago


Hi, your blueprint file should be set to "stairway_plot_dir: stairway_plot_es".