Open davidaray opened 4 years ago
Hi, I'm on the same issue, have you got any methods to solve this ? Thanks!
Qiang
Hi, I'm on the same issue, have you got any methods to solve this ? Thanks!
Qiang
I didn't get anything from the author but I did get some help from another researcher. It involves taking some output data from psmc and then plotting it with Excel or another software package. It's rather easy and I'd be happy to share it with you if you contact me directly.
For anyone interested, here is a workaround for plotting via psmc directly:
This is the documentation and commands from the script I ran. If you're familiar with linux, you should be able to work out the steps. If you have trouble, let me know and I'll try to explain what I did in each step.
fq2psmcfa geomys.fq.gz >geomys.single.psmcfa
psmc -N25 -t15 -r5 -p "4+25*2+4+6" -o geomys.single.psmc geomys.psmcfa
psmc_plot.pl -u 2.2e-9 -g 3.08 -R -p geomys.plotsingle.psmc geomys.psmc
splitfa $PREFIX".psmcfa" > $PREFIX".split.psmcfa"
psmc -N25 -t15 -r5 -p "4+25*2+4+6" -o $PREFIX".psmc" $PREFIX".psmcfa"
psmc -N25 -t15 -r5 -b -p "4+25*2+4+6" -o round-{}.psmc $PREFIX"2"$MAP2".split.psmcfa" | sh
seq 100 | parallel -I% --max-args 1 $PSMC/psmc -N25 -t15 -r5 -b -p "4+25*2+4+6" -o round-%.psmc $PREFIX".split.psmcfa" | sh
cat $PREFIX".psmc" round-*.psmc > $PREFIX".combined.psmc"
rm round-*.psmc
psmc_plot.pl -u 2.2e-9 -g $GT -R -p $PREFIX".plot.psmc" $PREFIX".combined.psmc"
I created an excel file that has the concatenated data in columns B-F for taxon 1 and columns I-M for taxon 2. The single-run data are in columns P-T for taxon 1 and W-AA for taxon 2.
Open your excel file and highlight columns B and C.
Click the 'Insert' toolbar and choose 'Recommended Charts'.
Select a 'Scatter' plot.
Right click the x-axis and choose 'Format Axis' change the Axis Options to 'Logarithmic scale'.
Right click any plotted point and choose 'Format Data Series'.
Click the paint can and choose 'Marker' and then 'Marker Options'. Change to 'None'.
Click 'Line' and choose 'Solid line'. Change the transparency and color to whatever suits you. I generally choose a light color.
Right click anywhere in the plotted area and choose 'Select Data'.
Click Series 1 and select 'Edit'. Change the name to 'bootstrap' or whatever you deem appropriate. Don't change any of the other fields.
Click 'Add'. Name the series, 'main' or whatever is appropriate.
Click the Series X box and then select column P.
Click the Series Y box, delete what's there and then select column Q.
Adjust the color for the new line as needed. i usually choose a darker version of the color that I used for the bootstrap.
To add your second data set,, Right click in the plotted area and choose 'Add'. Name the new series 'boostrap' and select column I for Series X and column J for Series Y.
Add the 'main' line as described in steps 10 - 13.
The excel file I created is located here: https://github.com/davidaray/test/blob/master/Pocket_gopher.xlsx
I provided a solution on a previous issue which utilises the multiline mode in psmc_plot.pl
- might find it helpful.
Thanks for the instructions on the combined bootstrap plot @davidaray !
Just one question - is the estimated effective population size (e.g. column C in your data for Geomys) meant to be multiplied by 1,000 (e^03) or by 10,000 (e^04)? And, is this always the case or is it data-specific? I'm unsure about the units in my output .txt file and I cannot find any official documentation about this.
Thanks in advance, Julia
Thanks for the instructions on the combined bootstrap plot @davidaray !
Just one question - is the estimated effective population size (e.g. column C in your data for Geomys) meant to be multiplied by 1,000 (e^03) or by 10,000 (e^04)? And, is this always the case or is it data-specific? I'm unsure about the units in my output .txt file and I cannot find any official documentation about this.
Thanks in advance, Julia
Julia, I saw this last week and forgot to respond. I apologize. Did you find an answer? I honestly don't remember and our HPCC, where all this work is housed, is down for maintenance this week. I won't be able to access until Monday.
Thanks for your reply @davidaray! No, I still am unsure about the psmc output. Any insight would be really appreciated.
Hello,
I also have the .txt
files and I have been trying to find a way to plot these results using ggplot
. As it has been said, it would be lovely to understand what are each of the 5 columns in the .txt
output, and which one and how should be scaled.
I have tried to scale (X 1,000) the fifth column, but my numbers simply do not match... so I suppose things are a bit more complicated than that. It would be great to be able to plot these results with ggplot
though.
Has anyone been luckier?
Many thanks in advance, George.
Thanks @davidaray for the helpful instructions! In the example excel file you have provided, I note that you only use the first 2 columns of the text output which are years from present (X axis) and effective population (Y axis). Any idea what the remaining 3 columns are?
Hello,
I also have the
.txt
files and I have been trying to find a way to plot these results usingggplot
. As it has been said, it would be lovely to understand what are each of the 5 columns in the.txt
output, and which one and how should be scaled.I have tried to scale (X 1,000) the fifth column, but my numbers simply do not match... so I suppose things are a bit more complicated than that. It would be great to be able to plot these results with
ggplot
though.Has anyone been luckier?
Many thanks in advance, George.
You can probably plot it using the first 2 columns years from present (X axis - 1st column) and effective population (Y axis - 2nd column). Not sure what the rest of the columns refer to
include the "-R" in the cmd, and vim the .gp file as you like then run cat sample_plot.gp | gnuplot
I have two independently produced plots for two related species and would like to combine them onto a single set of axes for comparison. I see this in multiple manuscripts but I don't know how to do it for my data.
Does anyone have some clear instructions on how to accomplish this?
I've tried just about every variation of psmc_plot.pl -M "geomys=0.1,thomomys=0.2" -u "geomys=0.0000000022,thomomys=0.0000000022" -g "geomys=3,thomomys=2" twogophers geomys.psmc thomomys.psmc possible with no luck.
David