Closed GoogleCodeExporter closed 9 years ago
You may have been using an old version of msformat.cpp that didn't accommodate
the new mutation age field. I've checked in a new revision that removes the
executable from the source tree. Please check this one out. Run "make all"
and see if anything changes.
Original comment by gche...@gmail.com
on 3 Jan 2013 at 11:11
Tried with latest checkout and compiled, but the effect is still there. Let me
know if I can supply you any more information.
Original comment by gregor.g...@gmail.com
on 3 Jan 2013 at 11:52
Please post a command line with a random seed -s specified and also post your
stdout and stderr files.
Original comment by gche...@gmail.com
on 4 Jan 2013 at 12:29
Bellow is a modified script with seed option. Attached are also bzip2 archives
for stdout and stderr files.
#-------------------- START SHELL SCRIPT ---------------------------------------
#!/bin/bash
# Set up number of haplotypes
nHaps=400
# Call MaCS
./macs $nHaps \
810000000 \
-s 19791123 \
-t 0.00000036 \
-r 0.00000036 \
-eN 0.05 2 -eN 0.1 4 -eN 0.15 6 -eN 0.2 8 -eN 0.25 10 -eN 0.3 12 -eN 0.35 14 -eN 0.4 16 -eN 0.45 18 -eN 0.5 20 -eN 1 40 -eN 2 60 -eN 3 80 -eN 4 100 -eN 5 120 -eN 10 140 -eN 20 160 -eN 30 180 -eN 40 200 -eN 50 240 -eN 100 320 -eN 200 400 -eN 300 480 -eN 400 560 -eN 500 640 1>stdout.txt 2>stderr.txt
# Format as in ms
cat output.txt | ./msformatter > haplotypes.txt
# Tear appart the output
# ... header
tail -n +6 haplotypes.txt > Intermediate.txt
# ... haplotypes
tail -n +2 Intermediate.txt > MacsHaplotypes.txt
# ... map
head -n 1 Intermediate.txt | sed -e 's/positions: //' > PhysicalMapInput.txt
# ... number of segregating sites
grep segsites haplotypes.txt | sed -e 's/segsites: //' > SegSites.txt
# ... get nice format with spaces
sed -e 's/0/ 0/g' -e 's/1/\ 1/g' MacsHaplotypes.txt > tmp
# ... save outgroup haplotypes in two files
head -n $nHaps tmp > MacsHaplotypes.txt
#-------------------- STOP SHELL SCRIPT ---------------------------------------
Original comment by gregor.g...@gmail.com
on 4 Jan 2013 at 1:00
Attachments:
Argh, I have changed in script the resulting stdout and stderr filenames in
macs call to make it easier for you to spot. However, I forgot to change it
later in the script. To make the above script work replace stdout.txt with
output.txt.
Original comment by gregor.g...@gmail.com
on 4 Jan 2013 at 1:02
it's not clear to me from stdout which sites were monomorphic (no 1s). can you
give me example of site IDs?
Original comment by gche...@gmail.com
on 4 Jan 2013 at 1:11
I think I know what's going on. stay tuned
Original comment by gche...@gmail.com
on 4 Jan 2013 at 1:21
Can you try my latest revision of msformat.cpp? I found a bug that is most
likely the culprit.
Original comment by gche...@gmail.com
on 4 Jan 2013 at 1:28
No, I was not saying that sites were monomorphic. I noticed that some
haplotypes were "monomorphic" - several instances of haplotypes with just 0 or
1 (at least very long stretches). I am attaching screenshots of haplotype.txt
file (in ms format) where you can clearly see the differences, while the call
was the same, just version of macs and msformatter are different.
If you check allele frequencies (=mean of allele codes for each site) from the
new and old version you will see huge difference as well - shown in initial
post.
Original comment by gregor.g...@gmail.com
on 4 Jan 2013 at 1:51
Attachments:
look at the last diff under change log for msformat.cpp
http://code.google.com/p/macs/source/browse/trunk/msformat.cpp
and check out latest to see if this fixes it
Original comment by gche...@gmail.com
on 4 Jan 2013 at 1:56
Yes, now things look ok! Attached is a plot of allele frequencies (y axis) over
sites (x axis).
Thanks!!!
Original comment by gregor.g...@gmail.com
on 4 Jan 2013 at 2:03
Attachments:
great thanks for pointing this bug out.
Original comment by gche...@gmail.com
on 4 Jan 2013 at 2:04
Original issue reported on code.google.com by
gregor.g...@gmail.com
on 3 Jan 2013 at 10:50Attachments: