HuffordLab / Wang_et_al._Demography

Repository for the 2017 Genome Biology publication, "The interplay of demography and selection during maize domestication and expansion"

data origin #1

Open conniecl opened 6 years ago

conniecl commented 6 years ago

Hi Wang @lepisorus, I want to calculate the D statistic as shown in the script Dstatistics/abba-baba.sh, but I'm confused about the Tripsacum.fa you use as the ancestral genome, since Tripsacum does not have a reference genome. How did you obtain Tripsacum.fa? I sincerely hope to get your advice, and thanks in advance.

/data004/software/GIF/packages/ANGSD/0.614/angsd -doAbbababa 1 -blockSize 1000 -anc /home/lwang/lwang/SRA/Tripsacum/Tripsacum.fa -doCounts 1 -bam ${BAMLIST} -uniqueOnly 1 -minMapQ 30 -minQ 20 -minInd 3 -P 8 -checkBamHeaders 0 -rf /home/lwang/lwang/SRA/scaffoldNamesWithData.txt -out ${OUTFILE}
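For readers following along: the D statistic itself is a simple ratio of site counts, D = (nABBA - nBABA) / (nABBA + nBABA). The sketch below only illustrates that arithmetic. The helper name `d_stat` and its two-argument interface are hypothetical, and it assumes the ABBA and BABA counts have already been summed across blocks; it does not parse ANGSD output and does not perform the block jackknife needed for a significance test.

```shell
# Hypothetical helper (not part of the repository): compute D from
# already-summed ABBA and BABA site counts.
#   D = (nABBA - nBABA) / (nABBA + nBABA)
d_stat() {
  awk -v abba="$1" -v baba="$2" 'BEGIN {
    total = abba + baba
    # Without any informative sites D is undefined.
    if (total == 0) { print "NA"; exit }
    printf "%.4f\n", (abba - baba) / total
  }'
}

# Example with made-up counts: 30 ABBA sites and 10 BABA sites.
d_stat 30 10
```

A positive value indicates an excess of ABBA sites and a negative value an excess of BABA sites; which populations that implicates depends on the order of the BAM files given to the tool.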

lepisorus commented 6 years ago

Hi, thanks for your interest. The Tripsacum FASTA file was generated with ANGSD by extracting a consensus sequence from BAM files. Please refer to "angsd.abba-baba.sh" for instructions.
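The contents of "angsd.abba-baba.sh" are not reproduced in this thread, so the following is only a sketch of what consensus extraction with ANGSD's `-doFasta` option might look like, not the exact command used for the paper. The wrapper name `consensus_cmd` and the file names are hypothetical, the choice of `-doFasta 2` (most common base) is an assumption, and the mapping and base quality filters are simply copied from the ABBA-BABA command quoted above. The function prints the command rather than running it, so it can be inspected first.

```shell
# Hypothetical dry-run wrapper (an assumption, not the repository's script):
# prints an ANGSD consensus-extraction command for one BAM file.
#   $1 = input BAM file, $2 = output prefix
consensus_cmd() {
  # -doFasta 2 : take the most common base at each site (assumed choice)
  # -doCounts 1: base counting, which -doFasta 2 relies on
  printf 'angsd -i %s -doFasta 2 -doCounts 1 -minMapQ 30 -minQ 20 -out %s\n' "$1" "$2"
}

# Example with placeholder file names.
consensus_cmd Tripsacum.bam Tripsacum
```

To actually run the printed command, pipe the output to `sh` once the options have been checked against the installed ANGSD version.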

conniecl commented 6 years ago

Hi @lepisorus, thanks for your kind explanation. I used the command shown in "angsd.abba-baba.sh" on both the uniquely mapped BAM file and the raw BAM file, and the resulting Tripsacum.fa files are very different. Could this be caused by coverage? Do you have any suggestions about which BAM file to use? I'm confused about it. Besides, I also tried to calculate fd_introgression based on your scripts. Why did you choose Mexican Lowland as the reference population (P1) instead of another landrace? Thanks again.

lepisorus commented 6 years ago

Coverage matters when extracting consensus sequences. The Tripsacum data I used was from HapMap2; see Hufford et al. 2012, Nature Genetics.
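One way to see how much coverage differs between two BAM files before extracting a consensus is to summarize per-site depth. The sketch below is an illustration only: the helper name `depth_summary` is hypothetical, and it assumes three tab-separated columns (chromosome, position, depth) on standard input, which is the layout produced by `samtools depth -a`. The depth threshold is an arbitrary example value, not a recommendation from the authors.

```shell
# Hypothetical helper: summarize per-site depth read from stdin.
# Input columns: chrom <TAB> pos <TAB> depth
#   $1 = minimum depth counted as "covered" (defaults to 1)
depth_summary() {
  awk -v min="${1:-1}" '
    { n++; sum += $3; if ($3 >= min) covered++ }
    END {
      if (n == 0) { print "no sites"; exit }
      printf "sites=%d mean_depth=%.2f frac_ge_%d=%.3f\n", n, sum / n, min, covered / n
    }'
}

# Example on four made-up sites; with real data the input would come from
# something like: samtools depth -a Tripsacum.bam | depth_summary 3
printf 'chr1\t1\t0\nchr1\t2\t4\nchr1\t3\t2\nchr1\t4\t6\n' | depth_summary 3
```

Running this on the uniquely mapped and the raw BAM files side by side shows whether the filtered file leaves most positions with too few reads for a stable consensus call.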

We chose Mexican Lowland as the reference population because it is the population that did not receive introgression from mexicana.
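To make the role of P1 concrete, here is a rough sketch of the fd statistic as it is commonly defined from per-site derived-allele frequencies. It is not the repository's fd script. The helper name `fd_stat` and the three-column input (derived-allele frequency in P1, P2, and P3, with the outgroup assumed fixed for the ancestral allele) are assumptions made for illustration. P1 serves as the baseline against which excess allele sharing between P2 and P3 is measured, which is why a population without introgression is the natural choice.

```shell
# Hypothetical helper: fd from per-site derived-allele frequencies.
# Input (stdin): one site per line, columns = p1 p2 p3
#   numerator   = sum of (1-p1)*p2*p3 - p1*(1-p2)*p3      (ABBA - BABA)
#   denominator = same expression with p2 and p3 both replaced by
#                 pd = max(p2, p3), i.e. complete introgression
fd_stat() {
  awk '
    {
      p1 = $1 + 0; p2 = $2 + 0; p3 = $3 + 0
      pd = (p2 > p3) ? p2 : p3
      num += (1 - p1) * p2 * p3 - p1 * (1 - p2) * p3
      den += (1 - p1) * pd * pd - p1 * (1 - pd) * pd
    }
    END {
      if (den == 0) { print "NA"; exit }
      printf "%.4f\n", num / den
    }'
}

# Example on two made-up sites.
printf '0 0.5 1.0\n0.2 0.2 0.8\n' | fd_stat
```

If P1 itself carried introgressed alleles, the BABA term would rise and pull the estimate toward zero, so the signal in P2 would be underestimated.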


-- Li Wang, Postdoctoral Research Associate, Department of Ecology, Evolution and Organismal Biology, Iowa State University

Website: lepisorus.github.io