nlesc-ave / ave-rest-service

visualize (clustered) single-nucleotide variants across genomes
Apache License 2.0

Use consensus sequence for each sample #36

Closed sverhoeven closed 7 years ago

sverhoeven commented 7 years ago

Currently the reference sequence is taken and each SNP position is replaced by the IUPAC ambiguity code covering the ref and alt bases.

This ignores indels. We should generate a per-sample sequence that also takes indels into account.
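To illustrate the limitation, here is a minimal sketch of the SNP-only substitution. `iupac_code` and `apply_snp` are hypothetical helpers written for this comment; they are not part of ave-rest-service or bcftools. A SNP swaps one base for one code, so the sequence length never changes, which is why an indel cannot be expressed this way.

```shell
# Hypothetical helpers, for illustration only.

# Print the IUPAC code for a biallelic SNP given its ref and alt base.
iupac_code() {
  # Only single-base alleles can be encoded; indel alleles are rejected.
  if [ "${#1}" -ne 1 ] || [ "${#2}" -ne 1 ]; then
    echo "not a SNP: $1>$2" >&2
    return 1
  fi
  # Upper-case and sort the two bases so A/G and G/A give the same key.
  pair=$(printf '%s\n%s\n' "$1" "$2" | tr 'acgt' 'ACGT' | LC_ALL=C sort | tr -d '\n')
  case "$pair" in
    AA|CC|GG|TT) printf '%s\n' "${pair%?}" ;;  # same base, no ambiguity
    AG) echo R ;;
    CT) echo Y ;;
    CG) echo S ;;
    AT) echo W ;;
    GT) echo K ;;
    AC) echo M ;;
    *)  echo N ;;                              # unknown base
  esac
}

# Replace the base at 1-based position $2 of sequence $1 with the code for $3>$4.
apply_snp() {
  code=$(iupac_code "$3" "$4") || return 1
  printf '%s\n' "$1" | awk -v pos="$2" -v code="$code" \
    '{ print substr($0, 1, pos - 1) code substr($0, pos + 1) }'
}
```

For example, `apply_snp ACGT 2 C T` replaces the second base with the code for C/T, while `apply_snp ACGT 2 CG C` fails because a deletion has no single-character code.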

sverhoeven commented 7 years ago

Using bcftools consensus, which applies indels as well as SNPs:

# Recompress the reference with bgzip so samtools can index it
gunzip S_lycopersicum_chromosomes.2.40.fa.gz
bgzip S_lycopersicum_chromosomes.2.40.fa
samtools faidx S_lycopersicum_chromosomes.2.40.fa.gz
# Extract the region of interest (1-based coordinates)
samtools faidx S_lycopersicum_chromosomes.2.40.fa.gz SL2.40ch06:1000-100000 > ref.sam.fa
# Convert each sample VCF to an indexed BCF
bcftools view -O b -o RF_001_SZAXPI008746-45.bcf RF_001_SZAXPI008746-45.vcf.gz
bcftools index RF_001_SZAXPI008746-45.bcf
bcftools view -O b -o RF_002_SZAXPI009284-57.bcf RF_002_SZAXPI009284-57.vcf.gz
bcftools index RF_002_SZAXPI009284-57.bcf
# Merge both samples into one indexed multi-sample BCF
bcftools merge -O b -o RF_002_SZAXPI009284-57__RF_001_SZAXPI008746-45.bcf RF_001_SZAXPI008746-45.bcf RF_002_SZAXPI009284-57.bcf
bcftools index RF_002_SZAXPI009284-57__RF_001_SZAXPI008746-45.bcf
# Generate one consensus sequence per sample from the reference region
cat ref.sam.fa | bcftools consensus --sample RF_001_SZAXPI008746-45 --iupac-codes RF_002_SZAXPI009284-57__RF_001_SZAXPI008746-45.bcf > alt.45.fa
cat ref.sam.fa | bcftools consensus --sample RF_002_SZAXPI009284-57 --iupac-codes RF_002_SZAXPI009284-57__RF_001_SZAXPI008746-45.bcf > alt.57.fa
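The last two commands differ only in the sample name, so they could be generated for every sample in the merged file. The sketch below is one possible way to do that. `consensus_commands` is a hypothetical helper, the output name `<sample>.fa` is an assumption (the commands above used `alt.45.fa` and `alt.57.fa`), and sample names are assumed to contain no whitespace or shell metacharacters. It only prints the commands, so they can be inspected before running.

```shell
# Hypothetical helper: read sample names from stdin (one per line) and print
# one "bcftools consensus" command per sample.
#   $1 = reference FASTA for the region, $2 = merged multi-sample BCF
consensus_commands() {
  ref=$1
  bcf=$2
  while IFS= read -r sample; do
    # Skip empty lines in the sample list.
    [ -n "$sample" ] || continue
    printf 'bcftools consensus --fasta-ref %s --sample %s --iupac-codes %s > %s.fa\n' \
      "$ref" "$sample" "$bcf" "$sample"
  done
}
```

With bcftools installed, the sample list could come from `bcftools query -l`, for example `bcftools query -l merged.bcf | consensus_commands ref.sam.fa merged.bcf`, and the printed commands could then be piped to `sh` once they look right.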

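Extracting the region with twoBitToFa instead of samtools faidx failed: twoBitToFa uses a 0-based start, whereas bcftools expects the 1-based start that samtools faidx writes, so the ref allele of each variant did not match the reference sequence.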
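One possible workaround, untested against real twoBitToFa output, would be to shift the start in the FASTA header before piping the sequence to bcftools. The sketch below assumes the header has the form `>name:start-end` with a 0-based start. `to_one_based` and `fix_fasta_header` are hypothetical helpers.

```shell
# Hypothetical helper: turn "name:start-end" with a 0-based start into the
# same region with a 1-based start (the end coordinate is unchanged).
to_one_based() {
  name=${1%:*}        # everything before the last ':'
  range=${1##*:}      # "start-end"
  start=${range%-*}
  end=${range#*-}
  printf '%s:%s-%s\n' "$name" "$((start + 1))" "$end"
}

# Hypothetical helper: rewrite FASTA header lines on stdin, pass sequence
# lines through untouched.
fix_fasta_header() {
  while IFS= read -r line; do
    case "$line" in
      '>'*) printf '>%s\n' "$(to_one_based "${line#\>}")" ;;
      *)    printf '%s\n' "$line" ;;
    esac
  done
}
```

For the region used above, `to_one_based SL2.40ch06:999-100000` prints `SL2.40ch06:1000-100000`, which matches the header samtools faidx produces.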

sverhoeven commented 7 years ago

Duplicate of #26