This mini-project assesses various variant calling pipelines in order to arrive at an optimized, standardized variant calling pipeline for analyzing insect and pathogen data.
To design a standard variant calling pipeline that is effective and reliable for analyzing insect and pathogen data and that can be adopted across various regions of the world.
To evaluate existing variant calling pipelines, fill in existing gaps, and modify the pipelines to suit the analysis of insect and pathogen data.
These are also known as single nucleotide polymorphisms (SNPs) and involve the substitution of a single nucleotide.
They can be used to predict how an individual will respond to a particular class of drugs, their susceptibility to certain environmental stimuli, and their predisposition to certain diseases.
They can be helpful in tracking the inheritance of diseases within a family.
They are of two types:
Transition: an interchange between two purines (adenine and guanine) or between two pyrimidines (cytosine and thymine).
Transversion: an interchange between a purine and a pyrimidine, for example between adenine and cytosine.
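The two substitution types can be told apart with a few lines of code. The following is a minimal sketch rather than part of any pipeline in this project; the function name `classify_substitution` is hypothetical, and it assumes DNA bases given as single letters (A, C, G, T).

```python
# Purines and pyrimidines as described above (DNA bases only).
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base substitution.

    Hypothetical helper for illustration; raises ValueError on anything
    that is not a substitution between two different DNA bases.
    """
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError(f"expected single DNA bases, got {ref!r} and {alt!r}")
    if ref == alt:
        raise ValueError("reference and alternate bases are identical")
    # Same chemical class on both sides means a transition.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"
```

For example, `classify_substitution("A", "G")` is a transition, while `classify_substitution("A", "C")` is a transversion.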
These are classified as small genetic mutations. They involve the insertion or deletion of nucleotide bases and can range from one to hundreds of base pairs in length.
This involves genetic variation that occurs over a larger DNA sequence, including copy number variation and chromosomal rearrangements. Common types of structural variation include deletion, inversion, insertion, duplication, and copy number variation.
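As a rough illustration of how the variant classes above differ, the sketch below sorts a variant into a class using only the lengths of its reference and alternate alleles. The function name `classify_variant` and the 50 bp cutoff between small indels and structural variants are assumptions made for this example, not values defined by this project; adjust the cutoff to match the pipeline being evaluated.

```python
# Assumed cutoff (in base pairs) separating small indels from structural
# variants. This value is an illustrative assumption, not a project setting.
SV_MIN_LENGTH = 50


def classify_variant(ref: str, alt: str, sv_min_length: int = SV_MIN_LENGTH) -> str:
    """Classify a variant by allele lengths (hypothetical helper).

    Returns 'SNV', 'indel', 'structural', or 'other'.
    """
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    size = abs(len(alt) - len(ref))
    if size == 0:
        # Equal-length, multi-base change: outside the classes covered here.
        return "other"
    return "structural" if size >= sv_min_length else "indel"
```

This length-based view only captures insertions and deletions. Inversions and other rearrangements cannot be recognized from allele lengths alone.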
Variant calling is the process by which variants are identified from sequence data.
The goal is to obtain a VCF (Variant Call Format) file that shows the variants for all individuals in a given population.
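To make the structure of such a file concrete, here is a minimal sketch that reads VCF text and counts, per individual, the sites where a non-reference genotype was called. It uses only the Python standard library; the function name `count_variant_calls` is hypothetical. It assumes an uncompressed, tab-separated, multi-sample VCF with a `GT` field in the FORMAT column. A real pipeline would normally rely on a dedicated VCF library or tool instead.

```python
import re


def count_variant_calls(vcf_text: str) -> dict:
    """Count non-reference genotype calls per sample in VCF text.

    Hypothetical helper for illustration only.
    """
    samples = []
    counts = {}
    for line in vcf_text.splitlines():
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Sample names start at the 10th column of the header line.
            samples = fields[9:]
            counts = {name: 0 for name in samples}
            continue
        if len(fields) < 10:
            continue  # no genotype columns on this record
        format_keys = fields[8].split(":")
        if "GT" not in format_keys:
            continue
        gt_index = format_keys.index("GT")
        for name, value in zip(samples, fields[9:]):
            genotype = value.split(":")[gt_index]
            alleles = re.split(r"[/|]", genotype)
            # "0" is the reference allele and "." is a missing call.
            if any(allele not in ("0", ".") for allele in alleles):
                counts[name] += 1
    return counts
```

The function returns a dictionary keyed by sample name, so the per-individual totals can be compared across a population.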
Here is a link to our project road map