sr320 / course-fish546-2018


Alignment file types #60

Closed sr320 closed 5 years ago

sr320 commented 6 years ago

What is the difference between the SAM and BAM file types? Which one could you use to find variants (i.e., SNPs), and what would an example command look like?

What is one means by which the textbook indicates this can be visualized?

yaaminiv commented 6 years ago

SAM and BAM files both provide alignment data and both carry a header; SAM is plain text, while BAM is its compressed binary equivalent. You can use BAM files to find SNPs, using samtools and bcftools.

samtools mpileup -v --region (Generate genotype likelihoods for specified sites in the genome; -v: generate results in variant call format, --region: specify region to generate likelihoods)
bcftools call -v (Filter results so only variant sites remain)

samtools view can be used to visualize SAM or BAM files.

kcribari commented 6 years ago

SAM and BAM file types contain alignment data in plain text and binary format, respectively. samtools view can be used to visualize both SAM and BAM files. BAM files can be used to search for SNPs with the command samtools mpileup, which calls up all SNPs and genotypes for multiple individuals. You can use the option --region or -r to limit the pileup to certain areas.

Jeremyfishb commented 6 years ago

SAM and BAM files both contain alignment data. SAM files are plain text and BAM files are binary. To search for SNPs you use sorted BAM files. You can sort them using samtools like this:

samtools sort somefile.bam somefile_sorted

To search, like this:

samtools mpileup -v (or -g) --fasta-ref referencegenome.fasta somefile.bam > somefile.vcf.gz

Then to call, like this:

bcftools call -v -m somefile.vcf.gz > somefile_calls.vcf.gz

hgloiselle commented 6 years ago

Both SAM and BAM files contain alignment data. SAM files are plain text while BAM files are binary. I used BAM files with samtools mpileup in order to find SNPs. samtools view can be used to visualize them.

wsano16 commented 6 years ago

SAM and BAM are the two primary file types we see when working with alignment data. The primary difference between the two is that SAM files are plain text (with a header) and BAM files are binary. The binary files are smaller and more efficient when running samtools.
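One way to see the plain-text versus binary distinction for yourself is to look at the first bytes of a file. The following is a minimal sketch, assuming a shell with `gzip`, `head`, and `od` available. The file names `tiny.sam` and `tiny.sam.gz` and the single read are made up, and plain `gzip` only stands in for the BGZF compression that BAM uses (BGZF is gzip-compatible, so both begin with the bytes `1f 8b`); the compressed file here is not a real BAM.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
workdir=$(mktemp -d)

# Build a tiny, hypothetical SAM file: two header lines and one alignment record.
printf '@HD\tVN:1.6\tSO:coordinate\n'  > "$workdir/tiny.sam"
printf '@SQ\tSN:chr1\tLN:1000\n'      >> "$workdir/tiny.sam"
printf 'read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n' >> "$workdir/tiny.sam"

# Stand-in for a binary alignment file: plain gzip, NOT a real BAM.
gzip -c "$workdir/tiny.sam" > "$workdir/tiny.sam.gz"

# Print the first two bytes of a file as hex digits.
first_two_bytes() {
    head -c 2 "$1" | od -An -tx1 | tr -d ' \n'
}

sam_magic=$(first_two_bytes "$workdir/tiny.sam")     # the characters "@H"
gz_magic=$(first_two_bytes "$workdir/tiny.sam.gz")   # the gzip magic number

echo "text file starts with:       $sam_magic"
echo "compressed file starts with: $gz_magic"
```

The text file opens with readable characters, while the compressed one opens with a magic number that only makes sense to a decompressor, which is why a tool such as samtools view is needed to read a BAM.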

The first step of calling SNPs is to pass a BAM file to samtools mpileup along with a reference .fasta file. You can either specify a region or process every site in the genome.

samtools mpileup -v --region ________ --fasta-ref ________ abcd.bam > abcd.vcf.gz

The VCF file can then be fed to bcftools call, which uses the genotype information in the VCF file to identify variant sites.

bcftools call -v -m _________.vcf.gz > _______calls.vcf.gz
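The two steps described above can be strung together in a small script. This is a sketch under assumptions, not a tested pipeline: `ref.fasta`, `sample.bam`, and the region string are hypothetical placeholders, and the script only prints the commands unless samtools, bcftools, and both input files are actually present. The `-O z` flag is added so that the final file really is compressed, matching its `.vcf.gz` name. As far as I know, newer samtools releases have moved VCF output out of samtools mpileup, with `bcftools mpileup` covering the same step there.

```shell
# Hypothetical inputs; replace with real paths before running.
ref="ref.fasta"
bam="sample.bam"
region="1:1000-2000"

# Step 1: genotype likelihoods in VCF form (-v) for one region.
step1="samtools mpileup -v --region $region --fasta-ref $ref $bam"
# Step 2: keep only variant sites (-v) using the multiallelic caller (-m),
# writing compressed VCF (-O z).
step2="bcftools call -v -m -O z sample.vcf.gz"

if command -v samtools >/dev/null 2>&1 &&
   command -v bcftools >/dev/null 2>&1 &&
   [ -f "$ref" ] && [ -f "$bam" ]; then
    # Tools and inputs are present: run both steps for real.
    $step1 > sample.vcf.gz
    $step2 > sample_calls.vcf.gz
else
    # Dry run: show what would be executed.
    echo "$step1 > sample.vcf.gz"
    echo "$step2 > sample_calls.vcf.gz"
fi
```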

grace-ac commented 6 years ago

SAM (Sequence Alignment/Map) and BAM (Binary Alignment/Map) are standard formats for sequence alignment data. SAM files are text with a header section and an alignment section. BAM files are binary.
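Because SAM is plain text, its two sections can be told apart with ordinary shell tools: header lines start with `@`, and each alignment line carries at least 11 tab-separated fields. A minimal sketch, assuming a POSIX-style shell with `grep` and `awk`; the file `example.sam` and its single read are made up for illustration.

```shell
workdir=$(mktemp -d)
sam="$workdir/example.sam"

# Hypothetical SAM content: two header lines, then one alignment record.
printf '@HD\tVN:1.6\tSO:coordinate\n'  > "$sam"
printf '@SQ\tSN:chr1\tLN:1000\n'      >> "$sam"
printf 'read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n' >> "$sam"

# Header section: lines beginning with "@".
header_lines=$(grep -c '^@' "$sam")

# Alignment section: everything else. Print read name (field 1),
# reference name (field 3), position (field 4), and CIGAR string (field 6).
records=$(grep -v '^@' "$sam" | awk -F '\t' '{ print $1, $3, $4, $6 }')

echo "header lines: $header_lines"
echo "records:      $records"
```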

You can use BAM to find variants such as SNPs using:

samtools mpileup -v --region _________ \
--fasta-ref name.fasta name.bam \
> name.vcf.gz

You can use samtools view to look at SAM or BAM files.

calderatta commented 6 years ago

SAM and BAM are file types for sequence alignment data. BAM file data is stored in binary format, while SAM files are in plain text format and contain a header with metadata. BAM files can be used to find variants like SNPs, e.g.:

samtools mpileup -v --no-BAQ --region [region] --fasta-ref [name].fasta [name].bam > [name].vcf.gz

We can visualize SAM and BAM files using the subcommand samtools view [.sam or .bam file].

kimh11 commented 6 years ago

SAM and BAM files contain the same alignment information, but the BAM file is in binary format, making it smaller and more efficient for computers to analyze.

BAM files can be used to search for SNPs using a two-step process: samtools mpileup -v/-g and then bcftools call -v

You can look at the variants using the Integrative Genomics Viewer (IGV).

jgardn92 commented 6 years ago

SAM and BAM are both types of files for storing sequence alignment data. SAM files are plain text and include a header region with relevant information. BAM files are binary and are used with many of the further processing commands in samtools. You can visualize both file types using samtools view. You can use BAM files to search for SNPs using samtools mpileup.

melodysyue commented 6 years ago

What is the difference between SAM and BAM filetypes?

Both SAM (Sequence Alignment/Map) and BAM (the binary analog of SAM) are standard formats for storing sequencing reads mapped to a reference. They contain an extensive amount of metadata about the samples, such as reference sequence name, sequence length, read group, sequencing platform, and the programs used to generate the SAM/BAM files. BAM is smaller than SAM, making BAM more efficient for downstream analyses. SAM is 1-based, BAM is 0-based.

Which one could you use to find variants (ie SNPs) and what would an example command look like?

I would use BAM files to find variants. samtools uses the Base Alignment Quality algorithm to prevent erroneous variant calls due to misalignment. First step: use the samtools mpileup subcommand to create pileups from BAM files.

samtools mpileup -u -v --region [region] \
--fasta-ref ref.fasta sample.bam \
> sample.vcf.gz

Second step: the intermediate results are then fed into bcftools call

bcftools call -v -m sample.vcf.gz > sample_calls.vcf.gz

What is one means by which the textbook indicates this can be visualized?

samtools view is the general tool for viewing and converting SAM/BAM files.
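The 1-based versus 0-based remark in this answer is easy to trip over when converting positions by hand. A small sketch of the arithmetic, assuming a POSIX-style shell; the position value is made up.

```shell
# Leftmost mapped position as written in a SAM record (1-based); made-up value.
sam_pos=100

# The same position as stored inside a BAM record (0-based).
bam_pos=$((sam_pos - 1))

# Converting back adds the one again.
round_trip=$((bam_pos + 1))

echo "SAM position $sam_pos corresponds to BAM position $bam_pos"
```

samtools view prints positions in the SAM convention whichever format it reads, so the offset mostly matters when parsing BAM records directly.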

laurahspencer commented 6 years ago

BAM is the binary format of the SAM file, useful since file sizes are smaller. samtools seems to be the favorite program for working with these file types. To view them you can use samtools tview or IGV. To call SNPs from BAM files, use the samtools mpileup subcommand.

magobu commented 6 years ago

SAM (Sequence Alignment/Map) is one of the most used formats when dealing with high-throughput alignment data. BAM is the binary version of SAM.

To find variants you would use the BAM format. Two sub-commands are needed, the first being samtools mpileup.

You can visualize these files using the samtools sub-commands samtools view or samtools tview. IGV will work for visualization too.