sr320 / course-fish546-2018


Bedtools suite #61

Closed sr320 closed 5 years ago

sr320 commented 6 years ago

What do you consider the five most relevant BedTools commands based on your research interest?

Please list them and indicate what each one does.

yaaminiv commented 6 years ago

BEDTools allows the user to work directly with data in BED, GTF and GFF file types. The five BEDTools commands most relevant to me:

magobu commented 6 years ago

Bedtools allows you to perform genomic analysis on multiple files in a variety of formats (BAM, BED, GFF/GTF, VCF). Some of the most useful commands or subcommands that could be relevant to me:

Jeremyfishb commented 6 years ago

Though I won't be using bedtools for my proteomics analysis, here are 5 sub-commands:

intersect tells how much overlap occurs between two sets of ranges
getfasta extracts subsets of a genome
genomecov summarizes coverage over a genome or chromosome
multicov counts how many alignments from a number of BAM files overlap each interval in a BED file
merge pieces together overlapping ranges
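To make the `intersect` and `merge` descriptions above concrete, here is a minimal pure-Python sketch of the interval arithmetic involved. It is not how bedtools is implemented; it assumes all intervals sit on one chromosome, uses BED-style 0-based half-open coordinates, and the function names and the `genes`/`reads` example data are made up for illustration.

```python
# Illustrative only: intervals are (start, end) tuples on a single chromosome,
# BED-style (0-based start, end not included).

def intersect(a, b):
    """Return the overlapping portion of every overlapping pair from a and b."""
    overlaps = []
    for a_start, a_end in a:
        for b_start, b_end in b:
            start, end = max(a_start, b_start), min(a_end, b_end)
            if start < end:  # keep only non-empty overlaps
                overlaps.append((start, end))
    return overlaps


def merge(intervals):
    """Collapse overlapping or directly adjacent intervals into single ranges."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extend the previous range instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical example data
genes = [(100, 200), (150, 300), (500, 600)]
reads = [(180, 220), (590, 700)]

print(intersect(genes, reads))  # [(180, 200), (180, 220), (590, 600)]
print(merge(genes))             # [(100, 300), (500, 600)]
```

The nested loop is quadratic, which is fine for a toy example; real tools rely on sorted input or indexing to avoid comparing every pair.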

hgloiselle commented 5 years ago

genomecov calculates the level of coverage over a whole genome

intersect finds overlaps

random creates random intervals in a genome

multicov counts coverage at a certain site in multiple BAM files

getfasta uses a FASTA file to extract part of a genome

wsano16 commented 5 years ago

I could imagine myself using BEDtools in the following ways:
genomecov - to determine whether I have uniform sequencing coverage across a genome during WGS
intersect - to find antibiotic genes in a reference genome
jaccard - to compare the genomes of geographically isolated bacterial populations
bamtobed - to facilitate switching between samtools and bedtools
getfasta - to extract sequence information from overlaps to BLAST against well-annotated bacterial genomes
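For the `jaccard` idea mentioned above, the statistic is the number of base pairs shared by two interval sets divided by the number of base pairs in their union. The sketch below shows that calculation in plain Python under simplifying assumptions (one chromosome, 0-based half-open intervals, small toy coordinates); the function names and example intervals are hypothetical, and it is meant to illustrate the idea rather than reproduce bedtools output.

```python
# Illustrative only: base-pair Jaccard similarity between two interval sets.

def covered_bases(intervals):
    """Return the set of positions covered by at least one interval."""
    bases = set()
    for start, end in intervals:
        bases.update(range(start, end))
    return bases


def jaccard(a, b):
    """Shared base pairs divided by base pairs in the union of a and b."""
    a_bases, b_bases = covered_bases(a), covered_bases(b)
    union = a_bases | b_bases
    if not union:
        return 0.0  # two empty sets: define similarity as 0
    return len(a_bases & b_bases) / len(union)


# Hypothetical example: two populations sharing half of a 100 bp feature
population_a = [(0, 100)]
population_b = [(50, 150)]
print(jaccard(population_a, population_b))  # 50 shared / 150 in union
```

Storing every covered position in a set only works for small examples; it is used here because it keeps the definition easy to see.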

grace-ac commented 5 years ago

intersect --> finds overlaps between ranges
annotate --> finds how much coverage each file has over another input file
getfasta --> pulls out sequences for a given set of ranges
multiinter --> finds overlaps of a given feature between files
merge --> merges overlapping ranges into one range

kcribari commented 5 years ago

I will not be using BedTools, but the ones I found interesting are:

flank - creates new ranges of a given number of basepairs on each side of an indicated feature
fisher - runs Fisher's exact test on the number of overlaps between two files
groupby - creates summary statistics based on groups in an indicated column of data
overlap - shows the overlap or distance between features in a file
subtract - searches for features in file 2 that overlap with features in file 1. The overlapping portions are removed from the features in file 1

calderatta commented 5 years ago

I will likely not be using BedTools for my current project.
intersect - finds the overlapping regions between two range files
slop - adds basepairs to the ends of ranges in a single .bed file
genomecov - summarizes feature coverage, providing depth, bases covered, chromosome bases, and proportion of bases covered
annotate - shows how much coverage one file has over another
multicov - counts alignments from many BAM alignment files that overlap the ranges in a BED file

kimh11 commented 5 years ago

I would use BedTools for these commands:

genomecov: get depth information at each genome position using the -d flag
intersect: find overlap between the sequence alignments and genes
flank: create ranges of a specific number of bases before or after a given range
getfasta: extract sequences in fasta format for a given range
sort: sort by chromosome or scaffold size or by score
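As a rough picture of what per-position depth (the `genomecov` item above) means, the sketch below counts how many alignments cover each base of a tiny chromosome. It is a plain-Python illustration with assumed inputs (alignments as 0-based half-open `(start, end)` tuples, a made-up chromosome length), not a reimplementation of bedtools or its output format.

```python
# Illustrative only: per-base depth for alignments on one small chromosome.

def per_base_depth(alignments, chrom_size):
    """Return a list where element i is the number of alignments covering base i."""
    depth = [0] * chrom_size
    for start, end in alignments:
        # Clip each alignment to the chromosome before counting.
        for pos in range(max(0, start), min(end, chrom_size)):
            depth[pos] += 1
    return depth


# Hypothetical example: two overlapping reads on an 8 bp chromosome
reads_on_chr = [(0, 4), (2, 6)]
print(per_base_depth(reads_on_chr, 8))  # [1, 1, 2, 2, 1, 1, 0, 0]
```

Positions with depth 0 at the end of the list are the kind of uncovered region this summary is useful for spotting.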

jgardn92 commented 5 years ago

I won't be using bedtools in my project but 5 interesting commands are:
intersect - finds overlaps between ranges (which can also return all non-overlapping ranges with the -v option)
slop - grows a range by a specified number of basepairs, can expand only the left or right side with the options -l and -r
flank - finds flanking regions of ranges, particularly useful for finding promoter regions of genes
genomecov - summarizes genome coverage over the whole genome
merge - merges overlapping ranges into one range
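The difference between `slop` and `flank` described above is easy to mix up, so here is a small pure-Python sketch of the two operations. It assumes 0-based half-open coordinates and a known chromosome length used for clipping; the `left`/`right` parameters loosely mirror the `-l`/`-r` options, and the function names, the decision to drop empty flanks, and the example gene are illustrative assumptions rather than bedtools behavior.

```python
# Illustrative only: slop grows a range, flank returns the neighbouring ranges.

def slop(interval, chrom_size, left=0, right=0):
    """Extend an interval outward, clipped to the chromosome boundaries."""
    start, end = interval
    return (max(0, start - left), min(chrom_size, end + right))


def flank(interval, chrom_size, left=0, right=0):
    """Return the regions next to an interval, excluding the interval itself."""
    start, end = interval
    candidates = [
        (max(0, start - left), start),         # upstream flank
        (end, min(chrom_size, end + right)),   # downstream flank
    ]
    # Assumption for this sketch: drop flanks that end up empty.
    return [(s, e) for s, e in candidates if s < e]


# Hypothetical gene on a 2500 bp chromosome
gene = (1000, 2000)
print(slop(gene, 2500, left=500, right=500))  # (500, 2500)
print(flank(gene, 2500, left=500))            # [(500, 1000)]
```

The upstream flank in the last line is the kind of region one might treat as a candidate promoter, while `slop` would return the gene plus that region as one range.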

melodysyue commented 5 years ago

I won't be using bedtools for my metabarcoding projects, but the following subcommands are very interesting to me:
intersect - compute the overlaps between two sets of ranges
slop - grow ranges
flank - extract flanking ranges
genomecov - summarize the coverage of features along chromosome sequences
merge - merge overlapping ranges into a single range

laurahspencer commented 5 years ago

intersect - can be used to extract overlapping regions between .bed files. Adding the -wa and -wb flags returns the entire ranges, not just the overlapping portions. -s specifies that features must be on the same strand.
genomecov - summarizes coverage of features along chromosome sequences, helpful to see % coverage stats.
merge - merges overlapping ranges into a single range, basically removing duplicated data.
annotate - annotates coverage of one track file against another. Could be used to look at
unionbedg - merges multiple BedGraph files into one file, seems very useful.