MediciPrime closed this issue 5 years ago
Here are the tools; I've checked the ones that should be wrapped. I've left unchecked the ones that duplicate FastQC, or that are utilities (like converting BAM to FASTQ) that we don't immediately need or that might be faster with other methods (e.g., seqtk will be way faster for BAM -> FASTQ).
@jfear, not sure if you want the *_profile.py scripts, which report on CIGAR strings (actually it would be way more efficient just to do all the profiling in one pass of the BAM, but not worth rewriting...). Also not sure if you'd want bam_stat.py.
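For reference, here is a rough sketch of what "one pass" profiling could look like. This is not how RSeQC implements its *_profile.py scripts (those report per-position distributions); it only tallies insertion, deletion, and soft-clip operations from the CIGAR column of SAM text, e.g. piped from `samtools view`. The function name `profile_cigars` and the counter keys are made up for illustration.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an op code (SAM spec op set).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def profile_cigars(sam_lines):
    """Tally insertion (I), deletion (D), and soft-clip (S) operations
    across all records in a single pass over SAM-formatted text lines."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        # CIGAR is the 6th SAM column; "*" means no alignment is available.
        if len(fields) < 6 or fields[5] == "*":
            continue
        counts["reads"] += 1
        for length, op in CIGAR_OP.findall(fields[5]):
            if op in "IDS":
                counts[op + "_events"] += 1
                counts[op + "_bases"] += int(length)
    return counts
```

Something like `samtools view sample.bam | python one_pass.py` (reading `sys.stdin` and passing it to `profile_cigars`) would then cover clipping, insertion, and deletion counts in one read of the file instead of three.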
bam2fq.py
bam2wig.py
bam_stat.py
clipping_profile.py
deletion_profile.py
divide_bam.py
FPKM_count.py
geneBody_coverage.py
geneBody_coverage2.py
infer_experiment.py
inner_distance.py
insertion_profile.py
junction_annotation.py
junction_saturation.py
mismatch_profile.py
normalize_bigwig.py
overlay_bigwig.py
read_distribution.py
read_duplication.py
read_GC.py
read_hexamer.py
read_NVC.py
read_quality.py
RNA_fragment_size.py
RPKM_count.py
RPKM_saturation.py
split_bam.py
split_paired_bam.py
tin.py
Sending the list to Zhenxia in our lab because she has used it the most. She is out today and probably tomorrow, so she may not get back to you until Monday.
I know Zhenxia uses bam_stat, but I don't think the output is too unique. Probably can get by without it.
It depends on the purpose.
I use bam_stat.py to count spliced reads and thus detect DNA contamination (DNA should have fewer spliced reads),
geneBody_coverage.py to check for 5' or 3' bias,
tin.py to measure RNA integrity,
infer_experiment.py to validate the library layout and strandedness.
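If it helps with the wrappers, the four tools above could be invoked roughly as follows. This is only a sketch: the helper `rseqc_commands` and the file names are hypothetical, and only the basic `-i` (BAM), `-r` (BED12 gene model), and `-o` (output prefix) options are used, so check each script's `--help` for the full option list. As far as I know, geneBody_coverage.py and tin.py also expect a sorted, indexed BAM.

```python
import shlex


def rseqc_commands(bam, bed, prefix):
    """Build argument lists for the four RSeQC tools discussed above.

    bam    -- path to the alignment file
    bed    -- path to a BED12 gene model
    prefix -- output prefix (used by geneBody_coverage.py)
    """
    return {
        # Mapping summary, including spliced vs. non-spliced read counts.
        "bam_stat": ["bam_stat.py", "-i", bam],
        # Coverage along the gene body, to spot 5' or 3' bias.
        "geneBody_coverage": [
            "geneBody_coverage.py", "-r", bed, "-i", bam, "-o", prefix,
        ],
        # Transcript integrity number.
        "tin": ["tin.py", "-i", bam, "-r", bed],
        # Library layout and strandedness.
        "infer_experiment": ["infer_experiment.py", "-r", bed, "-i", bam],
    }


if __name__ == "__main__":
    # Hypothetical paths, printed as shell-ready command lines.
    for name, cmd in rseqc_commands("sample.bam", "genes.bed", "qc/sample").items():
        print(name, "->", shlex.join(cmd))
```

Returning argument lists rather than strings keeps them safe to pass to `subprocess.run` directly, and `shlex.join` (Python 3.8+) gives a quoted string for a Snakemake `shell:` block if that is preferred.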
http://rseqc.sourceforge.net/ <-- RSeQC Documentation w/all tools