lcdb / lcdb-workflows

DEPRECATED. Please see https://github.com/lcdb/lcdb-wf
MIT License

RSeQC has many tools...which ones do we focus on? #15

Closed MediciPrime closed 5 years ago

MediciPrime commented 8 years ago

http://rseqc.sourceforge.net/ <-- RSeQC Documentation w/all tools

daler commented 8 years ago

Here are the tools, and I've checked the ones that should be wrapped. I've left unchecked the ones that duplicate FastQC, or that are utilities (like converting BAM to FASTQ) that we don't immediately need or that might be faster with other methods (e.g., seqtk will be way faster for BAM -> FASTQ).

@jfear, not sure if you want the *_profile.py scripts, which report on CIGAR strings (it would actually be way more efficient to do all the profiling in one pass over the BAM, but that's not worth re-writing...). Also not sure if you'd want bam_stat.py.
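To illustrate the "one pass" idea, here is a minimal sketch that tallies every CIGAR operation while reading the alignments once. It is not how RSeQC's *_profile.py scripts work and it does not reproduce their per-position output; the function name `profile_cigars` is hypothetical, and the input is assumed to be SAM-formatted text lines (for example, streamed from `samtools view -h`).

```python
import re
from collections import Counter

# Matches one CIGAR operation, e.g. "76M" or "3S".
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def profile_cigars(sam_lines):
    """Tally CIGAR operations over SAM text records in a single pass.

    Returns two Counters keyed by CIGAR operation letter:
    the number of reads containing the operation, and the
    total number of bases covered by the operation.
    """
    reads_with_op = Counter()
    bases_per_op = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6 or fields[5] == "*":
            continue  # malformed record or no alignment
        ops = CIGAR_OP.findall(fields[5])
        for length, op in ops:
            bases_per_op[op] += int(length)
        # Count each operation type at most once per read.
        for op in {op for _, op in ops}:
            reads_with_op[op] += 1
    return reads_with_op, bases_per_op


# Hypothetical records, only to show the call.
example = [
    "@HD\tVN:1.0",
    "r1\t0\tchr1\t100\t60\t5S45M\t*\t0\t0\t*\t*",
    "r2\t0\tchr1\t200\t60\t20M2I28M\t*\t0\t0\t*\t*",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]
reads_with_op, bases_per_op = profile_cigars(example)
```

Clipping (`S`), insertion (`I`), and deletion (`D`) summaries all fall out of the same loop, which is the efficiency gain being described: the file is read once instead of once per script.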

jfear commented 8 years ago

Sending the list to Zhenxia in our lab because she has used RSeQC the most. She is out today and probably tomorrow, so she may not get back to you until Monday.

jfear commented 8 years ago

I know Zhenxia uses bam_stat.py, but I don't think its output is particularly unique. We can probably get by without it.

chenzhenxia119 commented 8 years ago

It depends on the purpose.
I use bam_stat.py to count spliced reads and thereby detect DNA contamination (DNA-derived reads should include fewer spliced reads), geneBody_coverage.py to check for 5' or 3' bias, tin.py to measure RNA integrity, and infer_experiment.py to validate the library layout and strandedness.
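For reference while wrapping these, a rough sketch of how the four tools named above might be invoked on one sample. The helper `rseqc_commands` and the file names are hypothetical; the `-i` (input BAM), `-r` (reference gene model in BED format), and `-o` (output prefix) flags are written as I recall them from the RSeQC documentation, so check each script's `--help` before relying on them.

```python
import shlex


def rseqc_commands(bam, bed, out_prefix):
    """Build argument lists for the four RSeQC checks, keyed by purpose."""
    return {
        # Spliced vs. non-spliced read counts (DNA contamination check).
        "bam_stat": ["bam_stat.py", "-i", bam],
        # 5'/3' coverage bias across gene bodies.
        "geneBody_coverage": [
            "geneBody_coverage.py", "-i", bam, "-r", bed, "-o", out_prefix,
        ],
        # Transcript integrity number (RNA integrity).
        "tin": ["tin.py", "-i", bam, "-r", bed],
        # Library layout and strandedness.
        "infer_experiment": ["infer_experiment.py", "-i", bam, "-r", bed],
    }


# Hypothetical file names, only to show the call.
commands = rseqc_commands("sample.bam", "genes.bed", "qc/sample")
for name, argv in commands.items():
    print(name, "->", shlex.join(argv))
```

Returning argument lists rather than shell strings keeps the commands easy to pass to `subprocess.run` or to template into a workflow rule later.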