FASTQC replacement stats

A partial solution was implemented in #183: 'bases by cycle' and 'quality by cycle' matrices for each read are now added to the JSON output. The last thing I'd like to add before closing is something for "over-represented sequences". The idea is to store all kmers found in the first N sequences, then count their occurrences in the whole dataset and print out anything that reaches a certain threshold of occurrence, say 0.1%. Parameters might be:

-k [ --kmer ] arg (=36)
-r [ --kmer-offset ] arg (=1)
-n [ --number_of_reads ] arg (=500000)  number of reads to establish kmer set
-o [ --occurence ] arg (=0.001)  The occurrence of a kmer to output
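
To make the proposal concrete, here is a rough Python sketch of the two-pass idea (not HTStream code). The function names are hypothetical, and a few things the description leaves open are filled in as assumptions: `--kmer-offset` is treated as the step between kmer start positions when building the candidate set, the counting pass checks every position of every read, and "occurrence" is taken to mean the fraction of reads that contain the kmer at least once.

```python
from collections import Counter


def collect_kmers(reads, k=36, offset=1, n_reads=500000):
    """Build the candidate set from the first n_reads reads.

    Assumption: `offset` is the step between successive kmer start positions.
    """
    kmers = set()
    for i, seq in enumerate(reads):
        if i >= n_reads:
            break
        for start in range(0, len(seq) - k + 1, offset):
            kmers.add(seq[start:start + k])
    return kmers


def overrepresented_kmers(reads, k=36, offset=1, n_reads=500000, min_fraction=0.001):
    """Return {kmer: fraction_of_reads} for candidates at or above min_fraction.

    Assumption: a kmer is counted at most once per read, so the reported value
    is the fraction of reads containing it.
    """
    reads = list(reads)  # held in memory for simplicity; a real tool would stream
    total = len(reads)
    if total == 0:
        return {}
    candidates = collect_kmers(reads, k, offset, n_reads)
    counts = Counter()
    for seq in reads:
        # Assumption: the counting pass looks at every position (step of 1).
        seen = set()
        for start in range(len(seq) - k + 1):
            kmer = seq[start:start + k]
            if kmer in candidates:
                seen.add(kmer)
        counts.update(seen)
    return {
        kmer: count / total
        for kmer, count in counts.items()
        if count / total >= min_fraction
    }


if __name__ == "__main__":
    toy_reads = ["ACGTACGT", "TTTTACGT", "GGGGGGGG", "ACGTTTTT"]
    hits = overrepresented_kmers(toy_reads, k=4, offset=1, n_reads=1, min_fraction=0.5)
    for kmer, fraction in sorted(hits.items()):
        print(kmer, fraction)
```

With the defaults above, the candidate set could get large (up to one kmer per position per read for 500000 reads), so memory use is probably the main thing to watch when picking `-n` and `-r`.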

s4hts / HTStream

FASTQC replacement stats #106