tzcoolman / FACS-OLD


Sampling datasets before running queries on them #17

Closed brainstorm closed 11 years ago

brainstorm commented 12 years ago

Datasets can be huge, and for contamination screening it is often enough to screen a subsample of the file.

Fastq_screen goes through the file and subsamples it to a given number of reads (2,000,000 is a reasonable default).
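To make the request concrete, here is a minimal sketch of subsampling a FASTQ stream down to a fixed number of reads. It uses reservoir sampling, which is one common single-pass technique; it is not taken from Fastq_screen or FACS, and the function names `read_fastq` and `subsample_reads` are hypothetical.

```python
import random
from itertools import islice


def read_fastq(lines):
    """Yield FASTQ records as 4-line tuples from an iterable of lines."""
    it = iter(lines)
    while True:
        record = tuple(islice(it, 4))
        if len(record) < 4:  # end of input (or a truncated final record)
            return
        yield record


def subsample_reads(records, n, seed=0):
    """Keep at most n records, chosen uniformly at random, in one pass.

    Reservoir sampling (Algorithm R): the total number of reads does not
    need to be known in advance, and only n records are held in memory.
    """
    rng = random.Random(seed)
    reservoir = []
    for i, rec in enumerate(records):
        if i < n:
            reservoir.append(rec)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < n:
                reservoir[j] = rec
    return reservoir
```

With a real file this could be called as `subsample_reads(read_fastq(open("reads.fastq")), 2_000_000)`, where `reads.fastq` is a placeholder path. If the file holds fewer reads than requested, all of them are returned.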

tzcoolman commented 12 years ago

I don't understand what you are talking about. My program already has a sampling rate option, '-s'. '0.5' means examine 50% of the total; '1' means 100%.
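For comparison, rate-based sampling can be sketched as below. This is only an illustration of the idea, assuming each read is kept independently with probability equal to the rate; the thread does not show how '-s' is implemented in FACS, and `sample_fraction` is a hypothetical name.

```python
import random


def sample_fraction(records, rate, seed=0):
    """Yield each record with probability `rate` (0.0 to 1.0).

    Assumption: independent per-read selection. The output size is
    therefore only approximately rate * total, not exact.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    rng = random.Random(seed)
    for rec in records:
        # random() returns a value in [0.0, 1.0), so rate=1 keeps everything
        # and rate=0 keeps nothing.
        if rng.random() < rate:
            yield rec
```

Unlike a fixed read count, a rate scales with the input: 0.5 of a very large file is still a very large sample, which is the practical difference between the two approaches discussed here.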

brainstorm commented 11 years ago

You're right! My bad, I should have tested it. Closing...