lennijusten closed 9 months ago
~TODO: update dependencies section of readme with RiboDetector install instructions~
@jeffkaufman I addressed the requested changes in ec27488 and updated the README with a description of the output data.
Note that I changed the output to be the number of rRNA reads instead of the number of non-rRNA reads. It was possible to do this with minimal additional computation since I tallied the total sample reads during the average read length step.
I will need to re-run Wu 2020 and Bohl 2022 with this new output configuration.
Initial implementation of a new stage called `ribocounts` that saves a text file to AWS for each sample with the number of non-rRNA reads in the sample. Tested on Bohl 2022 and Wu 2019.

RiboDetector takes a `--len` argument that should be set to roughly the average length of the input reads. I've added an option to calculate the average read length for each input file, but it requires an additional pass over the unzipped input files. Open to discussing how we can improve this step.
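As a starting point for that discussion, here is a minimal sketch of how the read tally and the average read length could come from a single streaming pass that reads gzipped FASTQ directly. It is not the code in this PR. The function names `fastq_read_stats` and `rrna_read_count` are hypothetical, and it assumes standard 4-line FASTQ records with no wrapped sequence lines.

```python
import gzip
from typing import Tuple


def fastq_read_stats(path: str) -> Tuple[int, float]:
    """Return (total_reads, mean_read_length) for one FASTQ file.

    Hypothetical helper. Assumes 4 lines per record; files ending in
    ".gz" are streamed through gzip without being unzipped to disk.
    """
    opener = gzip.open if path.endswith(".gz") else open
    total_reads = 0
    total_bases = 0
    with opener(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            # Line 1 of each 4-line record (0-indexed) is the sequence.
            if line_number % 4 == 1:
                total_reads += 1
                total_bases += len(line.rstrip("\r\n"))
    mean_length = total_bases / total_reads if total_reads else 0.0
    return total_reads, mean_length


def rrna_read_count(total_reads: int, non_rrna_reads: int) -> int:
    """rRNA reads are the total reads minus the non-rRNA reads."""
    return total_reads - non_rrna_reads
```

The rounded mean could then be passed as the `--len` value, and the same `total_reads` tally is what makes the rRNA count a simple subtraction once the non-rRNA reads are known.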