Collect sequence counts during prep file generation

Thanks both.

@antgonza (1) Yes, the code looks at all the outputs from mg-scripts, and I think it would make sense for it to live there. However, when "seqpro" (the script that creates prep files) came to be, we didn't have a fully fledged mg-scripts structure/repo. Happy to move it over, but I don't think that's critical for now. (2) This is not intended to be run in a notebook; it is intended to be run at the end of mg-scripts, when the prep file is generated, and parsing the log files should be fairly quick. For example, in a run you would:

1. Call bclconvert
2. Call QC (adapter trimming and human filtering)
3. Run seqpro (to generate the Qiita prep file and sequence counts based on the run outputs)
4. Run MultiQC
5. Copy sequence data to corresponding Qiita studies together with the generated prep.
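
To make the ordering concrete, here is a minimal sketch of that run as a sequence of steps. It is not the mg-scripts or seqpro implementation: every function name, the `RUN_STEPS` list, and the context keys are hypothetical placeholders, and each step only records that it ran instead of calling the real tool. The point it illustrates is that seqpro runs after QC (it reads the QC outputs) and before the copy to Qiita (which needs the generated prep).

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical shared state: flags recording which outputs exist so far.
Context = Dict[str, bool]
Step = Tuple[str, Callable[[Context], None]]


def bclconvert(ctx: Context) -> None:
    # Placeholder for demultiplexing; a real run would invoke the actual tool.
    ctx["fastq"] = True


def quality_control(ctx: Context) -> None:
    # Placeholder for adapter trimming and human filtering.
    if not ctx.get("fastq"):
        raise RuntimeError("QC needs the bclconvert outputs")
    ctx["qc_logs"] = True


def seqpro(ctx: Context) -> None:
    # Placeholder: generates the prep file and sequence counts from run outputs.
    if not ctx.get("qc_logs"):
        raise RuntimeError("seqpro needs the QC outputs")
    ctx["prep_file"] = True
    ctx["sequence_counts"] = True


def multiqc(ctx: Context) -> None:
    # Placeholder for the MultiQC report.
    ctx["multiqc_report"] = True


def copy_to_qiita(ctx: Context) -> None:
    # Placeholder: sequence data is copied together with the generated prep.
    if not ctx.get("prep_file"):
        raise RuntimeError("copy needs the generated prep file")
    ctx["copied"] = True


# Assumed order, taken from the steps listed above.
RUN_STEPS: List[Step] = [
    ("bclconvert", bclconvert),
    ("qc", quality_control),
    ("seqpro", seqpro),
    ("multiqc", multiqc),
    ("copy_to_qiita", copy_to_qiita),
]


def run_pipeline(steps: List[Step], ctx: Optional[Context] = None) -> List[str]:
    """Run each step in order and return the names of the completed steps."""
    ctx = {} if ctx is None else ctx
    completed: List[str] = []
    for name, step in steps:
        step(ctx)
        completed.append(name)
    return completed
```

Calling `run_pipeline(RUN_STEPS)` runs the five steps in order, while running `seqpro` on its own raises a `RuntimeError` because the QC outputs are missing.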

Down the road, we should create the preparation based on the sequence data and the prep, but that involves some changes in the klp plugin.

On Sep 24, 2021, at 6:06 AM, Antonio Gonzalez @.***> wrote:

@antgonza approved this pull request.

Looks good, thank you. Some of the code seems similar to what's being done in mg-scripts, right? It basically checks its outputs, so (1) do you think it should live there, or what's the plan to integrate? Also, (2) the other concern is how long running these changes/code will take in a real run, and whether running in a notebook would actually work.


biocore / metagenomics_pooling_notebook

Collect sequence counts during prep file generation #35