Final project submission

@jelmerp - here is my work for the final project submission!

I am pretty happy to have pivoted to using an nf-core pipeline due to its ease of use. One problem I ran into unrelated to the code (pretty sure lol) was with my denoising. Each time it ran (either independently or through ampliseq) I was getting really high numbers of chimeric sequences (>90%!). My thinking is that a different set of primers was used than the ones I am assuming in the pipeline, but I am still waiting to hear back from my collaborators to see if that is the case. Any ideas what I could try looking into otherwise? It feels unrealistic that nearly all my sequences are getting flagged as non-biological even after trimming, so I'm willing to cover any and all bases to be sure.

Regardless, thanks for teaching this class! It has been very informative and empowering :)

-Peter

Presentation

Aspect	Max. points	Your points
Technical content	3	3
Contextualization	3	3
Delivery	3	3
Clarity	3	2
Questions for others	3	1
total	15	12

Great presentation! A couple more questions for others would have been nice.

Final submission

Aspect	Max. points	Your points	Comment
Project organization	4	3	See below.
Project background and documentation	4	3	Comments on next steps after the Ampliseq pipeline would have been good.
Good practices in scripts	4	4
Workflow reproducibility	4	3	You might have included e.g. in the README the command to install Nextflow with Conda.
Slurm jobs at OSC	3	3
Project/coding quality	3	3
Version control	3	3	Nicely done with fine-grained commits.
total	25	22

Nicely done, Peter, and I'm glad to hear this course has been helpful for you! A few comments:

- You might have moved the QIIME2 scripts into a dir called `archive` or so, to make clear they are not part of the current workflow. But it's good that you kept them in the repo, so I had a bit more code to look at.
- File organization:
  - It would have been good to move the Slurm log files out of your main working dir at OSC.
  - You have a `work` dir that's probably from a test/earlier run of the Ampliseq pipeline; it would have been good to remove it.
- Minor, but using backticks (`) for `code formatting` would have been nice in the README.

As for your issue with chimeric sequences: judging from the Ampliseq results overview in your results folder (shown below), it doesn't look like there are chimeras at all (the nonchim number is as high as the number to the left of it, i.e. no ASVs were deemed to be chimeras). BUT you are losing nearly all your reads much earlier, at the DADA2 filtering step. This may have to do with read quality: take a look at the FastQC output and see if your quality scores look poor. Adjusting the DADA2 parameters, especially "maxEE" (max. expected errors per read, which the pipeline should have an option for), may help! Let me know if you need any more help with that moving forward.
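
To make the "maxEE" idea a bit more concrete, here is a minimal Python sketch of how expected errors relate to quality scores. It is only an illustration, not the DADA2 or Ampliseq code: it assumes Phred+33 encoded FASTQ quality strings, the function names `expected_errors` and `passes_max_ee` are made up for this example, and the two quality strings are hypothetical rather than taken from your data. The expected errors of a read are the sum of its per-base error probabilities, 10^(-Q/10), and a read is kept only if that sum does not exceed the threshold.

```python
def expected_errors(quality: str, offset: int = 33) -> float:
    """Sum of per-base error probabilities for one read.

    Assumes Phred+33 encoding: Q = ord(char) - 33, and P(error) = 10^(-Q/10).
    """
    return sum(10 ** (-(ord(ch) - offset) / 10) for ch in quality)


def passes_max_ee(quality: str, max_ee: float) -> bool:
    """Keep a read only if its expected errors do not exceed max_ee."""
    return expected_errors(quality) <= max_ee


# Hypothetical quality strings, not real data:
good = "I" * 250               # Q40 at every base
poor = "I" * 150 + "+" * 100   # Q40 for 150 bases, then a Q10 tail

for name, qual in [("good", good), ("poor", poor)]:
    ee = expected_errors(qual)
    print(f"{name}: expected errors = {ee:.3f}, "
          f"kept at max_ee=2: {passes_max_ee(qual, 2.0)}")
```

A read with a long low-quality tail accumulates expected errors quickly, so if FastQC shows quality dropping toward the read ends, either a more lenient threshold or shorter truncation lengths could plausibly rescue many reads.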

phoward42 / APP-16S-mice-study

Final project submission #2
