Closed bkmgit closed 8 months ago
Hi @bkmgit, apologies for the slow response, and thanks for the suggestion!
Just to clarify, are you referring to the exercise about the total number of sequences or the one about sequences that contain at least 3 consecutive Ns?
In the first case, the script counts the number of lines in the FASTQ file. Since each sequence record spans four lines, the number of sequences is the total line count divided by 4.
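As a rough sketch of how the division could be done in the shell itself: the file name `example.fastq` and its two records are made up here so the snippet runs on its own, and the variable names are only illustrative. It assumes every record is exactly four lines.

```shell
# Hypothetical two-record file, created only so this sketch is self-contained.
cat > example.fastq <<'EOF'
@read1
ACGTACGT
+
IIIIIIII
@read2
ACGTTTGT
+
IIIIIIII
EOF

# Reading from stdin keeps the file name out of wc's output,
# so the result is just the number.
n_lines=$(wc -l < example.fastq)

# Assumption: each record is exactly four lines, so divide by 4.
n_seqs=$(( n_lines / 4 ))
echo "$n_seqs"
```

Shell arithmetic expansion, `$(( ... ))`, does integer division, which is enough here because the line count of a well-formed file of this kind is a multiple of 4.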
In the second case, the grep command searches for lines containing the "NNN" pattern and writes them to the bad_reads.txt file. The same command and logic as in the previous exercise are then used to count the sequences with this "NNN" pattern.
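A possible sketch of the second case, again on a made-up file (`example.fastq`, three records, two of them containing "NNN"). The `-B1 -A2` options are an assumption on my part: they pull in the line before and the two lines after each match so that whole four-line records end up in bad_reads.txt. With those options grep prints a `--` line between non-adjacent groups, so the sketch filters those out before counting.

```shell
# Hypothetical three-record file; read1 and read3 contain "NNN".
cat > example.fastq <<'EOF'
@read1
ACGNNNGT
+
IIIIIIII
@read2
ACGTACGT
+
IIIIIIII
@read3
NNNTACGT
+
IIIIIIII
EOF

# Copy each matching line plus 1 line before and 2 lines after it,
# then drop the "--" group separators that grep inserts between groups.
grep -B1 -A2 NNN example.fastq | grep -v '^--$' > bad_reads.txt

# Same logic as before: four lines per record.
n_bad_lines=$(wc -l < bad_reads.txt)
n_bad=$(( n_bad_lines / 4 ))
echo "$n_bad"
```

Without the `grep -v '^--$'` step the separator lines would be counted too, and the line total would no longer be a multiple of 4.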
This information is included in the solutions provided. What would you expect as a demonstration?
The exercise https://datacarpentry.org/shell-genomics/instructor/04-redirection.html#exercise-1 does not complete the calculation. Demonstrate how to do this.