ababaian / serratus

Ultra-deep search for novel viruses
http://serratus.io
GNU General Public License v3.0

Serratus-Lite summarizer script #278

Status: Closed (closed by TWilson1012 1 month ago)

TWilson1012 commented 1 month ago

I have a quick question about the summary report that Serratus-Lite produces for the nucleotide alignment. When I run the provided code, changing only the file paths and the input FASTQ file, the summary file contains most of the summary statistics but is missing two things: the FASTA hits section at the end, and the SUMZER_COMMENT line at the start, even though I specify it (see the attached screenshot of a summary report).

To check whether the issue was related to my input files, I also ran the summarizer on the ERR2756788.bam file, and the summary file has the same problems (i.e. it lacks the FASTA section and the comment section, unlike the summary report in the example).

Code Example:

SUMZER_COMMENT=$(echo sra="na",genome="cov3ma",version=200818,date=$(date +%y%m%d-%R))
summarizer="python3 /home/tess/Serratus/serratus_summarizer.py /dev/stdin /home/tess/Serratus/cov3ma.sumzer.tsv ERR.summary /dev/null"
samtools view ERR2756788.bam | $summarizer

I am wondering whether this is an issue with my code, or whether the FASTA portion of the report is only produced when SRA files are used as input?

Thank you for the help!

[Screenshot: summary report, 2024-05-14 at 8:43:35 AM]

ababaian commented 1 month ago

Ah, are you referring to this documented section?

I think we removed that, as it was supposed to be a "peek" into the data, which became obsolete once we started storing .bam files. I would not recommend using this feature. Instead, retain the .bam files and open them in IGV; this is by all accounts a superior means of inspecting FASTA reads and alignments. I'll update the documentation to remove that FASTA section. Sorry about the confusion.

TWilson1012 commented 1 month ago

Yes, I was referring to that section! Thank you for your help!
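
Note on the missing comment line: the thread does not resolve why the SUMZER_COMMENT header was absent. One possible cause, offered here as an assumption and not a confirmed diagnosis, is that the variable was assigned but never exported. A shell variable set with a plain assignment is not passed on to child processes, so if the summarizer script reads SUMZER_COMMENT from its environment (an assumption about the script), it would not see the value. The sketch below shows the general shell behavior, using `sh -c` as a hypothetical stand-in for the Python summarizer:

```shell
# Stand-in for the summarizer: a child process that prints SUMZER_COMMENT
# if it is present in its environment, or "unset" otherwise.
# (Assumption: the real script reads this variable from the environment.)
child='echo "${SUMZER_COMMENT:-unset}"'

# Plain assignment: the variable exists only in the current shell.
unset SUMZER_COMMENT
SUMZER_COMMENT="sra=na,genome=cov3ma"
without_export=$(sh -c "$child")

# After export, child processes inherit the variable.
export SUMZER_COMMENT
with_export=$(sh -c "$child")

echo "without export: $without_export"
echo "with export:    $with_export"
```

If this is the cause, writing `export SUMZER_COMMENT=...` in the original commands would be the corresponding change.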