esteinig / nanoq

Minimal but speedy quality control for nanopore reads in Rust :bear:
MIT License

Send summary stats to stdout #31

Closed by mbhall88 2 years ago

mbhall88 commented 2 years ago

When using the -s option to get the report, I think it would be better to have the output go to stdout.

That way the logging can be separated from the summary.

esteinig commented 2 years ago

Hey yep agree, I had originally kept it on stderr to be consistent with filters (when getting a summary report of the filtered data) but given that there is no output with -s that makes more sense. Will fix it up for the next minor release - do you need it soon?

mbhall88 commented 2 years ago

No rush

kaiseriskera commented 2 years ago

Hi, I would love to have this feature too. I'm running NanoQ as part of a Snakemake pipeline that requires the fastq.gz reads as output for the next rule, and I would also love it if the summary stats could be written to a separate output file.

esteinig commented 2 years ago

@kaiseriskera at the moment you can also redirect stderr to write the stats to file (bash):

nanoq -f -s -i test.fq 2> report.txt

or, combined with read output (no -s):

nanoq -f -l 1000 -i test.fq 1> filter.fq 2> report.txt

Did you mean a specific command-line option for the report output file otherwise? Shouldn't be too complex.

esteinig commented 2 years ago

Note that the stderr redirect in the first example would change in this feature to:

nanoq -f -s -i test.fq 1> report.txt

(bash)
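
The redirects above work because the shell treats stdout (file descriptor 1) and stderr (file descriptor 2) as independent streams. A minimal sketch of that separation, using a hypothetical stand-in function `emit_both` instead of nanoq itself so that it runs without the tool installed; the read and the report line are made-up placeholders:

```shell
#!/usr/bin/env bash
# emit_both is a hypothetical stand-in, not part of nanoq: it writes
# placeholder "reads" to stdout and a placeholder "report" to stderr.
emit_both() {
    printf '@read1\nACGT\n+\nIIII\n'      # reads go to stdout (fd 1)
    printf 'reads: 1 bases: 4\n' >&2      # report goes to stderr (fd 2)
}

workdir=$(mktemp -d)

# 1> captures only stdout, 2> captures only stderr
emit_both 1> "$workdir/reads.fq" 2> "$workdir/report.txt"

# The report file holds the stderr line; the reads file holds the FASTQ record
cat "$workdir/report.txt"
```

Swapping `2>` for `1>` only makes sense once the tool itself writes the report to stdout, which is the change discussed in this issue.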

esteinig commented 2 years ago

I think the explicit report file option is not a bad idea since file descriptors may not be very intuitive.

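Alternative 1 (current):

  • -s report output to stderr (nanoq -s 2> report.txt)
  • reads and report to stdout and stderr (nanoq -f 1> reads.fq 2> report.txt)

Alternative 2:

  • -s report output only to stdout (nanoq -s > report.txt)
  • reads and report to stdout and stderr (nanoq -f 1> reads.fq 2> report.txt)

Alternative 3:

  • -s report output only to stdout (nanoq -s > report.txt)
  • reads to stdout (nanoq -f > reads.fq)
  • report when reads output (no -s) to explicit file (nanoq -f --report report.txt > reads.fq)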

I'm thinking three might be the most intuitive as it reports everything to stdout unless specified explicitly by file.
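
To make the proposed Alternative 3 behaviour concrete (report on stdout with -s, reads on stdout otherwise, and the report written to a file only when one is named explicitly), here is a rough mock. `mock_qc` is a hypothetical shell function, not nanoq, and its option handling is only an assumption about how the proposal might behave; the read and report contents are placeholders:

```shell
#!/usr/bin/env bash
# mock_qc is a hypothetical stand-in that mimics the proposed behaviour;
# it is not nanoq and does not read any input file.
mock_qc() {
    local report_file=""
    local summary_only=0
    while [ $# -gt 0 ]; do
        case "$1" in
            -s) summary_only=1 ;;
            --report) report_file="$2"; shift ;;
        esac
        shift
    done

    local report='reads: 1 bases: 4'
    if [ "$summary_only" -eq 1 ]; then
        printf '%s\n' "$report"                    # -s: report only, on stdout
        return
    fi

    printf '@read1\nACGT\n+\nIIII\n'               # reads always on stdout
    if [ -n "$report_file" ]; then
        printf '%s\n' "$report" > "$report_file"   # report to the named file
    fi
}

outdir=$(mktemp -d)
mock_qc -s > "$outdir/summary.txt"                          # summary via stdout
mock_qc --report "$outdir/report.txt" > "$outdir/reads.fq"  # reads + report file
```

With this layout nothing needs a `2>` redirect, which leaves stderr free for logging.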

kaiseriskera commented 2 years ago

Hi Eike

I'm interested in having the report.txt and the fastq.gz reads both written out as two separate files. This might not be the best example, but NanoPlot, for instance, processes read.bam to produce several separate files, which I can specify in my Snakemake output like this:

rule NanoPlot:
    input: "read.bam"
    output: "read_NanoPlot-report.html", "read_NanoStats.txt"

So in the case of NanoQ, it would be something like this:

rule NanoQ:
    input: "unfiltered_read.gz"
    output: "filtered_read.gz", "filtered_read-report.txt"

I'm not sure if this is what you mean with the implementation of alternative 3, but I hope I make sense

Best Kaiser

On Fri, Apr 29, 2022 at 11:38 PM Eike Steinig @.***> wrote:

I think the explicit report file option is not a bad idea since file descriptors may not be very intuitive.

Alternative 1 (current):

  • -s report output to stderr (nanoq -s 2> report.txt)
  • reads and report to stdout and stderr (nanoq -f 1> reads.fq 2> report.txt)

Alternative 2:

  • -s report output only to stdout (nanoq -s > report.txt)
  • reads and report to stdout and stderr (nanoq -f 1> reads.fq 2> report.txt)

Alternative 3:

  • -s report output only to stdout (nanoq -s > report.txt)
  • reads to stdout (nanoq -f > reads.fq)
  • report when reads output (no -s) to explicit file (nanoq -f --report report.txt > reads.fq)

I'm thinking three might be the most intuitive as it reports everything to stdout unless specified explicitly by file.


esteinig commented 2 years ago

Gotcha, thanks for clarifying this! I will implement the output changes for the next minor version. In the meantime, feel free to use file descriptors for redirecting the output of stdout (reads when filtering) or stderr (summary when using filters or -s option) to separate files that can then be used as outputs in your Snakemake rule:

nanoq -f -l 1000 -O g -i zymo.fq 1> reads.fq.gz 2> report.txt

esteinig commented 2 years ago

@mbhall88 @kaiseriskera have a look at this pull request: #32

Michael, what do you think, should the standard error report in the filtering output be removed and replaced entirely by the explicit report file output? I think that might be clearer?