wdecoster / NanoPlot

Plotting scripts for long read sequencing data
http://nanoplot.bioinf.be
MIT License

Summary Statistics in HTML Report #384

Open yoshinak1 opened 2 weeks ago

yoshinak1 commented 2 weeks ago

I have encountered a potential issue in the HTML report generated by NanoPlot when using the --barcoded option. Normally, when data is filtered, the report shows both "Summary statistics prior filtering" and "Summary statistics after filtering", and the two sections display different values as expected. However, when I run the same analysis with the --barcoded option enabled, both sections show identical values, and both seem to reflect the values after filtering.

Could you please confirm if this is the intended behavior, or if it might be an issue? I would appreciate any clarification or advice you can provide.

Thank you for your time and support.

Best regards,

wdecoster commented 2 weeks ago

Hi, thanks for your question. I will look into this, but I am very busy at the moment so this might take a while.

Best, Wouter

yoshinak1 commented 2 weeks ago

Hi Wouter,

Thanks for your reply, I know you're busy. This might not be entirely correct, but I noticed something that could be an issue in your code, so I wanted to point it out just in case.

In the make_stats function, while the filtered data is saved with a suffix (e.g., suffix="_post_filtering"), for barcode-specific data, the suffix is ignored, and the file name is always "NanoStats_barcoded.txt". As a result, the pre-filtered data is not saved correctly and gets overwritten by the post-filtered data.

Specific Problem Location:

    def make_stats() ...
        if settings["barcoded"]:
            barcodes = list(datadf["barcode"].unique())
            statsfile = settings["path"] + "NanoStats_barcoded.txt"  # suffix is ignored here
            stats_df = nanomath.write_stats(
                datadfs=[datadf[datadf["barcode"] == b] for b in barcodes],
                outputfile=statsfile,
                names=barcodes,
                as_tsv=tsv_stats,
            )

Here, the file name for statsfile does not include the suffix (such as suffix="_post_filtering"), so the pre-filtered and post-filtered data are not saved separately and are overwritten in the same file.
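To illustrate the overwrite, and one possible way to avoid it, here is a minimal standalone sketch. It does not use NanoPlot or nanomath: `write_stats_stub` is a stand-in for the real stats writer, `barcoded_stats_path` is a hypothetical helper, and the position of the suffix in the file name is only an assumption, not necessarily how the actual code names the file.

```python
import tempfile
from pathlib import Path


def barcoded_stats_path(out_prefix: str, suffix: str = "") -> str:
    """Hypothetical helper: build the barcoded stats file name with the suffix included.

    Where the suffix goes in the name is an assumption made for this sketch.
    """
    return f"{out_prefix}NanoStats{suffix}_barcoded.txt"


def write_stats_stub(outputfile: str, label: str) -> None:
    """Stand-in for the real stats writer; mode "w" replaces any existing file."""
    with open(outputfile, "w") as handle:
        handle.write(label + "\n")


def demo() -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        prefix = str(Path(tmp)) + "/"

        # Behaviour described above: both calls use the same fixed file name,
        # so the second write replaces the first.
        fixed_name = prefix + "NanoStats_barcoded.txt"
        write_stats_stub(fixed_name, "before filtering")
        write_stats_stub(fixed_name, "after filtering")
        overwritten = Path(fixed_name).read_text().strip()

        # Sketch of a fix: include the suffix so each call gets its own file.
        pre = barcoded_stats_path(prefix)
        post = barcoded_stats_path(prefix, suffix="_post_filtering")
        write_stats_stub(pre, "before filtering")
        write_stats_stub(post, "after filtering")

        return {
            "overwritten": overwritten,
            "pre": Path(pre).read_text().strip(),
            "post": Path(post).read_text().strip(),
        }


if __name__ == "__main__":
    print(demo())
```

With the fixed name, only the post-filtering content survives. With the suffix in the name, both files keep their own content.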

Best,

wdecoster commented 2 weeks ago

Good catch! I can fix it later, but feel free to open a pull request if you want. Thanks for finding the problem :-)

wdecoster commented 1 week ago

This should be fixed in v1.41.1 :-)