Closed AroArz closed 3 years ago
Related to this topic as well, I noticed that scripts/processing_summary.py defines after_host_removal by the number of classified reads in the kraken2 log file. Unless I'm mistaken, it should be defined by the number of unclassified reads?
    def parse_kraken2_logs(logfiles):
        for logfile in logfiles:
            with open(logfile) as f:
                sample_name = Path(logfile).stem.split(".")[0]
                for line in f:
                    if " classified" in line:
                        yield {"Sample": sample_name, "after_host_removal": int(line.strip().split()[0])}
Example of log file:

    Loading database information... done.
    50376689 sequences (12703.60 Mbp) processed in 371.331s (8139.9 Kseq/m, 2052.66 Mbp/m).
    21664417 sequences classified (43.00%)
    28712272 sequences unclassified (57.00%)
    output_dir/host_removal/sample_host_1.fq to output_dir/host_removal/sample_host_1.fq.gz
    output_dir/host_removal/sample_host_2.fq to output_dir/host_removal/sample_host_2.fq.gz
    output_dir/host_removal/sample_1.fq to output_dir/host_removal/sample_1.fq.gz
    output_dir/host_removal/sample_2.fq to output_dir/host_removal/sample_2.fq.gz
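
For reference, a minimal sketch of the change suggested above. It assumes that the reads remaining after host removal are the ones Kraken2 reports as unclassified against the host database, and that the log format matches the example shown here. It is not the repository's actual fix, only an illustration of matching the other log line.

```python
from pathlib import Path


def parse_kraken2_logs(logfiles):
    """Yield the number of unclassified (assumed non-host) reads per log file."""
    for logfile in logfiles:
        # Sample name is everything before the first dot in the file name.
        sample_name = Path(logfile).stem.split(".")[0]
        with open(logfile) as f:
            for line in f:
                # Matches "N sequences unclassified (P%)". The classified line
                # does not contain the substring " unclassified".
                if " unclassified" in line:
                    yield {
                        "Sample": sample_name,
                        "after_host_removal": int(line.strip().split()[0]),
                    }
```

With the example log above saved under a hypothetical name such as sample.kraken2.log, this would report 28712272 for after_host_removal instead of 21664417.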
fix
Updated pandas to 1.2.1
So the issue first reported in this thread was solved entirely by updating pandas? Odd!
Is this issue resolved by your update of pandas now then? If so, please close this issue @AroArz
branch
158-biobakery-update was used; however, these sections were unmodified, so the problem is likely present in the master branch as well.
config
shell
rules
The aforementioned error was produced by preprocessing_summary; a similar error came from plot_proportion_host.
log
log files were empty
command line
Executed with both --use-conda and --use-singularity. Correct image was pulled.
fix
Updated pandas to 1.2.1