jodjo86 closed this issue 1 year ago
I updated to 2.4 and I get a similar error.
toulligqc --report-name QC_duplex \
--barcoding \
--telemetry-source duplex/sequencing_telemetry.js \
--sequencing-summary-source duplex/sequencing_summary.txt \
--html-report-path duplex/QC_duplex.html \
--barcodes barcode01
duplex/QC_duplex.html
ToulligQC version 2.4
* Initialize extractors
* Start Toulligqc info extractor
* End of Toulligqc info extractor (done in 0m0.00s)
* Start Sequencing telemetry extractor
* End of Sequencing telemetry extractor (done in 0m0.00s)
* Start Basecaller sequencing summary extractor
- Load sequencing summary file (0.03 MB used) in 0m0.05s
Traceback (most recent call last):
File "/home/minion/miniconda3/envs/nano/bin/toulligqc", line 10, in <module>
sys.exit(main())
File "/home/minion/miniconda3/envs/nano/lib/python3.10/site-packages/toulligqc/toulligqc.py", line 348, in main
extractor.extract(result_dict)
File "/home/minion/miniconda3/envs/nano/lib/python3.10/site-packages/toulligqc/sequencing_summary_extractor.py", line 234, in extract
extract_barcode_info(self, result_dict,
File "/home/minion/miniconda3/envs/nano/lib/python3.10/site-packages/toulligqc/sequencing_summary_common.py", line 156, in extract_barcode_info
dataframe_dict["base.fail.barcoded"] = _barcode_bases(extractor, barcode_selection, result_dict,
File "/home/minion/miniconda3/envs/nano/lib/python3.10/site-packages/toulligqc/sequencing_summary_common.py", line 347, in _barcode_bases
set_result_value(extractor, result_dict, entry + '.count', sum(count_sorted.drop("unclassified")))
File "/home/minion/miniconda3/envs/nano/lib/python3.10/site-packages/pandas/util/_decorators.py", line 311, in wrapper
return func(*args, **kwargs)
File "/home/minion/miniconda3/envs/nano/lib/python3.10/site-packages/pandas/core/series.py", line 4771, in drop
return super().drop(
File "/home/minion/miniconda3/envs/nano/lib/python3.10/site-packages/pandas/core/generic.py", line 4267, in drop
obj = obj._drop_axis(labels, axis, level=level, errors=errors)
File "/home/minion/miniconda3/envs/nano/lib/python3.10/site-packages/pandas/core/generic.py", line 4311, in _drop_axis
new_axis = axis.drop(labels, errors=errors)
File "/home/minion/miniconda3/envs/nano/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 6644, in drop
raise KeyError(f"{list(labels[mask])} not found in axis")
KeyError: "['unclassified'] not found in axis"
Dear @jodjo86,
Thank you for using ToulligQC and for reporting this issue. ToulligQC does not yet support duplex analysis. We plan to address this in the coming months!
However, this error seems to come from the barcode section rather than from the duplex data itself. Could you please send me your Guppy output files so that I can reproduce the bug?
Regarding your request, I will add the option to specify a barcode range in the next version. Thank you for the suggestion!
Best regards, Ali
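For readers who hit the same KeyError: the traceback above ends in count_sorted.drop("unclassified"), which raises when no read in the summary is labelled "unclassified". The snippet below is only a minimal sketch of one way to tolerate that case with pandas; the function name classified_total and the sample counts are hypothetical, and this is not necessarily the fix that went into ToulligQC.

```python
import pandas as pd


def classified_total(counts: pd.Series) -> int:
    """Sum per-barcode counts, ignoring an optional 'unclassified' row.

    Hypothetical helper for illustration; not part of ToulligQC.
    """
    # errors="ignore" makes drop() a no-op when the label is absent,
    # instead of raising KeyError as in the traceback above.
    return int(counts.drop("unclassified", errors="ignore").sum())


# Made-up counts: one run without and one with unclassified reads.
only_barcoded = pd.Series({"barcode01": 1200, "barcode02": 800})
with_unclassified = pd.Series({"barcode01": 1200, "unclassified": 50})

print(classified_total(only_barcoded))
print(classified_total(with_unclassified))
```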
Hi @alihamraoui,
Thank you for the quick reply and for considering my request. Here are the Guppy output files for my duplex test run with one barcode: guppy_log.zip
I hope this will be useful to you. Thanks,
Joel
Hi @jodjo86,
I think I've fixed this issue. Could you please clone the repository again and try it with your real data?
I have also added the option to specify a range of barcodes. You can use: --barcodes barcode01:barcode48
Hope this works!
best, Ali
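To illustrate what a range such as barcode01:barcode48 stands for, here is a rough sketch of how such a specification could be expanded into individual barcode names. The function expand_barcodes is hypothetical and written only for this example; how ToulligQC actually parses the --barcodes argument may differ.

```python
import re


def expand_barcodes(spec: str) -> list[str]:
    """Expand e.g. 'barcode01:barcode03,barcode07' into barcode names.

    Hypothetical helper for illustration; not ToulligQC's own parser.
    """
    names = []
    for part in spec.split(","):
        part = part.strip()
        if ":" not in part:
            # A single barcode name is kept as-is.
            names.append(part)
            continue
        start, end = part.split(":", 1)
        m_start = re.fullmatch(r"(\D*)(\d+)", start)
        m_end = re.fullmatch(r"(\D*)(\d+)", end)
        if not m_start or not m_end or m_start.group(1) != m_end.group(1):
            raise ValueError(f"invalid barcode range: {part!r}")
        first, last = int(m_start.group(2)), int(m_end.group(2))
        if first > last:
            raise ValueError(f"descending barcode range: {part!r}")
        # Keep the zero padding used by the first bound (e.g. '01').
        width = len(m_start.group(2))
        prefix = m_start.group(1)
        names.extend(f"{prefix}{i:0{width}d}" for i in range(first, last + 1))
    return names


print(expand_barcodes("barcode01:barcode03"))
```

With this sketch, "barcode01:barcode48" yields the same 48 names as the long comma-separated form.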
Hi @alihamraoui ,
Thanks for adding the option to specify a range of barcodes. It seems to work well. If there is a problem, I will open a separate issue to simplify follow-up.
I still have an error message for the duplex report :(
thanks, Joel
toulligqc --report-name QC_duplex \
--barcoding \
--telemetry-source duplex/sequencing_telemetry.js \
--sequencing-summary-source duplex/sequencing_summary.txt \
--html-report-path duplex/QC_duplex.html \
--barcodes barcode01
duplex/QC_duplex.html
ToulligQC version 2.4
* Initialize extractors
* Start Toulligqc info extractor
* End of Toulligqc info extractor (done in 0m0.00s)
* Start Sequencing telemetry extractor
* End of Sequencing telemetry extractor (done in 0m0.00s)
* Start Basecaller sequencing summary extractor
- Load sequencing summary file (0.03 MB used) in 0m0.01s
- Extract info from sequencing summary file in 0m0.05s
- Creation of image "Read count histogram" in 0m0.13s
- Creation of image "Distribution of read lengths" in 0m0.10s
Traceback (most recent call last):
File "/home/minion/miniconda3/envs/trim/bin/toulligqc", line 33, in <module>
sys.exit(load_entry_point('toulligqc==2.4', 'console_scripts', 'toulligqc')())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/minion/miniconda3/envs/trim/lib/python3.11/site-packages/toulligqc-2.4-py3.11.egg/toulligqc/toulligqc.py", line 388, in main
graphs.extend(extractor.graph_generation(result_dict))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/minion/miniconda3/envs/trim/lib/python3.11/site-packages/toulligqc-2.4-py3.11.egg/toulligqc/sequencing_summary_extractor.py", line 279, in graph_generation
add_image_to_result(self.quiet, images, time.time(), pgg.yield_plot(self.dataframe_1d, self.images_directory))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/minion/miniconda3/envs/trim/lib/python3.11/site-packages/toulligqc-2.4-py3.11.egg/toulligqc/plotly_graph_generator.py", line 204, in yield_plot
count_x, count_y, cum_count_y = _smooth_data(npoints=npoints, sigma=sigma,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/minion/miniconda3/envs/trim/lib/python3.11/site-packages/toulligqc-2.4-py3.11.egg/toulligqc/plotly_graph_common.py", line 205, in _smooth_data
min_arg = np.nanmin(data)
^^^^^^^^^^^^^^^
File "<__array_function__ internals>", line 180, in nanmin
File "/home/minion/miniconda3/envs/trim/lib/python3.11/site-packages/numpy/lib/nanfunctions.py", line 350, in nanmin
res = np.amin(a, axis=axis, out=out, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<__array_function__ internals>", line 180, in amin
File "/home/minion/miniconda3/envs/trim/lib/python3.11/site-packages/numpy/core/fromnumeric.py", line 2918, in amin
return _wrapreduction(a, np.minimum, 'min', axis, None, out,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/minion/miniconda3/envs/trim/lib/python3.11/site-packages/numpy/core/fromnumeric.py", line 86, in _wrapreduction
return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: zero-size array to reduction operation minimum which has no identity
Hi @jodjo86,
You're getting this error because this small sequencing summary example contains only reads that passed the filter (I should fix this on my side as well).
For now, you also need to provide the sequencing summary of the failed reads:
--sequencing-summary-source duplex/sequencing_summary_pass.txt \
--sequencing-summary-source duplex/sequencing_summary_fail.txt
I assume that this example is just a subset of your entire sequencing summary?
It will work if you provide the sequencing summary for your entire dataset.
Best regards, Ali
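For context, the ValueError in the traceback is what np.nanmin raises when it is given an empty array, which appears to be what happens when one category of reads has no rows. The snippet below is a small sketch of a guard against that case; safe_nanmin and its default argument are hypothetical names for this example and do not describe ToulligQC's code.

```python
import numpy as np


def safe_nanmin(data, default=None):
    """Return the NaN-ignoring minimum, or `default` for empty input.

    Hypothetical helper for illustration; not part of ToulligQC.
    """
    arr = np.asarray(data, dtype=float)
    if arr.size == 0:
        # np.nanmin would raise ValueError on a zero-size array,
        # so the caller can skip the plot or show a placeholder.
        return default
    return float(np.nanmin(arr))


print(safe_nanmin([3.0, np.nan, 1.5]))
print(safe_nanmin([]))
```

A caller could use the returned default to skip the corresponding plot when no reads are available for it.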
Everything works well. THANKS
Cool.
I'm glad this helped. I will make sure that in future versions you can run ToulligQC with only the reads that passed the filter.
best, Ali
Hi, I am trying to create a report for a duplex analysis (guppy_basecaller_duplex) and I got this error message. I can send you the two Guppy basecaller output files if necessary.
P.S. I also have a request. Would it be possible to specify a barcode range for the --barcodes argument? I use a lot of barcodes and the command quickly becomes very long:
--barcodes barcode01,barcode02, ... barcode48