Closed mattloose closed 2 months ago
In addition this request needs to know if it should send read_id or read name - it has not yet been updated to use the read name.
This should close #347 - which will start to appear more as more people update.
I think we should merge #324 before we merge this.
I also think that making this a new plugin is probably the right way to go! It's going to be heavy duplication of the guppy.py plugin - but I suggest we add a deprecation warning when running the guppy
plugin - and state in that we will not support that plugin moving forwards, and it is provided as is, with support for dorado only moving forwards.
@mattloose could we take the sample_rate
from the caller config?
Uhoh spaggetios
==================================================================================== test session starts ====================================================================================
platform linux -- Python 3.11.0, pytest-7.4.3, pluggy-1.3.0
rootdir: /home/adoni5/Projects/readfish
configfile: pyproject.toml
testpaths: tests/*test.py, src/readfish
collected 65 items / 1 error
========================================================================================== ERRORS ===========================================================================================
______________________________________________________________________ ERROR collecting src/readfish/plugins/guppy.py _______________________________________________________________________
../../miniforge3/envs/readfish/lib/python3.11/site-packages/_pytest/runner.py:341: in from_call
result: Optional[TResult] = func()
../../miniforge3/envs/readfish/lib/python3.11/site-packages/_pytest/runner.py:372: in <lambda>
call = CallInfo.from_call(lambda: list(collector.collect()), "collect")
../../miniforge3/envs/readfish/lib/python3.11/site-packages/_pytest/doctest.py:567: in collect
module = import_path(
../../miniforge3/envs/readfish/lib/python3.11/site-packages/_pytest/pathlib.py:567: in import_path
importlib.import_module(module_name)
../../miniforge3/envs/readfish/lib/python3.11/importlib/__init__.py:126: in import_module
return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1206: in _gcd_import
???
<frozen importlib._bootstrap>:1178: in _find_and_load
???
<frozen importlib._bootstrap>:1149: in _find_and_load_unlocked
???
<frozen importlib._bootstrap>:690: in _load_unlocked
???
<frozen importlib._bootstrap_external>:940: in exec_module
???
<frozen importlib._bootstrap>:241: in _call_with_frames_removed
???
src/readfish/plugins/guppy.py:16: in <module>
from pyguppy_client_lib.helper_functions import package_read
../../miniforge3/envs/readfish/lib/python3.11/site-packages/pyguppy_client_lib/helper_functions.py:17: in <module>
from pyguppy_client_lib.client_lib import GuppyClient
E ImportError: NumPy: dtype is already registered
================================================================================== short test summary info ==================================================================================
ERROR src/readfish/plugins/guppy.py - ImportError: NumPy: dtype is already registered
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
===================================================================================== 1 error in 0.35s ======================================================================================
Seems installing ont-pybasecall-client-lib 7.3.10
alongside ont-pyguppy-client-lib
might not work
Yes - we need some kind of conditional install? IT's a bit of a pain this for sure.
Needs the documentation updating - this is confirmed working with latest Minknow /Dorado 7.3 plus, and should work with <= 7.2.
I've bumped the version to 2024.0.0
, should maybe be 2024.1.0
I've moved all tests to test dorado
plugin, and added the new ont-pybasecall-client-lib
to pyproject.toml
Excluded some plugins/files we can't really test from coverage reporting
Final steps:
package_reads
accordinglyguppy.py
Updated to pull sample rate from MinKNOW. But also have found an issue with dorado returning two reads with the same id in the same batch.
Create a new
dorado
plugin to handle the latestdorado_basecall_server
version (7.3.9
), which only accepts connections fromont-pybasecall-client-lib
, deprecatingont-pyguppy-client-lib
.This pull request is to trigger discussion about the best way to switch to the new ONT pybasecaller client.
Rather than introduce this in the guppy plugin I have created a distinct dorado plugin.
This is due to changes in how the reads need to be packaged for the client, which expects some surprising data (e.g sample rate). Given the switch from guppy to dorado at ONT it may make sense in the future for people to be loading a dorado plugin instead of a guppy plugin. Within a few months many users may have forgotten about (or never known about) guppy in the first place.
Related to the above, we need to find a way of providing the sample rate to the basecall client. Currently the base caller has no way of accessing this?
Any suggestions?
Until this is resolved the PR is not fit for merging.