LooseLab / readfish

CLI tool for flexible and fast adaptive sampling on ONT sequencers
https://looselab.github.io/readfish/
GNU General Public License v3.0

Readfish targets TypeError #179

Closed mahreenkn closed 1 year ago

mahreenkn commented 2 years ago

Hi all,

I'm looking forward to running readfish on some samples, and have first been testing it with the bulk playback fast5 file provided in the tutorial. I've got Guppy GPU up and running well (a previous issue I was stuck on), but I'm getting errors when I run the readfish targets command.

I installed readfish from the dev branch and have tried running it in both a virtual environment and conda (with both Python 3.7 and 3.8). This is the error I get (using conda, Python 3.8):

readfish targets --device MN28703 --experiment-name "RU Test" --toml human_chr_selection.toml --log-file RU_test_09-03-2022
2022-03-09 12:18:45,668 ru.ru_gen /home/user/miniconda3/envs/readfish/bin/readfish targets --device MN28703 --experiment-name RU Test --toml human_chr_selection.toml --log-file RU_test_09-03-2022
2022-03-09 12:18:45,669 ru.ru_gen batch_size=512
2022-03-09 12:18:45,669 ru.ru_gen cache_size=512
2022-03-09 12:18:45,669 ru.ru_gen channels=[1, 512]
2022-03-09 12:18:45,669 ru.ru_gen chunk_log=None
2022-03-09 12:18:45,669 ru.ru_gen command=targets
2022-03-09 12:18:45,669 ru.ru_gen device=MN28703
2022-03-09 12:18:45,669 ru.ru_gen dry_run=False
2022-03-09 12:18:45,669 ru.ru_gen experiment_name=RU Test
2022-03-09 12:18:45,669 ru.ru_gen func=<function run at 0x7f5504d60af0>
2022-03-09 12:18:45,669 ru.ru_gen host=127.0.0.1
2022-03-09 12:18:45,669 ru.ru_gen log_file=RU_test_09-03-2022
2022-03-09 12:18:45,669 ru.ru_gen log_format=%(asctime)s %(name)s %(message)s
2022-03-09 12:18:45,669 ru.ru_gen log_level=info
2022-03-09 12:18:45,669 ru.ru_gen paf_log=None
2022-03-09 12:18:45,669 ru.ru_gen port=9501
2022-03-09 12:18:45,669 ru.ru_gen run_time=172800
2022-03-09 12:18:45,669 ru.ru_gen throttle=0.4
2022-03-09 12:18:45,669 ru.ru_gen toml=human_chr_selection.toml
2022-03-09 12:18:45,669 ru.ru_gen unblock_duration=0.1
2022-03-09 12:18:45,669 ru.ru_gen workers=1
2022-03-09 12:18:45,672 ru.ru_gen Initialising minimap2 mapper
2022-03-09 12:18:51,078 ru.ru_gen Mapper initialised
2022-03-09 12:18:51,097 ru.ru_gen This experiment has 1 region on the flowcell
2022-03-09 12:18:51,098 ru.ru_gen Using reference: /home/user/hg38 reference/Homo_sapiens_assembly38.fasta.mmi
2022-03-09 12:18:59,196 ru.ru_gen Region 'select_chr_21_22' (control=False) has 2 contigs of which 2 are in the reference. There are 4 targets (including +/- strand) representing 3.03% of the reference. Reads will be unblocked when classed as single_off or multi_off; sequenced when classed as single_on or multi_on; and polled for more data when classed as no_map or no_seq.
Traceback (most recent call last):
  File "/home/user/miniconda3/envs/readfish/bin/readfish", line 8, in <module>
    sys.exit(main())
  File "/home/user/miniconda3/envs/readfish/lib/python3.8/site-packages/ru/cli.py", line 43, in main
    args.func(parser, args)
  File "/home/user/miniconda3/envs/readfish/lib/python3.8/site-packages/ru/ru_gen.py", line 498, in run
    simple_analysis(
  File "/home/user/miniconda3/envs/readfish/lib/python3.8/site-packages/ru/ru_gen.py", line 248, in simple_analysis
    for read_info, read_id, seq_len, results in mapper.map_reads_2(
  File "/home/user/miniconda3/envs/readfish/lib/python3.8/site-packages/ru/basecall.py", line 193, in map_reads_2
    for read_info, read_id, seq, seq_len, quality in calls:
  File "/home/user/miniconda3/envs/readfish/lib/python3.8/site-packages/ru/basecall.py", line 149, in basecall_minknow
    for read_info, data in self._basecall(*args, **kwargs):
  File "/home/user/miniconda3/envs/readfish/lib/python3.8/site-packages/ru/basecall.py", line 109, in _basecall
    r_id = r["metadata"]["read_id"]
TypeError: list indices must be integers or slices, not str

Any suggestions on how to resolve this?

Thanks so much!

Mahreen

mattloose commented 2 years ago

This is one for @alexomics

I think you need to try the https://github.com/looselab/readfish/tree/guppy_6 branch.

merfre commented 2 years ago

Hello,

I am having a very similar issue. I tried both the dev branch and the guppy_6 branch, and I also get a TypeError when following the playback run instructions (the output copied below is from the guppy_6 branch). I tried the same setups described in the first comment, but still get a TypeError.

readfish unblock-all --device MN17166 --experiment-name "Testing ReadFish Unblock All"
INFO 2022-03-31 14:49:36,090 Manager /home/merfre/readfishgup6/bin/readfish unblock-all --device MN17166 --experiment-name Testing ReadFish Unblock All
2022-03-31 14:49:36,090 Manager /home/merfre/readfishgup6/bin/readfish unblock-all --device MN17166 --experiment-name Testing ReadFish Unblock All
INFO 2022-03-31 14:49:36,090 Manager batch_size=512
2022-03-31 14:49:36,090 Manager batch_size=512
INFO 2022-03-31 14:49:36,090 Manager cache_size=512
2022-03-31 14:49:36,090 Manager cache_size=512
INFO 2022-03-31 14:49:36,090 Manager channels=[1, 512]
2022-03-31 14:49:36,090 Manager channels=[1, 512]
INFO 2022-03-31 14:49:36,090 Manager command=unblock-all
2022-03-31 14:49:36,090 Manager command=unblock-all
INFO 2022-03-31 14:49:36,090 Manager device=MN17166
2022-03-31 14:49:36,090 Manager device=MN17166
INFO 2022-03-31 14:49:36,090 Manager dry_run=False
2022-03-31 14:49:36,090 Manager dry_run=False
INFO 2022-03-31 14:49:36,090 Manager experiment_name=Testing ReadFish Unblock All
2022-03-31 14:49:36,090 Manager experiment_name=Testing ReadFish Unblock All
INFO 2022-03-31 14:49:36,090 Manager func=<function run at 0x7f229620cf70>
2022-03-31 14:49:36,090 Manager func=<function run at 0x7f229620cf70>
INFO 2022-03-31 14:49:36,090 Manager host=127.0.0.1
2022-03-31 14:49:36,090 Manager host=127.0.0.1
INFO 2022-03-31 14:49:36,090 Manager log_file=None
2022-03-31 14:49:36,090 Manager log_file=None
INFO 2022-03-31 14:49:36,090 Manager log_format=%(asctime)s %(name)s %(message)s
2022-03-31 14:49:36,090 Manager log_format=%(asctime)s %(name)s %(message)s
INFO 2022-03-31 14:49:36,090 Manager log_level=info
2022-03-31 14:49:36,090 Manager log_level=info
INFO 2022-03-31 14:49:36,090 Manager port=9501
2022-03-31 14:49:36,090 Manager port=9501
INFO 2022-03-31 14:49:36,090 Manager run_time=172800
2022-03-31 14:49:36,090 Manager run_time=172800
INFO 2022-03-31 14:49:36,090 Manager throttle=0.4
2022-03-31 14:49:36,090 Manager throttle=0.4
INFO 2022-03-31 14:49:36,090 Manager unblock_duration=0.1
2022-03-31 14:49:36,090 Manager unblock_duration=0.1
INFO 2022-03-31 14:49:36,090 Manager workers=1
2022-03-31 14:49:36,090 Manager workers=1
Traceback (most recent call last):
  File "/home/merfre/readfishgup6/bin/readfish", line 33, in <module>
    sys.exit(load_entry_point('readfish==0.0.9a1', 'console_scripts', 'readfish')())
  File "/home/merfre/readfishgup6/lib/python3.9/site-packages/ru/cli.py", line 43, in main
    args.func(parser, args)
  File "/home/merfre/readfishgup6/lib/python3.9/site-packages/ru/unblock_all.py", line 127, in run
    position = get_device(args.device, host=args.host, port=args.port)
  File "/home/merfre/readfishgup6/lib/python3.9/site-packages/ru/utils.py", line 918, in get_device
    manager = Manager(host=host, port=port, use_tls=use_tls)
TypeError: __init__() got an unexpected keyword argument 'use_tls'

alexomics commented 2 years ago

@merfre Could you check what MinKNOW version you are using, both in the MinKNOW user interface and by running pip show minknow_api in your readfish environment?

I think that this is related to this issue: https://github.com/LooseLab/readfish/issues/187#issuecomment-1081854776
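
As a complement to running pip show for each package, here is a minimal sketch that prints the installed versions of the packages relevant to this thread in one go. It assumes Python 3.8 or newer for importlib.metadata (on 3.7 the importlib_metadata backport offers the same functions), and the helper name installed_versions is made up for this example. The distribution names are the ones that appear in the package list posted further down.

```python
from importlib import metadata

# Distribution names as they appear in the conda/pip package list in this thread.
PACKAGES = ["minknow-api", "ont-pyguppy-client-lib", "read-until", "readfish"]


def installed_versions(names):
    """Return {name: version string}, with None for packages that are not installed."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}: {version or 'not installed'}")
```

Run it with the Python interpreter of the readfish environment, so that it reports the packages readfish actually imports.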

alexomics commented 2 years ago

@mahreenkn I think that your issue is related to your Guppy version. If this is still an issue, can you post your environment versions, e.g. MinKNOW version, Guppy version, etc.?
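
For context on the first traceback, the sketch below reproduces the same TypeError in plain Python. It is only an illustration: the data shapes are invented, and it assumes (without having checked the Guppy client) that the value bound to r in basecall.py was a list where the code expected a dict. The helper read_ids is hypothetical and is not part of readfish.

```python
def read_ids(results):
    """Yield read IDs whether each result is a dict or a list of dicts."""
    for result in results:
        # Assumption for illustration: a list holds one dict per read.
        items = result if isinstance(result, list) else [result]
        for item in items:
            yield item["metadata"]["read_id"]


# Made-up example data: one dict per result, and one list of dicts per result.
dict_results = [{"metadata": {"read_id": "read-a"}}]
list_results = [[{"metadata": {"read_id": "read-b"}}]]

try:
    # Indexing a list with a string key fails the same way as in the traceback.
    list_results[0]["metadata"]["read_id"]
except TypeError as err:
    print(f"TypeError: {err}")

print(list(read_ids(dict_results)))
print(list(read_ids(list_results)))
```

If this is what is happening, the mismatch is between the readfish branch and the installed client library rather than anything in the command line or the TOML file.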

merfre commented 2 years ago

@alexomics Thank you so much for your quick reply. My minknow_api version is 5.0.0.1, Guppy is 6.0.6, and MinKNOW core is 5.0.0.

alexomics commented 2 years ago

@merfre You will, temporarily, need to use the issue187 branch. Follow the installation instructions using the conda environment that is defined in this comment: https://github.com/LooseLab/readfish/issues/187#issuecomment-1081854776 and if you hit any other issues come back and open an issue!
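
The use_tls error above is what Python raises whenever a caller passes a keyword that the installed version of a class does not accept. Here is a small sketch of how to check for this up front with inspect.signature. The two classes are hypothetical stand-ins, not the real minknow_api Manager, and accepts_keyword is a made-up helper.

```python
import inspect


class ManagerWithTls:
    """Hypothetical stand-in for an API version that accepts use_tls."""

    def __init__(self, host="127.0.0.1", port=None, use_tls=False):
        self.host, self.port, self.use_tls = host, port, use_tls


class ManagerWithoutTls:
    """Hypothetical stand-in for an API version that does not accept use_tls."""

    def __init__(self, host="127.0.0.1", port=None):
        self.host, self.port = host, port


def accepts_keyword(func, name):
    """Return True if func can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params:
        return params[name].kind is not inspect.Parameter.POSITIONAL_ONLY
    # A **kwargs parameter accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


for cls in (ManagerWithTls, ManagerWithoutTls):
    print(cls.__name__, accepts_keyword(cls, "use_tls"))
```

In practice the fix is to install matching versions, as suggested above. A check like this is mainly useful for confirming which side of the mismatch an environment is on.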

mahreenkn commented 2 years ago

Hi @alexomics, thanks for the response! I'm trying to get readfish set up on a new computer, so I did a fresh MinKNOW and Guppy install, and am now on MinKNOW core 5.0.0 and Guppy 6.0.6. I went through the installation instructions you linked above, and this is my current list of packages:

# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       1_gnu    conda-forge
attrs                     21.4.0                   pypi_0    pypi
beautifulsoup4            4.10.0                   pypi_0    pypi
biopython                 1.76                     pypi_0    pypi
ca-certificates           2021.10.8            ha878542_0    conda-forge
certifi                   2021.10.8                pypi_0    pypi
charset-normalizer        2.0.12                   pypi_0    pypi
google                    3.0.0                    pypi_0    pypi
grpcio                    1.44.0                   pypi_0    pypi
idna                      3.3                      pypi_0    pypi
importlib-metadata        4.11.3                   pypi_0    pypi
importlib-resources       5.6.0                    pypi_0    pypi
jsonschema                4.4.0                    pypi_0    pypi
ld_impl_linux-64          2.36.1               hea4e1c9_2    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libgcc-ng                 11.2.0              h1d223b6_14    conda-forge
libgomp                   11.2.0              h1d223b6_14    conda-forge
libnsl                    2.0.0                h7f98852_0    conda-forge
libstdcxx-ng              11.2.0              he4da1e4_14    conda-forge
libzlib                   1.2.11            h166bdaf_1014    conda-forge
mappy                     2.24                     pypi_0    pypi
minknow-api               5.0.0                    pypi_0    pypi
ncurses                   6.3                  h9c3ff4c_0    conda-forge
numpy                     1.17.4                   pypi_0    pypi
ont-pyguppy-client-lib    6.0.6                    pypi_0    pypi
openssl                   3.0.2                h166bdaf_1    conda-forge
packaging                 21.3                     pypi_0    pypi
pandas                    1.3.5                    pypi_0    pypi
pip                       22.0.4             pyhd8ed1ab_0    conda-forge
protobuf                  3.20.0                   pypi_0    pypi
pyparsing                 3.0.7                    pypi_0    pypi
pyrfc3339                 1.1                      pypi_0    pypi
pyrsistent                0.18.1                   pypi_0    pypi
python                    3.7.12          hf930737_100_cpython    conda-forge
python-dateutil           2.8.2                    pypi_0    pypi
python_abi                3.7                     2_cp37m    conda-forge
pytz                      2022.1                   pypi_0    pypi
read-until                3.0.0                    pypi_0    pypi
readfish                  0.0.9a3                  pypi_0    pypi
readline                  8.1                  h46c0cb4_0    conda-forge
requests                  2.27.1                   pypi_0    pypi
setuptools                62.0.0           py37h89c1867_0    conda-forge
six                       1.16.0                   pypi_0    pypi
soupsieve                 2.3.1                    pypi_0    pypi
sqlite                    3.37.1               h4ff8645_0    conda-forge
tk                        8.6.12               h27826a3_0    conda-forge
toml                      0.10.2                   pypi_0    pypi
typing-extensions         4.1.1                    pypi_0    pypi
urllib3                   1.26.9                   pypi_0    pypi
watchdog                  2.1.7                    pypi_0    pypi
wheel                     0.37.1             pyhd8ed1ab_0    conda-forge
xz                        5.2.5                h516909a_1    conda-forge
zipp                      3.8.0                    pypi_0    pypi
zlib                      1.2.11            h166bdaf_1014    conda-forge

readfish validate works as expected, but I now get the following error when trying the readfish targets command:

readfish targets --device MN28703 --experiment-name "RU Test" --toml human_chr_selection.toml --log-file ru_test.log
2022-04-05 16:16:45,303 ru.ru_gen /home/mahreen/miniconda3/envs/readfish/bin/readfish targets --device MN28703 --experiment-name RU Test --toml human_chr_selection.toml --log-file ru_test.log
2022-04-05 16:16:45,303 ru.ru_gen batch_size=512
2022-04-05 16:16:45,303 ru.ru_gen cache_size=512
2022-04-05 16:16:45,303 ru.ru_gen channels=[1, 512]
2022-04-05 16:16:45,303 ru.ru_gen chunk_log=None
2022-04-05 16:16:45,303 ru.ru_gen command=targets
2022-04-05 16:16:45,303 ru.ru_gen device=MN28703
2022-04-05 16:16:45,303 ru.ru_gen dry_run=False
2022-04-05 16:16:45,303 ru.ru_gen experiment_name=RU Test
2022-04-05 16:16:45,303 ru.ru_gen func=<function run at 0x7f7b6b072cb0>
2022-04-05 16:16:45,303 ru.ru_gen host=127.0.0.1
2022-04-05 16:16:45,303 ru.ru_gen log_file=ru_test.log
2022-04-05 16:16:45,303 ru.ru_gen log_format=%(asctime)s %(name)s %(message)s
2022-04-05 16:16:45,303 ru.ru_gen log_level=info
2022-04-05 16:16:45,303 ru.ru_gen paf_log=None
2022-04-05 16:16:45,303 ru.ru_gen port=None
2022-04-05 16:16:45,303 ru.ru_gen run_time=172800
2022-04-05 16:16:45,303 ru.ru_gen throttle=0.4
2022-04-05 16:16:45,303 ru.ru_gen toml=human_chr_selection.toml
2022-04-05 16:16:45,303 ru.ru_gen unblock_duration=0.1
2022-04-05 16:16:45,303 ru.ru_gen workers=1
2022-04-05 16:16:45,307 ru.ru_gen Initialising minimap2 mapper
2022-04-05 16:16:51,639 ru.ru_gen Mapper initialised
2022-04-05 16:16:51,774 ru.ru_gen This experiment has 1 region on the flowcell
2022-04-05 16:16:51,775 ru.ru_gen Using reference: /home/mahreen/hg38_fullref.mmi
2022-04-05 16:17:01,721 ru.ru_gen Region 'select_chr_21_22' (control=False) has 2 contigs of which 2 are in the reference. There are 4 targets (including +/- strand) representing 3.04% of the reference. Reads will be unblocked when classed as single_off or multi_off; sequenced when classed as single_on or multi_on; and polled for more data when classed as no_map or no_seq.
[guppy/error] basecall_service::BasecallClient::worker_loop: Connection error. [failed] zmq::error_t : Invalid argument
[guppy/error] basecall_service::BasecallClient::worker_loop: Connection error. [failed] zmq::error_t : Invalid argument
[guppy/error] basecall_service::BasecallClient::worker_loop: Connection error. [failed] zmq::error_t : Invalid argument
[guppy/error] basecall_service::BasecallClient::worker_loop: Connection error. [failed] zmq::error_t : Invalid argument
[guppy/error] basecall_service::BasecallClient::worker_loop: Connection error. [failed] zmq::error_t : Invalid argument
[guppy/error] basecall_service::BasecallClient::worker_loop: Connection error. [failed] zmq::error_t : Invalid argument
Traceback (most recent call last):
  File "/home/mahreen/miniconda3/envs/readfish/bin/readfish", line 8, in <module>
    sys.exit(main())
  File "/home/mahreen/miniconda3/envs/readfish/lib/python3.7/site-packages/ru/cli.py", line 43, in main
    args.func(parser, args)
  File "/home/mahreen/miniconda3/envs/readfish/lib/python3.7/site-packages/ru/ru_gen.py", line 510, in run
    caller_kwargs=caller_kwargs,
  File "/home/mahreen/miniconda3/envs/readfish/lib/python3.7/site-packages/ru/ru_gen.py", line 157, in simple_analysis
    config=caller_kwargs["config_name"],
  File "/home/mahreen/miniconda3/envs/readfish/lib/python3.7/site-packages/ru/basecall.py", line 41, in __init__
    self.connect()
  File "/home/mahreen/miniconda3/envs/readfish/lib/python3.7/site-packages/pyguppy_client_lib/pyclient.py", line 165, in connect
    return_code, self.get_error_message()
ConnectionError: Could not connect. Is the server running? Check your connection parameters. <result.failed: 12> : [failed] zmq::error_t : Invalid argument

Guppy basecalling itself should be working fine: I've tested basecalling with the client, which also works, and I've confirmed that the Guppy server is running on the GPU as expected!

alexomics commented 2 years ago

What are your caller_settings in your TOML file?

Can you try this snippet from this comment https://github.com/LooseLab/readfish/issues/170#issuecomment-1032566729 using the port that guppy should be active on?
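
Independently of the linked snippet, a quick first check is whether anything is accepting TCP connections on the Guppy port at all. The sketch below uses only the standard library. The port 5555 is a placeholder to replace with whatever port the Guppy server is configured to use, and port_is_open is a made-up helper name. It only shows that a TCP listener exists; it does not confirm that the listener is Guppy, and it does not apply if the server is listening on an ipc socket instead of TCP.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    host, port = "127.0.0.1", 5555  # placeholder values, adjust to your setup
    state = "open" if port_is_open(host, port) else "closed or unreachable"
    print(f"{host}:{port} is {state}")
```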

mahreenkn commented 2 years ago

@alexomics

I think it might have to do with the caller settings. I changed my Guppy settings to use TCP on port 5555; this was the only way I could properly set up GPU basecalling and get ONT's built-in adaptive sampling working. If needed, I can reinstall Guppy/MinKNOW without those changes or just revert to my old Guppy config. That said, I did specify what I assume to be the correct host and port in the readfish TOML file:

[caller_settings]
config_name = "dna_r9.4.1_450bps_fast"
host = "127.0.0.1"
port = 5555

[conditions]
reference = "/home/mahreen/hg38_fullref.mmi"

[conditions.0]
name = "select_chr_21_22"
control = false
min_chunks = 0
max_chunks = inf
targets = ["chr21", "chr22"]
single_on = "stop_receiving"
multi_on = "stop_receiving"
single_off = "unblock"
multi_off = "unblock"
no_seq = "proceed"
no_map = "proceed"
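
As a rough illustration of how a condition table like the one above maps read classifications to actions, here is a small self-contained check. It is not how readfish validate works internally. The set of actions is limited to the three that appear in this configuration, which is an assumption, and check_condition is a made-up helper.

```python
# Read classifications named in the readfish log output earlier in this thread.
READ_CLASSES = ("single_on", "multi_on", "single_off", "multi_off", "no_seq", "no_map")

# Assumption: only the three actions used in the configuration above.
ACTIONS = {"unblock", "stop_receiving", "proceed"}


def check_condition(condition):
    """Return a list of problems found in one condition table (empty if none)."""
    problems = []
    for key in READ_CLASSES:
        action = condition.get(key)
        if action is None:
            problems.append(f"{key}: missing")
        elif action not in ACTIONS:
            problems.append(f"{key}: unknown action {action!r}")
    return problems


# The [conditions.0] table from the TOML above, written as a Python dict.
condition = {
    "name": "select_chr_21_22",
    "control": False,
    "targets": ["chr21", "chr22"],
    "single_on": "stop_receiving",
    "multi_on": "stop_receiving",
    "single_off": "unblock",
    "multi_off": "unblock",
    "no_seq": "proceed",
    "no_map": "proceed",
}

print(check_condition(condition) or "no problems found")
```

The table in this thread passes such a check, which is consistent with readfish validate working as expected and points the ConnectionError at the caller settings rather than the conditions.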

Running python -c 'from pyguppy_client_lib.pyclient import PyGuppyClient as PGC; c = PGC("127.0.0.1:5555", "dna_r9.4.1_450bps_fast.cfg"); c.connect(); print(c)' gives me the following:

PyGuppyClient(address='127.0.0.1:5555', config='dna_r9.4.1_450bps_fast', align_ref=None, bed_file=None, barcodes=None, status.connected, )

github-actions[bot] commented 1 year ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] commented 1 year ago

This issue was closed because there has been no response for 5 days after becoming stale.