LooseLab / readfish

CLI tool for flexible and fast adaptive sampling on ONT sequencers
https://looselab.github.io/readfish/
GNU General Public License v3.0

Readfish intermittently crashes during barcoded experiment #329

Open jamesemery opened 5 months ago

jamesemery commented 5 months ago

I have noticed an intermittent failure in my experiments using Readfish with barcode-dependent target lists; it seems to trigger between 0 and 3 times per 72-hour sequencing experiment. Readfish always restarts happily and usually runs to completion if I re-launch it from the command line, so I can work around this failure for now, but it seems like a genuine bug. These experiments all have between 3 and 7 barcodes loaded on the sequencer, and I have configured each of those barcodes to target independent lists (with no region splitting at all). The exceptions always take the following form (the "channel=####" value varies):

Traceback (most recent call last):
  File "/home/prom/miniconda3/envs/readfish2/bin/readfish", line 8, in <module>
    sys.exit(main())
  File "/home/prom/miniconda3/envs/readfish2/lib/python3.10/site-packages/readfish/_cli_base.py", line 61, in main
    raise SystemExit(args.func(parser, args, extras))
  File "/home/prom/miniconda3/envs/readfish2/lib/python3.10/site-packages/readfish/entry_points/targets.py", line 537, in run
    worker.run()
  File "/home/prom/miniconda3/envs/readfish2/lib/python3.10/site-packages/readfish/entry_points/targets.py", line 392, in run
    control, condition = self.conf.get_conditions(
  File "/home/prom/miniconda3/envs/readfish2/lib/python3.10/site-packages/readfish/_config.py", line 321, in get_conditions
    raise ValueError(
ValueError: Both region (channel=2806) and barcode (None) were not found. This config is invalid!

MinKNOW is set to 23.03.5(focal). My pip-list for the environment I am running is as follows:

Package                Version
---------------------- ------------
about-time             4.2.1
alive-progress         3.1.4
attrs                  23.1.0
cattrs                 23.1.2
click                  8.1.7
exceptiongroup         1.1.3
grapheme               0.6.0
grpcio                 1.59.0
iniconfig              2.0.0
mappy                  2.26
mappy-rs               0.0.7
minknow-api            5.7.2
more-itertools         10.1.0
numpy                  1.26.0
ont-pyguppy-client-lib 6.5.7
packaging              23.2
pip                    23.2.1
pluggy                 1.3.0
protobuf               4.24.4
pyRFC3339              1.1
pytest                 7.4.2
pytz                   2023.3.post1
readfish               2023.1.1
readfish_summarise     0.2.5
rtoml                  0.9.0
setuptools             68.2.2
tomli                  2.0.1
typing_extensions      4.8.0
wheel                  0.41.2

I should note, I have only noticed this exception cropping up in the past month or so after this exact installation was working without incident last December. I have not intentionally updated any of my software in that time but it is certainly possible that MinKNOW was bumped to an incorrect version somehow.
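
The manual re-launch described above could, in principle, be automated with a small wrapper. The following is only a minimal sketch: `run_with_restarts` and its parameters are hypothetical names, it assumes the crash shows up as a non-zero exit code (as an uncaught Python exception normally does), and the command is whatever readfish command line you already use. The cap on restarts is there so that a persistent failure is not hidden.

```python
import subprocess
import sys
import time


def run_with_restarts(cmd, max_restarts=5, delay_s=5.0):
    """Run cmd, relaunching it after a non-zero exit.

    Hypothetical helper, not part of readfish. Returns the number of
    launches it took to get a clean (zero) exit code and raises
    RuntimeError once max_restarts relaunches have also failed.
    """
    launches = 0
    while True:
        launches += 1
        result = subprocess.run(cmd)
        if result.returncode == 0:
            return launches
        if launches > max_restarts:
            raise RuntimeError(
                f"command failed {launches} times, last exit code {result.returncode}"
            )
        # Short pause before relaunching so a tight crash loop is avoided.
        time.sleep(delay_s)


if __name__ == "__main__":
    # Stand-in command; replace it with your own readfish command line.
    stand_in = [sys.executable, "-c", "print('stand-in for readfish')"]
    print("launches needed:", run_with_restarts(stand_in, delay_s=0.0))
```

This only papers over the exception; reads on the affected channel are still unhandled at the moment of the crash.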

github-actions[bot] commented 5 months ago

Thank you for your issue. Give us a little time to review it.

PS. You might want to check the FAQ if you haven't done so already.

This is an automated reply, generated by FAQtory

mattloose commented 5 months ago

Hi James,

This is a surprising problem. We can have a look at this for you - would you mind sharing your toml file with us so we can better understand it?

We've never seen this issue.

Feel free to delete targets etc if you don't want those to be shared.

Matt

jamesemery commented 5 months ago

Here you go:

[caller_settings.guppy]
config = "dna_r10.4.1_e8.2_400bps_5khz_fast_prom.cfg"
address = "ipc:///home/prom/5556"
barcode_kits = "SQK-NBD114-24"

[mapper_settings.mappy_rs]
fn_idx_in = "/data/hg38.mmi"
n_threads = 4

[barcodes.unclassified]
name = "unclassified_reads"
control = false
min_chunks = 0
max_chunks = 12
targets = []
single_on = "unblock"
multi_on = "unblock"
single_off = "unblock"
multi_off = "unblock"
no_seq = "proceed"
no_map = "proceed"

[barcodes.classified]
name = "classified_reads"
control = false
min_chunks = 0
max_chunks = 12
targets = []
single_on = "unblock"
multi_on = "unblock"
single_off = "unblock"
multi_off = "unblock"
no_seq = "proceed"
no_map = "proceed"

[barcodes.barcode16]
name = "########"
control = false
min_chunks = 0
max_chunks = 4
targets = "########"
single_on = "stop_receiving"
multi_on = "stop_receiving"
single_off = "unblock"
multi_off = "unblock"
no_seq = "proceed"
no_map = "proceed"

[barcodes.barcode17]
name = "########"
control = false
min_chunks = 0
max_chunks = 4
targets = "########"
single_on = "stop_receiving"
multi_on = "stop_receiving"
single_off = "unblock"
multi_off = "unblock"
no_seq = "proceed"
no_map = "proceed"

[barcodes.barcode18]
name = "########"
control = false
min_chunks = 0
max_chunks = 4
targets = "########"
single_on = "stop_receiving"
multi_on = "stop_receiving"
single_off = "unblock"
multi_off = "unblock"
no_seq = "proceed"
no_map = "proceed"

[barcodes.barcode09]
name = "########"
control = false
min_chunks = 0
max_chunks = 4
targets = "########"
single_on = "stop_receiving"
multi_on = "stop_receiving"
single_off = "unblock"
multi_off = "unblock"
no_seq = "proceed"
no_map = "proceed"

[barcodes.barcode12]
name = "########"
control = false
min_chunks = 0
max_chunks = 4
targets = "########"
single_on = "stop_receiving"
multi_on = "stop_receiving"
single_off = "unblock"
multi_off = "unblock"
no_seq = "proceed"
no_map = "proceed"

mattloose commented 5 months ago

Thanks - we'll have a look at this and get back to you.

I can't immediately see anything wrong in your toml file which is good.

mattloose commented 5 months ago

OK - we've discussed this here.

We think what is happening is that dorado is occasionally returning a read without a barcode assignment. Readfish does not expect this, and so it fails. We can implement a workaround (and will do so) but we will also alert the dorado team. We will update here when a workaround is implemented in readfish, but hopefully the upstream dorado behaviour will also be fixed.

A possible workaround for now would be to add an extra region to your toml file:

[[regions]]
name = "no_barcode_reads"
control = false
min_chunks = 0
max_chunks = 12
targets = []
single_on = "unblock"
multi_on = "unblock"
single_off = "unblock"
multi_off = "unblock"
no_seq = "proceed"
no_map = "proceed"

The current bug is caused by the base caller returning a read without a barcode, but readfish is expecting all reads to have a barcode or be "unclassified" by the base caller. In this case you are getting a read with no barcode but readfish has no region to assign it to. Adding in this condition will treat the read as an unclassified read and so it will be rejected.

I have to say this suggestion is posted untested - so please proceed with caution and we will update.
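
To make the failure mode described above concrete, here is a rough sketch of the kind of lookup involved. It is not readfish's actual code (that lives in `readfish/_config.py` and may differ): `get_condition`, the dict-based config, the "single region covers every channel" rule, and the fallback to the `classified` table are all simplifying assumptions made for illustration. Only the error message is taken from the traceback in this issue.

```python
from typing import Optional


def get_condition(barcodes: dict, regions: list, channel: int,
                  barcode: Optional[str]) -> dict:
    """Hypothetical, simplified condition lookup (not readfish's API)."""
    # Assumption: with a single region, every channel maps to it.
    region = regions[0] if regions else None
    # Assumption: a named barcode without its own table falls back to
    # "classified"; a read with no barcode at all matches no barcode table.
    if barcode is None:
        barcode_cond = None
    else:
        barcode_cond = barcodes.get(barcode, barcodes.get("classified"))
    if barcode_cond is not None:
        return barcode_cond
    if region is not None:
        return region
    raise ValueError(
        f"Both region (channel={channel}) and barcode ({barcode}) "
        "were not found. This config is invalid!"
    )


barcodes = {
    "unclassified": {"name": "unclassified_reads"},
    "classified": {"name": "classified_reads"},
    "barcode16": {"name": "sample_16"},  # placeholder name
}

# Barcodes only, as in the posted toml: a read with barcode=None has
# nowhere to go and the lookup raises.
try:
    get_condition(barcodes, [], channel=2806, barcode=None)
except ValueError as err:
    print(err)

# With the extra catch-all region, the same read resolves to that region.
regions = [{"name": "no_barcode_reads"}]
print(get_condition(barcodes, regions, channel=2806, barcode=None)["name"])
```

Under these assumptions the catch-all region only ever applies to reads that arrive with no barcode at all, so the per-barcode target lists behave as before.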