LooseLab / readfish

CLI tool for flexible and fast adaptive sampling on ONT sequencers
https://looselab.github.io/readfish/
GNU General Public License v3.0

Question: Correct configuration for switching off barcodes #367

Closed: philipschoettler closed this issue 3 weeks ago

philipschoettler commented 1 month ago

We are trying to use readfish to switch off individual barcodes once they have collected enough data, but we see no effect. We are also not sure whether the configuration file is set up correctly. Can you provide an example config.toml for this case? We are not trying to enrich or deplete specific targets, only to select by barcode.

Our current configuration, trying to switch off barcode08, looks like this:

[caller_settings.dorado]
config = "dna_r10.4.1_e8.2_400bps_5khz_fast.cfg"
address = "ipc:///tmp/.guppy/5555"
barcode_kits = "SQK-RPB114-24"

[mapper_settings.no_op]

[barcodes.classified]
name = "classified_reads"
control = false
min_chunks = 0
max_chunks = 2
targets = []
single_on = "stop_receiving"
multi_on = "stop_receiving"
single_off = "stop_receiving"
multi_off = "stop_receiving"
no_seq = "proceed"
no_map = "proceed"
above_max_chunks = "proceed"

[barcodes.unclassified]
name = "unclassified_reads"
control = false
min_chunks = 0
max_chunks = 2
targets = []
single_on = "stop_receiving"
multi_on = "stop_receiving"
single_off = "stop_receiving"
multi_off = "stop_receiving"
no_seq = "proceed"
no_map = "proceed"
above_max_chunks = "proceed"

[barcodes.barcode08]
name = "barcode08_reads"
control = false
min_chunks = 0
max_chunks = 2
targets = []
single_on = "unblock"
multi_on = "unblock"
single_off = "unblock"
multi_off = "unblock"
no_seq = "unblock"
no_map = "unblock"
above_max_chunks = "unblock"
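
One way to double-check a configuration like the one above is to list which conditions lead to which action for each barcode table. The following is a minimal sketch, not part of readfish: the helper names `summarize_barcodes` and `load_toml` are hypothetical, and the set of valid actions is assumed to be exactly the three values that appear in this configuration.

```python
# Actions and conditions as they appear in the configuration quoted above.
# Assumption: these three actions are the only valid ones.
ACTIONS = {"unblock", "stop_receiving", "proceed"}
CONDITIONS = (
    "single_on", "multi_on", "single_off", "multi_off",
    "no_seq", "no_map", "above_max_chunks",
)


def summarize_barcodes(config):
    """Return {barcode: {condition: action}}; raise on a missing or unknown action."""
    summary = {}
    for barcode, table in config.get("barcodes", {}).items():
        actions = {}
        for condition in CONDITIONS:
            action = table.get(condition)
            if action not in ACTIONS:
                raise ValueError(f"{barcode}.{condition}: unexpected action {action!r}")
            actions[condition] = action
        summary[barcode] = actions
    return summary


def load_toml(path):
    """Parse a TOML file into a dict (tomllib is in the standard library from Python 3.11)."""
    import tomllib

    with open(path, "rb") as handle:
        return tomllib.load(handle)


# The same structure a TOML parser would produce for two of the tables above.
example = {
    "barcodes": {
        "classified": {
            "single_on": "stop_receiving",
            "multi_on": "stop_receiving",
            "single_off": "stop_receiving",
            "multi_off": "stop_receiving",
            "no_seq": "proceed",
            "no_map": "proceed",
            "above_max_chunks": "proceed",
        },
        "barcode08": {condition: "unblock" for condition in CONDITIONS},
    }
}

for barcode, actions in summarize_barcodes(example).items():
    unblocked = [c for c, a in actions.items() if a == "unblock"]
    print(barcode, "unblocks on:", unblocked or "nothing")
```

For a real file, `summarize_barcodes(load_toml("config.toml"))` would give the same overview. This only checks that the file says what was intended; it does not show whether the unblock requests take effect on the sequencer.
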
A sample section of the logs looks like this:

client_iteration read_in_loop read_id channel read_number seq_len counter mode decision condition barcode previous_action action_override timestamp
12732 35 890ccb66-d9e7-4276-9a44-d7da6f4df413 126 8057 811 2 no_map proceed unclassified_reads unclassified unblock False 1722350395.6428206
12732 36 4720c6ef-f056-46d6-897a-9b2393bf5950 220 5003 360 1 no_map proceed classified_reads barcode02 unblock False 1722350395.6429107
12732 37 7d273f2d-5d2b-471e-abce-9a699f9de0a7 34 9170 351 1 no_map proceed classified_reads barcode04 unblock False 1722350395.6430104
12732 38 821b1ff3-ed24-4353-99df-7194efc22941 179 9489 959 3 above_max_chunks proceed unclassified_reads unclassified unblock False 1722350395.643098
12732 39 ef4e5f5b-f4dc-442f-8fdf-80a478694fe0 64 1461 365 1 no_map proceed classified_reads barcode02 unblock False 1722350395.6431863
12732 40 02939890-8880-4c3c-b419-99379a52747b 199 13333 512 2 no_map unblock barcode08_reads barcode08 unblock False 1722350395.6432767
12732 41 0d01c2ec-3611-45f3-bfd1-00c57425113b 236 7798 772 2 no_map unblock barcode08_reads barcode08 unblock False 1722350395.643367
12732 42 fa6e0df9-3d7f-4e51-90f7-972bba2d52de 147 8396 340 1 no_map unblock barcode08_reads barcode08 unblock False 1722350395.643456

Although the log file seems to match what we want to achieve, we don't see any effect on the actual experiment.
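
To confirm across the whole log that the decisions match the intent, the rows can be tallied by barcode and decision. This is a rough sketch that assumes the log is whitespace-delimited with the fourteen columns shown in the excerpt above; `tally_decisions` is a hypothetical helper, and the sample lines are copied from that excerpt.

```python
from collections import Counter

# Column order as shown in the log header above.
COLUMNS = (
    "client_iteration read_in_loop read_id channel read_number seq_len counter "
    "mode decision condition barcode previous_action action_override timestamp"
).split()


def tally_decisions(lines):
    """Count (barcode, decision) pairs, skipping the header and malformed lines."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) != len(COLUMNS) or fields[0] == COLUMNS[0]:
            continue
        row = dict(zip(COLUMNS, fields))
        counts[(row["barcode"], row["decision"])] += 1
    return counts


sample = [
    "12732 35 890ccb66-d9e7-4276-9a44-d7da6f4df413 126 8057 811 2 no_map proceed unclassified_reads unclassified unblock False 1722350395.6428206",
    "12732 36 4720c6ef-f056-46d6-897a-9b2393bf5950 220 5003 360 1 no_map proceed classified_reads barcode02 unblock False 1722350395.6429107",
    "12732 40 02939890-8880-4c3c-b419-99379a52747b 199 13333 512 2 no_map unblock barcode08_reads barcode08 unblock False 1722350395.6432767",
    "12732 41 0d01c2ec-3611-45f3-bfd1-00c57425113b 236 7798 772 2 no_map unblock barcode08_reads barcode08 unblock False 1722350395.643367",
]

for (barcode, decision), n in sorted(tally_decisions(sample).items()):
    print(barcode, decision, n)
```

A tally like this only shows what readfish decided. Whether a read was actually ejected has to be checked in the sequencing output.
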

github-actions[bot] commented 1 month ago

Thank you for your issue. Give us a little time to review it.

PS. You might want to check the FAQ if you haven't done so already.

This is an automated reply, generated by FAQtory

mattloose commented 1 month ago

Hi,

Sorry that this is causing problems. It looks from above as though you have everything correctly configured. What do your read length distributions look like for your barcodes?

If you look in MinKNOW at the read length profile and "split by end reason", do you see any reads unblocked by adaptive sampling?

Thanks!

philipschoettler commented 1 month ago

Thanks for the quick reply! :) The average read length of barcode08 didn't change significantly after we switched it off. I will be able to check the other stats you mentioned tomorrow.

mattloose commented 1 month ago

Ah sorry I wasn't clear. What was the average length of reads on barcode 08 before you used adaptive sampling? Mean and median would be interesting to check.
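
Without access to the MinKNOW UI, the mean and median per barcode and end reason could also be computed from a sequencing summary file. The sketch below assumes a tab-separated file with the columns `barcode_arrangement`, `end_reason` and `sequence_length_template`, and that reads ejected by adaptive sampling carry the end reason `data_service_unblock_mux_change`. Check both against your own output. `length_stats` is a hypothetical helper, and the toy rows are invented purely to show the shape of the result.

```python
import csv
import io
from collections import defaultdict
from statistics import mean, median


def length_stats(handle, barcode_col="barcode_arrangement",
                 reason_col="end_reason", length_col="sequence_length_template"):
    """Return {(barcode, end_reason): (n_reads, mean_length, median_length)}."""
    lengths = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        lengths[(row[barcode_col], row[reason_col])].append(int(row[length_col]))
    return {key: (len(v), mean(v), median(v)) for key, v in lengths.items()}


# Invented toy data, NOT taken from the run discussed in this thread.
toy_rows = [
    ("barcode08", "signal_positive", 4000),
    ("barcode08", "signal_positive", 6000),
    ("barcode08", "data_service_unblock_mux_change", 400),
    ("barcode08", "data_service_unblock_mux_change", 500),
    ("barcode08", "data_service_unblock_mux_change", 600),
]
toy_tsv = "barcode_arrangement\tend_reason\tsequence_length_template\n" + "".join(
    f"{barcode}\t{reason}\t{length}\n" for barcode, reason, length in toy_rows
)

for key, (n, avg, med) in sorted(length_stats(io.StringIO(toy_tsv)).items()):
    print(key, "n =", n, "mean =", avg, "median =", med)
```

If unblocking is working for a barcode, most of its reads should fall in the unblock group and be short, while a barcode left on should be dominated by normally terminated reads.
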

philipschoettler commented 1 month ago

I currently don't have access to the MinKNOW UI, but here is what I can provide so far:

Average Read Length for barcode 8

Read length distribution from run report:

[Figure: read length distribution plot from the run report]

Also, we had about 650 "WARNING:root:Could not send read to Dorado" log entries. I assume that's not the main problem compared to a total of 1.66 M reads, right?

dawnmy commented 1 month ago

We also enabled the adaptive sampling option in the MinKNOW UI to deplete human sequences, using the human genome as the reference. We are not sure whether these two adaptive sampling tasks might interfere with each other.

mattloose commented 1 month ago

Hmm - I do not know what would happen if you have two different processes running, and I definitely would not suggest doing that. It could mean conflicting decisions being sent for individual reads. I suspect that this is the cause of the issue.

What has likely happened is that a command will have been sent to keep sequencing a read because it doesn't map to the human genome, and this will have overridden the request to unblock the read based on its barcode.

There is a way to set up readfish alone to do both the barcode-specific selection AND the human depletion (I think....)

But please don't run two conflicting instances at the same time!

philipschoettler commented 1 month ago

All right, thanks for your help! We will test the configuration next week without using MinKNOW's adaptive sampling at the same time.

philipschoettler commented 1 month ago

Hello again for a small update :) With native adaptive sampling switched off, we could now see a 50% reduction in read length. Afterwards we saw that there was a drop in length even with native adaptive sampling enabled, but only sometimes. We had several phases in which one barcode should have been switched off, but only in one of these phases was there a change in average read length. So the combination of native adaptive sampling and readfish indeed seems to be somewhat unpredictable.