LooseLab / readfish

CLI tool for flexible and fast adaptive sampling on ONT sequencers
https://looselab.github.io/readfish/
GNU General Public License v3.0

Playback issue from own custom sequencing run #214

Closed: StevenVerbruggen closed this issue 1 year ago

StevenVerbruggen commented 1 year ago

Hi Matt and Alex,

I was trying to play back a run we performed earlier on our GridION. Info from the initial run was recorded following the advice in https://github.com/LooseLab/readfish/issues/194

This is a screenshot from the output options selected during recording: image

As you can see in the screenshot, we did not perform live basecalling during the recorded run. Is this important for successfully playing back the run as a simulation run?

Pore occupancy of the recorded live run looked very good. image

When we perform the playback simulation run, a lot of pores are unavailable (see screenshot). During the first pore scan phase, things still look more or less normal, but when the run starts the first active sequencing phase, only 14 pores are available. I also let it run for 15 minutes and this stayed the same. image

Are there any settings or options I forgot or misconfigured during either the recording or the playback run?

Tool versions: MinKNOW 22.08.6, GridION base 6.3.0, Guppy 6.2.7

Best, and thanks in advance for your advice, Steven

mattloose commented 1 year ago

Interesting.

It looks to me as though your playback run is not playing the bulk file. Are you running with a real flowcell or a configuration test cell in your device?

There is nothing in what you have done above that appears to be incorrect. So - could you try putting a configuration test cell in your device (and not a "real" flowcell) and then repeating your playback experiment?

Also double check that you have added the bulkfile to your edited sequencing run script.

Finally - you might need to make sure the runscripts have been reloaded before trying to launch playback.

I hope this helps...

StevenVerbruggen commented 1 year ago

Hi Matt,

The playback run is performed on a white config test cell.

The recorded bulk output is present in this file on our GridION: /data/playback_files/GXB04113_20221017_1632_FAU65822_X2_sequencing_run_promega_ed47807e_barcoded.fast5

This is the portion of the sequencing config toml script that is used during the simulation run:

# basic_settings #
[custom_settings]
simulation="/data/playback_files/GXB04113_20221017_1632_FAU65822_X2_sequencing_run_promega_ed47807e_barcoded.fast5"
enable_relative_unblock_voltage = true
unblock_voltage_gap = 480
run_time = 172800 # (seconds) 1hr=3600
start_bias_voltage = -185
# UI parameters
translocation_speed_min = 300
translocation_speed_max = 425

I reloaded the config scripts once more in MinKNOW and started a playback run again, with the same low number of available pores.

By the way, I also just tried a playback run with the bulk fast5 you provide for download in the ReadFish manual here on GitHub. It works fine, as it always has, so I suppose the problem is linked to our recorded sequencing run. In that run, 3 internal control samples were prepared with library prep kit SQK-LSK109 in combination with the PCR-free native barcoding kit EXP-NBD114. The samples were given barcodes 16, 17 and 18. The sequencing run was performed with default options and the default MIN106_DNA sequencing config, with no live basecalling and with additional bulk output as shown above.

The final goal is to have a simulation on which we can test barcoded Readfish options before running Readfish on real live barcoded runs. If you happen to have a recorded bulk file of a barcoded run that you are free to share, that could help us out for now as well (we can provide an SFTP share if necessary). No problem if you have nothing at hand, though; in that case I will keep experimenting with the recording.

Thanks for your help. Sounds like a difficult one to crack, this one.

Cheers, Steven

mattloose commented 1 year ago

Ooh - this is an interesting problem.

I cannot see anything wrong in what you've done.

You've also tested our original bulkfile and found that that does still work (which would have been my next suggestion!).

Are you able to transfer the bulkfile to us in any way? It may be too large of course.

One thing I do note is that the bulkfile has an unexpected name - where has the _barcoded come from?

StevenVerbruggen commented 1 year ago

The '_barcoded' part is something I added to the file name to better identify the file. I removed that part to restore the file's original name (as it came out of sequencing) and also put the file back in its original location. The simulation run with the renamed file showed no improvement.

I uploaded our bulkfile through our WeTransfer account: https://we.tl/t-MbB79rgp5y

Let me know if I can provide any other info that would help to track down this issue.

mattloose commented 1 year ago

Hi - I've tried downloading that bulkfile - it appears that it is corrupt - at least, I can't read some of the signal channels from it using hdf5view. It fails on playback when I try here - so I'm afraid that there isn't much I can do to help further.

We typically only record a short (2 hours or so) bulkfile when we are doing these sorts of experiments.
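
For the corruption reported above, one way to tell whether a bulkfile was damaged during transfer or was already unreadable on the sequencer is to compare a checksum computed on both machines. This is a minimal sketch using only the Python standard library; the helper names `sha256_of` and `same_file` are hypothetical, and a matching checksum only shows that the copy is identical to the original, not that the HDF5 contents are valid.

```python
import hashlib
import sys
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so a large bulk file is never held in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_file(local: Path, expected_hex: str) -> bool:
    """Compare a local file against a checksum reported by the sender."""
    return sha256_of(local) == expected_hex.strip().lower()


if __name__ == "__main__" and len(sys.argv) == 2:
    # Run on both machines with the bulk file path and compare the printed values.
    print(sha256_of(Path(sys.argv[1])))
```

If the two checksums match and the file still fails to open, the damage most likely happened before the transfer, for example while the bulk output was being written.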