LooseLab / readfish

CLI tool for flexible and fast adaptive sampling on ONT sequencers
https://looselab.github.io/readfish/
GNU General Public License v3.0

Optimising read until on miniT #43

Closed speleonut closed 4 years ago

speleonut commented 4 years ago

Hi Alex, Matt & team,

We have a MinIT that runs our MinION and, inspired by https://github.com/LooseLab/ru/issues/28, I had a go at installing and running ru on it. It works right up to the basecalling and mapping steps, where I am having "performance issues": the time per batch keeps growing, as the mapping times below show. I'm not sure whether it is worth continuing or whether this is never going to work. Happy to provide additional logs or info.

Cheers, Mark

2020-04-30 18:34:19,771 ru.ru_gen 74R/1.25966s
2020-04-30 18:34:23,165 ru.ru_gen 148R/3.39248s
2020-04-30 18:34:28,704 ru.ru_gen 174R/5.53804s
2020-04-30 18:34:39,260 ru.ru_gen 247R/10.55593s
2020-04-30 18:34:45,618 ru.ru_gen 241R/6.34544s
2020-04-30 18:34:55,137 ru.ru_gen 242R/9.51814s
2020-04-30 18:35:10,468 ru.ru_gen 248R/15.33125s
2020-04-30 18:35:27,624 ru.ru_gen 257R/17.14404s
2020-04-30 18:35:46,549 ru.ru_gen 350R/18.92046s
2020-04-30 18:36:10,766 ru.ru_gen 346R/24.21621s
2020-04-30 18:36:39,124 ru.ru_gen 341R/28.34917s
2020-04-30 18:37:15,932 ru.ru_gen 355R/36.79682s
2020-04-30 18:38:03,242 ru.ru_gen 393R/47.28662s
2020-04-30 18:39:07,453 ru.ru_gen 409R/64.18478s
2020-04-30 18:40:24,991 ru.ru_gen 410R/77.52404s
2020-04-30 18:41:54,450 ru.ru_gen 421R/89.44561s
2020-04-30 18:43:21,877 ru.ru_gen 426R/87.40666s
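
To put numbers on the slowdown, here is a minimal sketch that turns log lines like the ones above into a reads-per-second figure. It assumes each entry has the form `<reads>R/<seconds>s`, i.e. the number of reads in the batch followed by the seconds taken to process it; that interpretation, and the `batch_rates` helper name, are assumptions for illustration and not part of ru itself.

```python
import re

# Assumed format of a batch summary, e.g. "74R/1.25966s":
# reads in the batch, then seconds taken to process that batch.
BATCH_RE = re.compile(r"(\d+)R/(\d+(?:\.\d+)?)s")


def batch_rates(log_text):
    """Return (reads, seconds, reads_per_second) for each batch line found."""
    rows = []
    for line in log_text.splitlines():
        match = BATCH_RE.search(line)
        if match is None:
            continue  # skip lines that are not batch summaries
        reads = int(match.group(1))
        seconds = float(match.group(2))
        rows.append((reads, seconds, reads / seconds))
    return rows


# Three lines copied from the log above: the first, a middle one, and the last.
sample = """\
2020-04-30 18:34:19,771 ru.ru_gen 74R/1.25966s
2020-04-30 18:36:10,766 ru.ru_gen 346R/24.21621s
2020-04-30 18:43:21,877 ru.ru_gen 426R/87.40666s
"""

for reads, seconds, rate in batch_rates(sample):
    print(f"{reads:4d} reads in {seconds:8.2f} s -> {rate:6.1f} reads/s")
```

If the rate falls while the batch time climbs, as it does here, the basecall and mapping loop is falling further behind the sequencer with every batch.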

For interest and info, I have solved a few installation traps on the MinIT. No guarantees are given or implied.

sudo apt update
sudo apt upgrade # If required
sudo apt install python3-venv python3-dev libzmq3-dev libhdf5-dev screen
# Fetch aarch64 binary version of guppy basecaller >3.4 from Oxford Nanopore. eg.
wget https://mirror.oxfordnanoportal.com/software/analysis/ont-guppy_3.5.2_linuxaarch64.tar.gz
tar -xvf ont-guppy_3.5.2_linuxaarch64.tar.gz
python3 -m venv read_until
. ./read_until/bin/activate
pip install --upgrade pip
pip install git+https://github.com/LooseLab/read_until_api_v2@master
# Install Cython and h5py separately, pinning the h5py version, or you will get: AttributeError: module 'h5py.h5pl' has no attribute 'prepend'
# You can compile h5py 2.10.0 against HDF5 v1.8.4 just fine, but it won't include the h5pl plugin functions unless your HDF5 version is 1.10+, via http://api.h5py.org/h5pl.html (this one got me good!).
pip install Cython
pip install h5py==2.9.0
pip install git+https://github.com/LooseLab/ru@master
# The install passed its first test at this stage
ru_generators
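
To check whether a given environment would hit the `h5pl` error described in the comments above, something like the following can be run inside the virtual environment. This is a sketch only: the `h5pl_status` helper is a hypothetical name, and it simply reports what the installed h5py exposes rather than fixing anything.

```python
import importlib


def h5pl_status():
    """Describe whether the installed h5py exposes h5pl.prepend."""
    try:
        h5py = importlib.import_module("h5py")
    except ImportError:
        return "h5py is not installed in this environment"
    try:
        h5pl = importlib.import_module("h5py.h5pl")
    except ImportError:
        h5pl = None  # some builds may not ship the module at all
    has_prepend = h5pl is not None and hasattr(h5pl, "prepend")
    return (
        f"h5py {h5py.version.version} built against HDF5 "
        f"{h5py.version.hdf5_version}: h5pl.prepend is "
        f"{'available' if has_prepend else 'missing'}"
    )


print(h5pl_status())
```

If `h5pl.prepend` is reported missing, the notes above suggest either pinning h5py to 2.9.0 or building against HDF5 1.10 or later.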

To get read until working I started a new guppy server, as suggested for the GridION. I took the settings from the existing guppy 3.2 logs on the MinIT. I can't work out how to set "num socket threads", which is 1 in my logs but defaults to 2. The default number of runners per device is 8.

screen sudo /opt/ont-guppy/bin/guppy_basecall_server \
--config /opt/ont-guppy/data/dna_r9.4.1_450bps_fast.cfg \
--log_path /var/log/ont/guppy --port 5556 --device cuda:all \
--chunk_size 1000 --chunks_per_runner 48 \
--num_callers 1
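
Before pointing ru at the new server, it can help to confirm that something is actually listening on the chosen port. The sketch below only tests whether a TCP connection can be opened; it does not speak the guppy protocol, so a `True` result means "a process is listening", not "the basecaller is healthy". The `port_is_open` helper and the use of `127.0.0.1` are assumptions for illustration.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # 5556 is the --port value passed to guppy_basecall_server above.
    print("guppy port open:", port_is_open("127.0.0.1", 5556))
```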

mattloose commented 4 years ago

Hi Mark,

Thanks for sharing all of these notes - in particular the solutions to many of the installation issues. We aren't recommending MinIT for this purpose at present as we found (similar to you) that it couldn't keep up with the workflow.

My suspicion is that although fast basecalling is sufficient for calling on the MinIT, the total workflow here exceeds what the MinIT can handle in real time (disk IO, CPU etc.). This may be optimisable, but it would need deeper integration into MinKNOW than we currently have available.

I presume you were running from playback here?

Best

Matt

speleonut commented 4 years ago

Hi Matt, Yes, just the playback example you have in your readme (an excellent guide, btw). Shame about the MinIT; on paper it looked feasible that it might handle it. At least if someone is stuck with Ubuntu 16.04 for some reason they might benefit from these notes. I did wonder if part of the problem is that it is effectively running two basecall servers at once (guppy 3.2 through MinKNOW). Looks like we're shopping for a new gfx card today! Thanks for the quick response, we're super excited by your innovation and determined to get it to go. Congrats on the good work. Mark

ps-account commented 4 years ago

Do you have access to two MinITs, maybe? You could use the second one as a dedicated basecalling server.

speleonut commented 4 years ago

We went with a new computer with an RTX 2080 Super. ru works great on that (...or it did until the recent update of MinKNOW broke everything).

mattloose commented 4 years ago

Just a comment to say that we will be updating readfish shortly to work with the new MinKNOW 4.0 release.

speleonut commented 4 years ago

Thanks Matt, great to hear. I’ve been in touch with ONT to try to get a copy of the old release of MinKNOW. They’ve been pretty good about it so far. I’m not a fantastic programmer but will be happy to help / test when you’re ready.