Psy-Fer / interARTIC

InterARTIC - An interactive local web application for viral whole genome sequencing utilising the artic network pipelines.
https://psy-fer.github.io/interARTIC/
MIT License

interARTIC for aarch64 (Jetson Xavier NX) #48

Closed hanfan1803 closed 3 years ago

hanfan1803 commented 3 years ago

Does interARTIC support aarch64 systems?

I am willing to set up the dependencies manually (e.g. from a list of dependencies with specific versions) and build interARTIC from source.

It would be great if you could build a Docker image for interARTIC.

Thanks, Han

Psy-Fer commented 3 years ago

Hey @hasindu2008 what do you think?

@hanfan1803 we can try it out, but the conda/snakeballs could get a bit messy.

hanfan1803 commented 3 years ago

Yeah, I have to install miniforge-pypy3 instead of the usual anaconda3, and a lot of packages on Anaconda, particularly from bioconda, do not support aarch64 :D

hasindu2008 commented 3 years ago

At the moment interARTIC does not support aarch64. I explored the possibility some time ago, but never pursued it as it is time-consuming.

The direct dependencies of interARTIC, celery==4.4.6, redis==3.5.3 and flask==1.1.2, are available through PyPI. redis-server==6.0.9 is unavailable, but compiling redis-server from source is possible. pandas==1.2.4 is not available, only 1.1.X, and I am sure @Psy-Fer can make interARTIC compatible with that pandas version. So interARTIC alone is doable.

The headache comes when installing the artic pipeline, which has a dozen dependencies, each of which has a dozen dependencies of its own. The dependencies of artic (fieldbioinformatics) are listed at https://github.com/artic-network/fieldbioinformatics/blob/master/environment.yml.

Support for aarch64 is not impossible, but requires a lot of effort. If there is a big demand for interARTIC on aarch64, I will be happy to do this.

@hanfan1803

hanfan1803 commented 3 years ago

@hasindu2008 It's good to know that interARTIC could run on aarch64.

artic-porechop=0.3.2pre could work alongside rampart 1.2.0 on the Xavier NX.

Thanks, Han

hanfan1803 commented 3 years ago

@hasindu2008 About the Dockerfile, I added wget to get it to work:

FROM ubuntu:16.04
WORKDIR /
RUN apt-get update \
    && apt-get install -y wget \
    && rm -rf /var/lib/apt/lists/*
RUN wget https://github.com/Psy-Fer/interARTIC/releases/download/v0.3/interartic-v0.3-linux-x86-64-binaries.tar.gz -O interartic_bin.tar.gz
RUN tar xf interartic_bin.tar.gz
WORKDIR /interartic_bin
CMD ./run.sh

But when I run this Docker image:

sudo docker run getting-started

I get this message:

bash: cannot set terminal process group (1): Inappropriate ioctl for device
bash: no job control in this shell
Starting redis server on port 7777. Log location: /interartic_bin/redis.log

Killing all processes.
Launching redis server on port 7777 failed. See /interartic_bin/redis.log
Starting interartic on 127.0.0.1:5000. Log location: /interartic_bin/interartic.log
Launching interartic on 127.0.0.1:5000 failed. See /interartic_bin/interartic.log

Killing all processes.
Starting celery. Log location: /interartic_bin/celery.log

Killing all processes.
Launching celery failed. See /interartic_bin/celery.log

InterARTIC is now running on your machine :)
To launch InterARTIC web interface visit http://127.0.0.1:5000 on your browser
To keep your InterARTIC active this terminal must remain open.
To terminate InterARTIC type CTRL-C or close the terminal.

hasindu2008 commented 3 years ago

Could you please send those three logs: /interartic_bin/redis.log, /interartic_bin/interartic.log and /interartic_bin/celery.log?

hasindu2008 commented 3 years ago

On my ARM Xavier I can get interARTIC installed as follows (without artic :D):

python3 -m venv interARTIC-venv
source interARTIC-venv/bin/activate
pip install --upgrade pip
pip install celery==4.4.6 redis==3.5.3 flask==1.1.2 pandas

git clone https://github.com/tthnguyen11/interARTIC.git
cd interARTIC 
./run-redis.sh 7777

However, installing artic seems to be a nightmare.

hasindu2008 commented 3 years ago

@hanfan1803 Could you see if you can get the following installed using the miniforge you mentioned?

artic-porechop==0.3.2pre artic-tools==0.2.6 longshot=0.4.1 medaka=1.0.3 multiqc muscle=3.8

hanfan1803 commented 3 years ago

> Could you please send those three logs /interartic_bin/redis.log, /interartic_bin/interartic.log and /interartic_bin/celery.log

I do not know how to get the generated files after running the Docker image. Do you know?

hasindu2008 commented 3 years ago

I haven't used Docker much before, but there should be a way to pull files out of the container. By the way, I hope you are using docker-multiarch. The binaries provided are for x86, so to run them on ARM through Docker you need docker-multiarch, which integrates qemu to emulate x86. Even if it works, it will be very slow, I believe.
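
One way to pull the three logs out is `docker cp`, which copies files from a container's filesystem to the host and also works on a stopped container. The sketch below only prints the commands so they can be reviewed before running them. The container name `interartic` is a placeholder (the real name or ID is shown by `docker ps -a`), and `print_log_copy_cmds` is a hypothetical helper; the log paths are the ones printed by run.sh earlier in this thread.

```shell
# Print the `docker cp` commands that would copy the three interARTIC logs
# from a container into the current directory. Nothing is executed here.
# Assumption: "$1" is the container name or ID (placeholder: interartic).
print_log_copy_cmds() {
    local container="$1" name
    for name in redis interartic celery; do
        printf 'docker cp %s:/interartic_bin/%s.log .\n' "$container" "$name"
    done
}

# Example (hypothetical container name); pipe to `sh` to actually run it:
#   print_log_copy_cmds interartic
```

Printing rather than executing keeps the sketch safe to try on a machine where the container name is not yet known.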

hasindu2008 commented 3 years ago

Anyway, using miniforge-pypy you can get the following dependencies of artic:

conda create -n artic-ncov2019 -c conda-forge -c bioconda python=3.6
conda activate artic-ncov2019

pip install clint==0.5.1
pip install biopython==1.76
pip install pytest
pip install tqdm
pip install pyvcf==0.6.8
pip install pysam==0.16.0.1

You can compile minimap2 and samtools as follows:

wget https://github.com/lh3/minimap2/releases/download/v2.17/minimap2-2.17.tar.bz2
tar xf minimap2-2.17.tar.bz2
cd minimap2-2.17/
make arm_neon=1 aarch64=1
cd ..

wget https://github.com/samtools/htslib/releases/download/1.10.2/htslib-1.10.2.tar.bz2
tar xf htslib-1.10.2.tar.bz2
cd htslib-1.10.2/
./configure
make -j8
cd ..

wget https://github.com/samtools/samtools/releases/download/1.10/samtools-1.10.tar.bz2
tar xf samtools-1.10.tar.bz2
cd samtools-1.10/
./configure --without-curses
make -j8
cd ..

wget https://github.com/samtools/bcftools/releases/download/1.10.2/bcftools-1.10.2.tar.bz2
tar xf bcftools-1.10.2.tar.bz2
cd bcftools-1.10.2/
./configure
make -j8

Nanopolish and bwa should work with similar steps.
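
The download, unpack and build steps above follow one pattern, so they can be wrapped in a small helper. This is only a sketch: `tarball_dir` and `build_from_tarball` are hypothetical names, and it assumes each archive unpacks into a directory named after the file, which matches the htslib, samtools and bcftools releases used above. minimap2 needs its own make arguments (arm_neon=1 aarch64=1), so it is not covered by this helper.

```shell
# Derive the directory a release tarball unpacks into, e.g.
# ".../samtools-1.10.tar.bz2" -> "samtools-1.10".
tarball_dir() {
    local file="${1##*/}"              # strip the URL path
    printf '%s\n' "${file%.tar.*}"     # strip .tar.bz2 / .tar.gz
}

# Download, unpack and build one tool. Extra arguments are passed to
# ./configure. The build runs in a subshell, so the caller's working
# directory is unchanged afterwards.
build_from_tarball() {
    local url="$1"
    shift
    wget -nc "$url" || return 1
    tar xf "${url##*/}" || return 1
    (
        cd "$(tarball_dir "$url")" || exit 1
        if [ -x ./configure ]; then
            ./configure "$@" || exit 1
        fi
        make -j8
    )
}

# Example:
#   build_from_tarball https://github.com/samtools/samtools/releases/download/1.10/samtools-1.10.tar.bz2 --without-curses
```

Running the build in a subshell avoids the easy mistake of forgetting to cd back out before the next download.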

This is what is left to be figured out:

  - artic-porechop==0.3.2pre
  - artic-tools==0.2.6
  - longshot=0.4.1
  - medaka=1.0.3
  - multiqc
  - muscle=3.8
  - pandas=0.23.0

hanfan1803 commented 3 years ago

I ran into an issue at the cargo install --path . step. It said that I should compile rust-htslib v0.26.1 (https://github.com/rust-bio/rust-htslib/releases/tag/v0.26.1).

I don't know how to install rust-htslib with Cargo.toml.

hasindu2008 commented 3 years ago

Perhaps, you should try to get the nanopolish pipeline running first. That does not need longshot and medaka I believe.

@Psy-Fer is multiqc really used inside artic?

By the way getting muscle compiled is also easy:

mkdir muscle && cd muscle
wget http://www.drive5.com/muscle/muscle_src_3.8.1551.tar.gz
tar xf muscle_src_3.8.1551.tar.gz
make -j8

hanfan1803 commented 3 years ago

I will try the nanopolish pipeline first, but the GPU on the Xavier is its advantage; if I could use medaka, I guess I could shorten the processing time.

hasindu2008 commented 3 years ago

@hanfan1803 Medaka in its variant calling mode does not use the GPU anyway I think. @Psy-Fer can confirm this.

hanfan1803 commented 3 years ago

Regarding bwa 0.7.17, I followed the instructions but got errors:

gcc -c -g -Wall -Wno-unused-function -O2 -DHAVE_PTHREAD -DUSE_MALLOC_WRAPPERS ksw.c -o ksw.o
ksw.c:29:10: fatal error: emmintrin.h: No such file or directory
 #include <emmintrin.h>
          ^~~~~~~~~~~~~
compilation terminated.
Makefile:25: recipe for target 'ksw.o' failed
make: *** [ksw.o] Error 1

hasindu2008 commented 3 years ago

Ahh, right. You need to get sse2neon from https://github.com/lh3/minimap2 and put it inside the bwa directory, then add something like this to the Makefile: https://github.com/lh3/minimap2/blob/52dbd439bcd6bb08b428aed1c0f9b778af04d78e/Makefile#L11
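
Before patching the Makefile, it can help to see which source files actually pull in the x86 SSE headers, since those are the ones that need sse2neon on ARM. The sketch below is an illustration only: `find_sse_includes` is a hypothetical helper, and the only file this thread confirms as affected is ksw.c.

```shell
# List C sources and headers under a directory that include x86 SSE
# intrinsics headers (xmmintrin.h, emmintrin.h, pmmintrin.h, tmmintrin.h,
# smmintrin.h, nmmintrin.h). Defaults to the current directory.
find_sse_includes() {
    grep -rlE --include='*.c' --include='*.h' \
        '#[[:space:]]*include[[:space:]]*<[xeptsn]mmintrin\.h>' "${1:-.}" | sort
}

# Example, run from the bwa source directory:
#   find_sse_includes .
```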

I think artic uses minimap2 by default, so you would not need BWA perhaps. For the nanopolish pipeline, the log was like this:

nanopolish index -s /mnt/d/genome/data/ebola/ebola/20190830_1509_MN22126_AAQ411_9efc5448/sequencing_summary.txt -d /mnt/d/genome/data/ebola/ebola/20190830_1509_MN22126_AAQ411_9efc5448/fast5_pass ./ebola-nanopolish_fastq_pass-NB04.fastq 5.508542800000214
minimap2 -a -x map-ont -t 4 /mnt/c/Users/hasindu/Desktop/interartic_bin/primer-schemes/artic/IturiEBOV/V1/IturiEBOV.reference.fasta ./ebola-nanopolish_fastq_pass-NB04.fastq | samtools view -bS -F 4 - | samtools sort -o ebola-nanopolish_ebola_02_NB04.sorted.bam -  0.6813094999997702
samtools index ebola-nanopolish_ebola_02_NB04.sorted.bam    0.06116240000028483
align_trim --normalise 200 /mnt/c/Users/hasindu/Desktop/interartic_bin/primer-schemes/artic/IturiEBOV/V1/IturiEBOV.scheme.bed --start --remove-incorrect-pairs --report ebola-nanopolish_ebola_02_NB04.alignreport.txt < ebola-nanopolish_ebola_02_NB04.sorted.bam 2> ebola-nanopolish_ebola_02_NB04.alignreport.er | samtools sort -T ebola-nanopolish_ebola_02_NB04 - -o ebola-nanopolish_ebola_02_NB04.trimmed.rg.sorted.bam 1.6108358000001317
align_trim --normalise 200 /mnt/c/Users/hasindu/Desktop/interartic_bin/primer-schemes/artic/IturiEBOV/V1/IturiEBOV.scheme.bed --remove-incorrect-pairs --report ebola-nanopolish_ebola_02_NB04.alignreport.txt < ebola-nanopolish_ebola_02_NB04.sorted.bam 2> ebola-nanopolish_ebola_02_NB04.alignreport.er | samtools sort -T ebola-nanopolish_ebola_02_NB04 - -o ebola-nanopolish_ebola_02_NB04.primertrimmed.rg.sorted.bam   1.6200902000000497
samtools index ebola-nanopolish_ebola_02_NB04.trimmed.rg.sorted.bam 0.05613660000017262
samtools index ebola-nanopolish_ebola_02_NB04.primertrimmed.rg.sorted.bam   0.053556400000161375
nanopolish variants --min-flanking-sequence 10 -x 1000000 --progress -t 4 --reads ./ebola-nanopolish_fastq_pass-NB04.fastq -o ebola-nanopolish_ebola_02_NB04.Ebov-DRC_2.vcf -b ebola-nanopolish_ebola_02_NB04.trimmed.rg.sorted.bam -g /mnt/c/Users/hasindu/Desktop/interartic_bin/primer-schemes/artic/IturiEBOV/V1/IturiEBOV.reference.fasta -w "BTB20484:1-18954" --ploidy 1 -m 0.15 --read-group Ebov-DRC_2     11.021166200000152
nanopolish variants --min-flanking-sequence 10 -x 1000000 --progress -t 4 --reads ./ebola-nanopolish_fastq_pass-NB04.fastq -o ebola-nanopolish_ebola_02_NB04.Ebov-DRC_1.vcf -b ebola-nanopolish_ebola_02_NB04.trimmed.rg.sorted.bam -g /mnt/c/Users/hasindu/Desktop/interartic_bin/primer-schemes/artic/IturiEBOV/V1/IturiEBOV.reference.fasta -w "BTB20484:1-18954" --ploidy 1 -m 0.15 --read-group Ebov-DRC_1     11.133258400000159
artic_vcf_merge ebola-nanopolish_ebola_02_NB04 /mnt/c/Users/hasindu/Desktop/interartic_bin/primer-schemes/artic/IturiEBOV/V1/IturiEBOV.scheme.bed 2> ebola-nanopolish_ebola_02_NB04.primersitereport.txt Ebov-DRC_2:ebola-nanopolish_ebola_02_NB04.Ebov-DRC_2.vcf Ebov-DRC_1:ebola-nanopolish_ebola_02_NB04.Ebov-DRC_1.vcf  1.1032102999997733
artic_vcf_filter --nanopolish ebola-nanopolish_ebola_02_NB04.merged.vcf ebola-nanopolish_ebola_02_NB04.pass.vcf ebola-nanopolish_ebola_02_NB04.fail.vcf 1.0789531000000352
bgzip -f ebola-nanopolish_ebola_02_NB04.pass.vcf    0.03394219999972847
tabix -p vcf ebola-nanopolish_ebola_02_NB04.pass.vcf.gz 0.02952839999943535
artic_make_depth_mask --store-rg-depths /mnt/c/Users/hasindu/Desktop/interartic_bin/primer-schemes/artic/IturiEBOV/V1/IturiEBOV.reference.fasta ebola-nanopolish_ebola_02_NB04.primertrimmed.rg.sorted.bam ebola-nanopolish_ebola_02_NB04.coverage_mask.txt 3.00662310000007
artic_mask /mnt/c/Users/hasindu/Desktop/interartic_bin/primer-schemes/artic/IturiEBOV/V1/IturiEBOV.reference.fasta ebola-nanopolish_ebola_02_NB04.coverage_mask.txt ebola-nanopolish_ebola_02_NB04.fail.vcf ebola-nanopolish_ebola_02_NB04.preconsensus.fasta   1.1975783999996565
bcftools consensus -f ebola-nanopolish_ebola_02_NB04.preconsensus.fasta ebola-nanopolish_ebola_02_NB04.pass.vcf.gz -m ebola-nanopolish_ebola_02_NB04.coverage_mask.txt -o ebola-nanopolish_ebola_02_NB04.consensus.fasta    0.05268450000039593
artic_fasta_header ebola-nanopolish_ebola_02_NB04.consensus.fasta "ebola-nanopolish_ebola_02_NB04/ARTIC/nanopolish" 0.430017399999997
cat ebola-nanopolish_ebola_02_NB04.consensus.fasta /mnt/c/Users/hasindu/Desktop/interartic_bin/primer-schemes/artic/IturiEBOV/V1/IturiEBOV.reference.fasta > ebola-nanopolish_ebola_02_NB04.muscle.in.fasta 0.025448299999879964
muscle -in ebola-nanopolish_ebola_02_NB04.muscle.in.fasta -out ebola-nanopolish_ebola_02_NB04.muscle.out.fasta  4.533352500000547

hanfan1803 commented 3 years ago

> On my ARM Xavier I can get the interARTIC installed as follows (without artic :D)
>
> python3 -m venv interARTIC-venv
> source interARTIC-venv/bin/activate
> pip install --upgrade pip
> pip install celery==4.4.6 redis==3.5.3 flask==1.1.2 pandas
>
> git clone https://github.com/tthnguyen11/interARTIC.git
> cd interARTIC
> ./run-redis.sh 7777
>
> However, installing artic seems to be a nightmare.

I got this message after the interARTIC installation:

Hint: It's a good idea to run 'make test' ;)

make[1]: Leaving directory '/home/nx-1/interARTIC/redis-6.0.12/src'
31093:C 03 Jul 2021 21:45:56.507 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
31093:C 03 Jul 2021 21:45:56.507 # Redis version=6.0.12, bits=64, commit=c9ee80fc, modified=0, pid=31093, just started
31093:C 03 Jul 2021 21:45:56.508 # Configuration loaded
31093:M 03 Jul 2021 21:45:56.512 # You requested maxclients of 10000 requiring at least 10032 max file descriptors.
31093:M 03 Jul 2021 21:45:56.512 # Server can't set maximum open files to 10032 because of OS error: Operation not permitted.
31093:M 03 Jul 2021 21:45:56.512 # Current maximum open files is 4096. maxclients has been reduced to 4064 to compensate for low ulimit. If you need higher maxclients increase 'ulimit -n'.
[Redis startup banner: Redis 6.0.12 (c9ee80fc/0) 64 bit, running in standalone mode, Port: 7777, PID: 31093, http://redis.io]

31093:M 03 Jul 2021 21:45:56.513 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
31093:M 03 Jul 2021 21:45:56.513 # Server initialized
31093:M 03 Jul 2021 21:45:56.513 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
31093:M 03 Jul 2021 21:45:56.513 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo madvise > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled (set to 'madvise' or 'never').
31093:M 03 Jul 2021 21:45:56.519 * Ready to accept connections

How can I use the interARTIC web interface on port 5000?

hasindu2008 commented 3 years ago

@hanfan1803 Ohh, I forgot. Press Control+C to first kill redis, then:

pkill redis   # to make sure redis is killed
./run-dev.sh

hasindu2008 commented 3 years ago

BTW, I managed to get the necessary things for the artic nanopolish pipeline compiled and made a draft snakeball. I do not have a GUI on my xavier AGX to do testing. Could you check on yours?

https://www.dropbox.com/s/igk9yki081o3uqu/interartic_bin_aarch64_alpha.tar.gz?dl=0

Extract the tarball, cd into it, then run ./run.sh

hanfan1803 commented 3 years ago

Yeah, thanks

One more thing, I followed the nanopolish instructions but got this message:

tar -xzf hdf5-1.8.14.tar.gz || exit 255
URL transformed to HTTPS due to an HSTS policy
--2021-07-03 19:43:15-- https://bitbucket.org/eigen/eigen/get/3.3.7.tar.bz2
Resolving bitbucket.org (bitbucket.org)... 2406:da00:ff00::22c3:9b0a, 2406:da00:ff00::22c2:513, 2406:da00:ff00::6b17:d1f5, ...
Connecting to bitbucket.org (bitbucket.org)|2406:da00:ff00::22c3:9b0a|:443...
In file included from src/main/nanopolish.cpp:13:0:
./src/nanopolish_extract.h:12:10: fatal error: fast5.hpp: No such file or directory
 #include <fast5.hpp>
          ^~~~~~~~~~~
compilation terminated.
Makefile:155: recipe for target 'src/main/nanopolish.o' failed
make: *** [src/main/nanopolish.o] Error 1
make: *** Waiting for unfinished jobs....
src/nanopolish_vcf2fasta.cpp:24:10: fatal error: fast5.hpp: No such file or directory
 #include <fast5.hpp>
          ^~~~~~~~~~~
compilation terminated.
Makefile:155: recipe for target 'src/nanopolish_vcf2fasta.o' failed
make: *** [src/nanopolish_vcf2fasta.o] Error 1

gzip: stdin: unexpected end of file
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
Makefile:114: recipe for target 'lib/libhdf5.a' failed
make: *** [lib/libhdf5.a] Error 255
connected.
HTTP request sent, awaiting response... 404 Not Found
2021-07-03 19:43:16 ERROR 404: Not Found.

Makefile:126: recipe for target 'eigen/INSTALL' failed
make: *** [eigen/INSTALL] Error 8

Should I manually install its dependencies, then turn off the automatic installation of dependencies?

hanfan1803 commented 3 years ago

> BTW, I managed to get the necessary things for the artic nanopolish pipeline compiled and made a draft snakeball. I do not have a GUI on my xavier AGX to do testing. Could you check on yours?
>
> https://www.dropbox.com/s/igk9yki081o3uqu/interartic_bin_aarch64_alpha.tar.gz?dl=0
>
> extract the tarball and cd to it then ./run.sh

I will copy this into interARTIC then try to run the web interface :D

hasindu2008 commented 3 years ago

> Yeah, thanks
>
> One more thing, I followed nanopolish instruction but got this message:
>
> tar -xzf hdf5-1.8.14.tar.gz || exit 255
> URL transformed to HTTPS due to an HSTS policy
> --2021-07-03 19:43:15-- https://bitbucket.org/eigen/eigen/get/3.3.7.tar.bz2
> Resolving bitbucket.org (bitbucket.org)... 2406:da00:ff00::22c3:9b0a, 2406:da00:ff00::22c2:513, 2406:da00:ff00::6b17:d1f5, ...
> Connecting to bitbucket.org (bitbucket.org)|2406:da00:ff00::22c3:9b0a|:443...
> In file included from src/main/nanopolish.cpp:13:0:
> ./src/nanopolish_extract.h:12:10: fatal error: fast5.hpp: No such file or directory
>  #include <fast5.hpp>
>           ^~~~~~~~~~~
> compilation terminated.
> Makefile:155: recipe for target 'src/main/nanopolish.o' failed
> make: *** [src/main/nanopolish.o] Error 1
> make: *** Waiting for unfinished jobs....
> src/nanopolish_vcf2fasta.cpp:24:10: fatal error: fast5.hpp: No such file or directory
>  #include <fast5.hpp>
>           ^~~~~~~~~~~
> compilation terminated.
> Makefile:155: recipe for target 'src/nanopolish_vcf2fasta.o' failed
> make: *** [src/nanopolish_vcf2fasta.o] Error 1
>
> gzip: stdin: unexpected end of file
> tar: Unexpected EOF in archive
> tar: Unexpected EOF in archive
> tar: Error is not recoverable: exiting now
> Makefile:114: recipe for target 'lib/libhdf5.a' failed
> make: *** [lib/libhdf5.a] Error 255
> connected.
> HTTP request sent, awaiting response... 404 Not Found
> 2021-07-03 19:43:16 ERROR 404: Not Found.
>
> Makefile:126: recipe for target 'eigen/INSTALL' failed
> make: *** [eigen/INSTALL] Error 8
>
> Should I manually install its dependencies, then turn of autoinstallation of dependencies?

Did you do a git clone --recursive for nanopolish?

That eigen link inside nanopolish is broken. I had to manually download eigen-3.3.7 and hdf5-1.10.4 when compiling nanopolish.
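
The "unexpected end of file" errors in the log above are what tar reports when an archive is incomplete, so a quick integrity check on a manually downloaded archive can save a failed build. A minimal sketch, assuming a gzip-compressed tarball; `tarball_ok` is a hypothetical helper name:

```shell
# Succeed only if the gzip-compressed tarball can be listed end to end.
# A truncated download, or an error page saved under the archive's name,
# fails this check.
tarball_ok() {
    tar -tzf "$1" > /dev/null 2>&1
}

# Example:
#   tarball_ok hdf5-1.8.14.tar.gz || echo "archive is incomplete, download it again"
```

Listing the archive reads it from start to finish without unpacking anything, so the check has no side effects on the build directory.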

hasindu2008 commented 3 years ago

> > BTW, I managed to get the necessary things for the artic nanopolish pipeline compiled and made a draft snakeball. I do not have a GUI on my xavier AGX to do testing. Could you check on yours? https://www.dropbox.com/s/igk9yki081o3uqu/interartic_bin_aarch64_alpha.tar.gz?dl=0 extract the tarball and cd to it then ./run.sh
>
> I will copy this into interARTIC then try to run web-interface :D

The tarball includes both interartic and artic nanopolish, so you do not need to copy it into your own. Just extract and run.

hanfan1803 commented 3 years ago

@hasindu2008 your interartic aarch64 alpha version is working :D [Screenshot from 2021-07-03 22-20-20]

I am running a test with your sample data (nCoV-19 sample with 10 barcodes).

hanfan1803 commented 3 years ago

@hasindu2008 the aarch64 alpha version works just fine for getting consensus sequences. The alpha version seems to lack matplotlib, so at the end of the interARTIC pipeline the software cannot plot graphs. I have attached the output log file: all_cmds_log.txt

P.S. artic has just released the V4 primers for SARS-CoV-2. Will interARTIC add them to the pipeline? They do change some formats of the input files (e.g. references, etc.).

hasindu2008 commented 3 years ago

@hanfan1803 Looks like it has progressed a lot further than I imagined, without a crash. Approximately how much time did it take? Tomorrow I will create a snakeball with matplotlib.

If you ever get to know the steps to install medaka and longshot let me know. For now I will stick to nanopolish pipeline.

hanfan1803 commented 3 years ago

@hasindu2008 It took maybe one hour (I will run it one more time and set a timer :D).

@miles_benton (https://github.com/sirselim) confirmed that medaka currently cannot work on ARM, and he will soon publish a tool that works like medaka but is dedicated to ARM.

Psy-Fer commented 3 years ago

Just to chime in for a moment.

The medaka variant calling stuff only uses CPU. MultiQC is NOT used. They added it in for people to run their own QC at the end. (we don't use that)

I'm glad this is coming along :)

hasindu2008 commented 3 years ago

> It took maybe one hour (I will run it one more time and set a timer :D).

An hour is really good, given that my laptop took about 30 minutes for the same sample. Yes, medaka is too bulky, and I think it is not worth putting in the effort as nanopolish does the job quite well (in the benchmarks I did on my laptop, the time taken is comparable).

@sirselim was it the tensorflow that caused issues when you tried medaka?

hanfan1803 commented 3 years ago

@hasindu2008

The nanopolish pipeline actually takes 1h04m with the sample data.

In the demultiplexreport.txt file, there are only 24 barcodes in the search set. What will happen if I use a 96-barcode kit, or custom barcoding kits?

I would like to use interARTIC to analyse 7-10 antibiotic resistance genes of TB with a barcoding kit (12, 24 or 96). If I do so, I will treat each gene as a virus sample in interARTIC, so I need to be able to customise both the references.fasta files and the primer schemes. After that, I would run interARTIC 7-10 times to get the corresponding consensus sequences of the target genes. Does interARTIC support custom references.fasta files and primer schemes? If yes, how can I configure/prepare these files in the correct interARTIC format?

hasindu2008 commented 3 years ago

> @hasindu2008
>
> The nanopolish pipeline actually takes 1h04m with the sample data.
>
> In the demultiplexreport.txt file, there are only 24 barcodes in the search set. What will happen if I use a 96-barcode kit, or custom barcoding kits?
>
> I would like to use interARTIC to analyse 7-10 antibiotic resistance genes of TB with a barcoding kit (12, 24 or 96). If I do so, I will treat each gene as a virus sample in interARTIC, so I need to be able to customise both the references.fasta files and the primer schemes. After that, I would run interARTIC 7-10 times to get the corresponding consensus sequences of the target genes. Does interARTIC support custom references.fasta files and primer schemes? If yes, how can I configure/prepare these files in the correct interARTIC format?

I think @Psy-Fer will be the one to answer those. In summary, yes custom references and primers are supported. @Psy-Fer will point you to the instructions.

hasindu2008 commented 3 years ago

@hanfan1803 Would you mind checking if plotting works in this latest build I did? https://www.dropbox.com/s/4h0a335bvg755im/interartic_bin_aarch64_beta.tar.gz?dl=0

hanfan1803 commented 3 years ago

@hasindu2008 The beta version ran without any bugs or crashes, and the plotting works.

Thanks

Psy-Fer commented 3 years ago

> In the demultiplexreport.txt file, there are only 24 barcodes in the search set. What will happen if I use a 96-barcode kit, or custom barcoding kits?

The demultiplexing is limited only by what the installed porechop can manage, currently:

If there is a set it can't handle, then we will need to update that in the porechop library. The other option is to reconfigure the workflow to take data already demultiplexed by MinKNOW/Guppy, but that would require quite a bit of work to implement.

So it can't handle the new 96 barcoding kits (rapid/native), and we would have to do something to fix that (unless the artic guys have already patched porechop, though I doubt it given they have moved to guppy_barcoder as their default, but this is closed source, which isn't great for open source software like this).

> I would like to use interARTIC to analyse 7-10 antibiotic resistance genes of TB with a barcoding kit (12, 24 or 96). If I do so, I will treat each gene as a virus sample in interARTIC, so I need to be able to customise both the references.fasta files and the primer schemes. After that, I would run interARTIC 7-10 times to get the corresponding consensus sequences of the target genes. Does interARTIC support custom references.fasta files and primer schemes? If yes, how can I configure/prepare these files in the correct interARTIC format?

Yes, InterARTIC does allow for custom primer schemes/references. Build your scheme in the same way these ones are built https://github.com/Psy-Fer/interARTIC/tree/master/primer-schemes with your reference files.

Then in the parameter setup, where you select the virus you wish to analyse, select "custom", and give the path, name/version in the relevant fields, and as long as it's compatible with the artic pipeline, it should work.

So if you wanted to use the midnight primer scheme in the custom field (for example), you would give it these two values:

${HOME}/interartic_bin/primer-schemes/midnight (this sets the type)
nCoV-2019/V1 (this sets the virus and version)

So you could set it up something like

${HOME}/interartic_bin/primer-schemes/mynewvirus

TB/V1
TB/V2
TB/V3
TB/V4
...

where each version holds the primer scheme and reference for your particular target. The .fai file can be ignored, as it gets created when the alignments are done, unless you won't have write access to the folder containing the reference, in which case you can create the index yourself beforehand (for example with samtools faidx).

Try to stick as close to what has already been made for the other schemes as you can, and you should be fine.
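
A quick way to sanity-check a custom scheme folder before selecting "custom" in the interface is sketched below. It assumes the file naming seen in the pipeline log earlier in this thread (IturiEBOV/V1/IturiEBOV.reference.fasta and IturiEBOV.scheme.bed), that is, a reference and a scheme file named after the virus inside each version folder. `check_scheme` is a hypothetical helper, not part of interARTIC.

```shell
# Check that <scheme_dir>/<name>/<version>/ holds <name>.reference.fasta
# and <name>.scheme.bed. Prints each missing or empty file to stderr and
# returns non-zero if any is absent.
check_scheme() {
    local dir="$1/$2/$3" missing=0 f
    for f in "$2.reference.fasta" "$2.scheme.bed"; do
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example, using the layout described above:
#   check_scheme "${HOME}/interartic_bin/primer-schemes/mynewvirus" TB V1
```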

I hope that helps.

hasindu2008 commented 3 years ago

@hanfan1803

We just released interARTIC v0.4 with some improvements (mainly now it supports artic v4 primers and guppy demultiplexed data) - includes an aarch64 release as well now.

Also @Psy-Fer has added a nice guide on using custom primers and viruses at https://psy-fer.github.io/interARTIC/primers/

Give it a go on your Jetson NX and let us know how it goes.

hanfan1803 commented 3 years ago

@hasindu2008 @Psy-Fer thank you a lot for the new aarch64 update. I am ordering the artic v4 primers and will let you know about our experience with the new update soon.

Thanks

hanfan1803 commented 3 years ago

Hi @hasindu2008, I just tried interARTIC v4 with midnight primers, 12 barcodes and rapid ligation, and I got this issue. [image]

interartic.log all_cmds_log.txt

hasindu2008 commented 3 years ago

Ahh, it is Nanopore's latest vbz-compressed FAST5 files. Could you please first run the following script to install the vbz plugin: https://github.com/hasindu2008/slow5tools/blob/master/scripts/install-vbz.sh? Then launch run.sh.

hanfan1803 commented 3 years ago

I tried, but then:

[image]

hasindu2008 commented 3 years ago

Ohh, you downloaded the HTML page rather than the script from GitHub. If you are using wget or curl, use this link: https://raw.githubusercontent.com/hasindu2008/slow5tools/master/scripts/install-vbz.sh
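
This mix-up is easy to catch before running a downloaded script by checking whether the file starts like an HTML page. A minimal sketch; `looks_like_html` is a hypothetical helper, and it only inspects the first kilobyte of the file:

```shell
# Succeed if the first bytes of the file look like an HTML page rather
# than a shell script (which would normally start with a "#!" line).
looks_like_html() {
    head -c 1024 "$1" | grep -qiE '<!doctype html|<html'
}

# Example:
#   looks_like_html install-vbz.sh && echo "this is a web page, use the raw link instead"
```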

hanfan1803 commented 3 years ago

I installed the latest version of the vbz compression plugin from source and it works just fine.

Thanks for your suggestion.

hasindu2008 commented 3 years ago

Thank you, in the next release I will include this plugin inside interARTIC itself.

hasindu2008 commented 3 years ago

@hanfan1803 The latest 0.4.1 release has the vbz plugin built in. I will close this issue for now. Please reopen or open a new issue if you encounter problems or want to request new features. Thank you very much for testing.

hasindu2008 commented 2 years ago

@hanfan1803 Would it be alright if we include your name in the acknowledgement in our interARTIC manuscript for helping us test on ARM? If so could you provide your name and details?

hanfan1803 commented 2 years ago

Hi @hasindu2008, it's my honor.

My information is: Han N. Phan. NTT Hi-Tech Institute, Nguyen Tat Thanh University, Ho Chi Minh City, Vietnam. pnhan@ntt.edu.vn

I hope your group keeps supporting ARM devices in upcoming projects.

Your interARTIC for ARM has helped us a lot with sequencing COVID-19, and it will help with our future multiple antibiotic resistance sequencing project.