KennthShang / PhaGCN2.0


1. It runs slowly, how to accelerate. Can metagenomes be used as input? #7

Open TiAmoTYX opened 8 months ago

TiAmoTYX commented 8 months ago

input: !python run_Speed_up.py --contigs ComND.Sep_nonredundant.fasta

outputs:

folder pred exist... cleaning dictionary
Dictionary cleaned
Creating Diamond database...
diamond v0.9.14.115 | by Benjamin Buchfink <buchfink@gmail.com>
Licensed under the GNU AGPL https://www.gnu.org/licenses/agpl.txt
Check http://github.com/bbuchfink/diamond for updates.

CPU threads: 128

Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
Database file: database/ALL_protein.fasta
Opening the database file... [8.6e-05s]
Loading sequences... [1.03251s]
Masking sequences... [19.8524s]
Writing sequences... [0.135956s]
Loading sequences... [8e-06s]
Writing trailer... [0.003759s]
Closing the input file... [1e-05s]
Closing the database file... [1.9e-05s]
Processed 355277 sequences, 89213859 letters.
Total time = 21.0249s

Running Diamond...
diamond v0.9.14.115 | by Benjamin Buchfink <buchfink@gmail.com>
Licensed under the GNU AGPL https://www.gnu.org/licenses/agpl.txt
Check http://github.com/bbuchfink/diamond for updates.

CPU threads: 128

Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)

Target sequences to report alignments for: 25

Temporary directory: database Opening the database... [1.3e-05s] Opening the input file... [2.2e-05s] Opening the output file... [2.6e-05s] Loading query sequences... [0.463304s] Masking queries... [20.5479s] Building query seed set... [0.000739s] Algorithm: Double-indexed Building query histograms... [13.9358s] Allocating buffers... [0.004801s] Loading reference sequences... [0.296878s] Building reference histograms... [10.6063s] Allocating buffers... [0.00364s] Initializing temporary storage... [0.04697s] Processing query chunk 0, reference chunk 0, shape 0, index chunk 0. Building reference index... [2.77132s] Building query index... [2.70239s] Building seed filter... [0.192776s] Searching alignments... [194.589s] Processing query chunk 0, reference chunk 0, shape 0, index chunk 1. Building reference index... [1.98594s] Building query index... [1.94714s] Building seed filter... [0.135148s] Searching alignments... [180.56s] Processing query chunk 0, reference chunk 0, shape 0, index chunk 2. Building reference index... [2.16275s] Building query index... [2.58315s] Building seed filter... [0.242484s] Searching alignments... [177.161s] Processing query chunk 0, reference chunk 0, shape 0, index chunk 3. Building reference index... [1.82576s] Building query index... [1.79691s] Building seed filter... [0.135568s] Searching alignments... [178.369s] Processing query chunk 0, reference chunk 0, shape 1, index chunk 0. Building reference index... [1.7556s] Building query index... [1.77965s] Building seed filter... [0.140798s] Searching alignments... [158.161s] Processing query chunk 0, reference chunk 0, shape 1, index chunk 1. Building reference index... [1.96956s] Building query index... [2.29852s] Building seed filter... [0.239073s] Searching alignments... [160.846s] Processing query chunk 0, reference chunk 0, shape 1, index chunk 2. Building reference index... [2.95242s] Building query index... [2.85216s] Building seed filter... [0.206348s] Searching alignments... 
[158.888s] Processing query chunk 0, reference chunk 0, shape 1, index chunk 3. Building reference index... [2.5536s] Building query index... [2.54275s] Building seed filter... [0.141991s] Searching alignments... [157.112s] Processing query chunk 0, reference chunk 0, shape 2, index chunk 0. Building reference index... [2.41696s] Building query index... [1.78749s] Building seed filter... [0.139807s] Searching alignments... [166.157s] Processing query chunk 0, reference chunk 0, shape 2, index chunk 1. Building reference index... [2.36725s] Building query index... [3.19314s] Building seed filter... [0.230165s] Searching alignments... [168.983s] Processing query chunk 0, reference chunk 0, shape 2, index chunk 2. Building reference index... [2.08511s] Building query index... [2.02835s] Building seed filter... [0.13463s] Searching alignments... [166.332s] Processing query chunk 0, reference chunk 0, shape 2, index chunk 3. Building reference index... [2.54531s] Building query index... [2.73737s] Building seed filter... [0.19258s] Searching alignments... [163.168s] Processing query chunk 0, reference chunk 0, shape 3, index chunk 0. Building reference index... [1.88082s] Building query index... [1.84685s] Building seed filter... [0.13918s] Searching alignments... [163.837s] Processing query chunk 0, reference chunk 0, shape 3, index chunk 1. Building reference index... [2.73969s] Building query index... [1.9788s] Building seed filter... [0.145996s] Searching alignments... [158.893s] Processing query chunk 0, reference chunk 0, shape 3, index chunk 2. Building reference index... [2.10557s] Building query index... [3.11283s] Building seed filter... [0.208613s] Searching alignments... [159.014s] Processing query chunk 0, reference chunk 0, shape 3, index chunk 3. Building reference index... [2.51071s] Building query index... [1.84304s] Building seed filter... [0.138735s] Searching alignments... 
[157.097s] Processing query chunk 0, reference chunk 0, shape 4, index chunk 0. Building reference index... [1.77338s] Building query index... [1.77826s] Building seed filter... [0.138602s] Searching alignments... [161.503s] Processing query chunk 0, reference chunk 0, shape 4, index chunk 1. Building reference index... [2.79382s] Building query index... [2.34932s] Building seed filter... [0.142117s] Searching alignments... [156.45s] Processing query chunk 0, reference chunk 0, shape 4, index chunk 2. Building reference index... [1.99907s] Building query index... [1.99978s] Building seed filter... [0.138183s] Searching alignments... [155.676s] Processing query chunk 0, reference chunk 0, shape 4, index chunk 3. Building reference index... [2.63258s] Building query index... [2.4668s] Building seed filter... [0.19429s] Searching alignments... [157.919s] Processing query chunk 0, reference chunk 0, shape 5, index chunk 0. Building reference index... [1.98301s] Building query index... [1.78562s] Building seed filter... [0.141065s] Searching alignments... [157.891s] Processing query chunk 0, reference chunk 0, shape 5, index chunk 1. Building reference index... [1.91915s] Building query index... [2.90744s] Building seed filter... [0.191985s] Searching alignments... [159.187s] Processing query chunk 0, reference chunk 0, shape 5, index chunk 2. Building reference index... [2.89677s] Building query index... [2.06991s] Building seed filter... [0.135317s] Searching alignments... [155.446s] Processing query chunk 0, reference chunk 0, shape 5, index chunk 3. Building reference index... [1.76301s] Building query index... [1.76524s] Building seed filter... [0.137115s] Searching alignments... [156.355s] Processing query chunk 0, reference chunk 0, shape 6, index chunk 0. Building reference index... [2.70931s] Building query index... [2.44232s] Building seed filter... [0.210021s] Searching alignments... 
[153.192s] Processing query chunk 0, reference chunk 0, shape 6, index chunk 1. Building reference index... [2.81729s] Building query index... [2.81792s] Building seed filter... [0.220948s] Searching alignments... [153.437s] Processing query chunk 0, reference chunk 0, shape 6, index chunk 2. Building reference index... [2.04254s] Building query index... [1.99006s] Building seed filter... [0.139398s] Searching alignments... [152.918s] Processing query chunk 0, reference chunk 0, shape 6, index chunk 3. Building reference index... [1.75688s] Building query index... [1.75614s] Building seed filter... [0.13578s] Searching alignments... [152.378s] Processing query chunk 0, reference chunk 0, shape 7, index chunk 0. Building reference index... [1.77747s] Building query index... [1.76655s] Building seed filter... [0.14206s] Searching alignments... [159.664s] Processing query chunk 0, reference chunk 0, shape 7, index chunk 1. Building reference index... [3.05402s] Building query index... [2.73481s] Building seed filter... [0.135822s] Searching alignments... [157.53s] Processing query chunk 0, reference chunk 0, shape 7, index chunk 2. Building reference index... [2.03399s] Building query index... [1.98592s] Building seed filter... [0.139896s] Searching alignments... [163.199s] Processing query chunk 0, reference chunk 0, shape 7, index chunk 3. Building reference index... [2.51295s] Building query index... [2.24477s] Building seed filter... [0.14088s] Searching alignments... [159.357s] Processing query chunk 0, reference chunk 0, shape 8, index chunk 0. Building reference index... [1.83476s] Building query index... [1.83461s] Building seed filter... [0.143065s] Searching alignments... [167.139s] Processing query chunk 0, reference chunk 0, shape 8, index chunk 1. Building reference index... [1.99016s] Building query index... [1.97977s] Building seed filter... [0.144597s] Searching alignments... 
[163.77s] Processing query chunk 0, reference chunk 0, shape 8, index chunk 2. Building reference index... [2.99346s] Building query index... [2.8226s] Building seed filter... [0.138908s] Searching alignments... [161.696s] Processing query chunk 0, reference chunk 0, shape 8, index chunk 3. Building reference index... [2.26193s] Building query index... [2.70167s] Building seed filter... [0.237773s] Searching alignments... [163.447s] Processing query chunk 0, reference chunk 0, shape 9, index chunk 0. Building reference index... [1.76149s] Building query index... [1.78078s] Building seed filter... [0.138977s] Searching alignments... [151.89s] Processing query chunk 0, reference chunk 0, shape 9, index chunk 1. Building reference index... [1.91701s] Building query index... [1.89495s] Building seed filter... [0.135462s] Searching alignments... [152.142s] Processing query chunk 0, reference chunk 0, shape 9, index chunk 2. Building reference index... [1.95088s] Building query index... [2.86783s] Building seed filter... [0.183687s] Searching alignments... [150.635s] Processing query chunk 0, reference chunk 0, shape 9, index chunk 3. Building reference index... [2.64842s] Building query index... [2.52154s] Building seed filter... [0.193807s] Searching alignments... [152.376s] Processing query chunk 0, reference chunk 0, shape 10, index chunk 0. Building reference index... [2.63351s] Building query index... [2.4957s] Building seed filter... [0.188702s] Searching alignments... [160.252s] Processing query chunk 0, reference chunk 0, shape 10, index chunk 1. Building reference index... [1.9763s] Building query index... [1.97704s] Building seed filter... [0.136681s] Searching alignments... [162.237s] Processing query chunk 0, reference chunk 0, shape 10, index chunk 2. Building reference index... [2.75475s] Building query index... [2.48302s] Building seed filter... [0.138881s] Searching alignments... 
[159.801s] Processing query chunk 0, reference chunk 0, shape 10, index chunk 3. Building reference index... [1.77611s] Building query index... [2.73928s] Building seed filter... [0.210197s] Searching alignments... [164.105s] Processing query chunk 0, reference chunk 0, shape 11, index chunk 0. Building reference index... [1.76388s] Building query index... [1.75482s] Building seed filter... [0.135396s] Searching alignments... [158.67s] Processing query chunk 0, reference chunk 0, shape 11, index chunk 1. Building reference index... [3.03792s] Building query index... [2.68562s] Building seed filter... [0.190658s] Searching alignments... [155.332s] Processing query chunk 0, reference chunk 0, shape 11, index chunk 2. Building reference index... [2.71286s] Building query index... [2.23418s] Building seed filter... [0.135291s] Searching alignments... [153.041s] Processing query chunk 0, reference chunk 0, shape 11, index chunk 3. Building reference index... [1.72534s] Building query index... [1.77007s] Building seed filter... [0.14015s] Searching alignments... [154.226s] Processing query chunk 0, reference chunk 0, shape 12, index chunk 0. Building reference index... [1.84135s] Building query index... [1.84928s] Building seed filter... [0.149452s] Searching alignments... [156.875s] Processing query chunk 0, reference chunk 0, shape 12, index chunk 1. Building reference index... [2.80547s] Building query index... [2.65604s] Building seed filter... [0.191667s] Searching alignments... [156.386s] Processing query chunk 0, reference chunk 0, shape 12, index chunk 2. Building reference index... [1.9844s] Building query index... [2.00921s] Building seed filter... [0.139653s] Searching alignments... [154.626s] Processing query chunk 0, reference chunk 0, shape 12, index chunk 3. Building reference index... [1.7737s] Building query index... [1.77326s] Building seed filter... [0.141034s] Searching alignments... 
[155.39s] Processing query chunk 0, reference chunk 0, shape 13, index chunk 0. Building reference index... [2.5683s] Building query index... [2.62349s] Building seed filter... [0.211697s] Searching alignments...
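To see where the time goes in a log like the one above, the `Searching alignments... [N s]` timings can be summed. The following is a minimal sketch, not part of PhaGCN2.0 or DIAMOND; the function name `summarize_search_time` is hypothetical, and it assumes the log text has been captured as a string in the format shown in this post.

```python
import re

# Matches the timing DIAMOND prints after each "Searching alignments..." step,
# e.g. "Searching alignments... [194.589s]". Whitespace (including a line
# break) between the message and the bracketed time is tolerated.
SEARCH_TIME = re.compile(r"Searching alignments\.\.\.\s*\[([0-9.eE+-]+)s\]")


def summarize_search_time(log_text):
    """Return (number of completed search steps, total seconds spent in them).

    A trailing "Searching alignments..." with no timing yet (a step that is
    still running) is not counted.
    """
    times = [float(t) for t in SEARCH_TIME.findall(log_text)]
    return len(times), sum(times)


if __name__ == "__main__":
    # Short excerpt in the same format as the log above.
    sample = (
        "Building seed filter... [0.192776s] Searching alignments... [194.589s] "
        "Processing query chunk 0, reference chunk 0, shape 0, index chunk 1. "
        "Building seed filter... [0.135148s] Searching alignments... [180.56s] "
        "Building seed filter... [0.211697s] Searching alignments..."
    )
    steps, seconds = summarize_search_time(sample)
    print(f"{steps} completed search steps, {seconds:.1f} s in total")
```

Applied to the full log, this shows that nearly all of the elapsed time is spent in the alignment search steps rather than in index building.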

TiAmoTYX commented 8 months ago

Hi Wenguang,

My run is very slow. Is there any way to speed it up? The sequencing depth of my metagenomic data is not very high, and the selected viral metagenomes are relatively small. Can I use the metagenomic data directly as input?

yuanwenguang666 commented 2 months ago

Sorry for my slow response.

For the first question: you can speed up our program by running it with multiple threads. The specific method for multi-threaded runs is described in https://github.com/KennthShang/PhaGCN2.0/issues/3.

For the second question: PhaGCN2.0 cannot predict whether a sequence is a virus, so I recommend using a tool such as CheckV (https://anaconda.org/bioconda/checkv) to determine whether a sequence is a virus before running PhaGCN2.0.
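As an illustration of that pre-filtering step, here is a minimal sketch that keeps only the contigs an upstream virus identification tool has flagged. It is not part of PhaGCN2.0. The function names are hypothetical, and it assumes you already have a plain list of contig IDs judged to be viral; how that list is produced depends on the tool you choose.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def filter_viral_contigs(fasta_lines, viral_ids):
    """Keep records whose ID (first word of the header) is in viral_ids."""
    wanted = set(viral_ids)
    kept = []
    for header, seq in read_fasta(fasta_lines):
        words = header.split()
        if words and words[0] in wanted:
            kept.append((header, seq))
    return kept


def write_fasta(records, handle, width=60):
    """Write (header, sequence) pairs as FASTA, wrapping sequence lines."""
    for header, seq in records:
        handle.write(f">{header}\n")
        for i in range(0, len(seq), width):
            handle.write(seq[i:i + width] + "\n")


if __name__ == "__main__":
    import io

    # Toy input; in practice, open your metagenome FASTA and your ID list.
    fasta = io.StringIO(">c1 len=4\nACGT\n>c2\nGGTT\n>c3\nAAAA\n")
    out = io.StringIO()
    write_fasta(filter_viral_contigs(fasta, ["c1", "c3"]), out)
    print(out.getvalue(), end="")
```

The filtered FASTA can then be passed to the `--contigs` option. Filtering first also shrinks the input, which should shorten the DIAMOND step.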

Thank you for your question.

All the best, Wen-Guang