KennthShang / PhaGCN2.0


I input 1000 sequences (all longer than the length cutoff), but only 800 to 900 appear in "single_contig/" and final_prediction.csv #9

Closed actledge closed 2 months ago

actledge commented 4 months ago

For example, with 1000 input sequences, PhaGCN2.0 generated 861 sequences in the single_contig/ directory and 855 results in final_prediction.csv.

This is my command: `python3 run_Speed_up.py --contigs ../../merged_prophages/merged_1000prophage_ab --len 2000`

All sequences are longer than 2000 bp, and most of them (more than 950) are longer than 8000 bp. But only 861 were split into single_contig/, and only 855 have predicted families. I looked through the stdout, but I can't find the reason why some of the sequences were removed.
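One way to narrow this down is to compare record IDs directly instead of counts. The following is a minimal sketch, not part of PhaGCN2.0: `parse_fasta` and `audit` are hypothetical helper names, the ID is assumed to be the first word of each FASTA header, and whether the tool applies the cutoff as `>` or `>=` is not stated in this issue, so `>=` is an assumption here. It reports duplicate IDs, records under the cutoff, and records that pass the cutoff but are absent from a list of reported IDs.

```python
from collections import Counter


def parse_fasta(lines):
    """Yield (record_id, sequence) pairs; the ID is the first word of the header."""
    record_id, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if record_id is not None:
                yield record_id, "".join(chunks)
            # An empty header (">" alone) gets an empty ID instead of crashing.
            record_id, chunks = (line[1:].split() or [""])[0], []
        elif line and record_id is not None:
            chunks.append(line)
    if record_id is not None:
        yield record_id, "".join(chunks)


def audit(fasta_lines, reported_ids, min_len=2000):
    """Compare input records with the IDs a pipeline reported (hypothetical helper)."""
    records = list(parse_fasta(fasta_lines))
    id_counts = Counter(rid for rid, _ in records)
    # Assumption: a record passes when its length is >= min_len.
    long_enough = {rid for rid, seq in records if len(seq) >= min_len}
    reported = set(reported_ids)
    return {
        "total": len(records),
        "duplicate_ids": sorted(r for r, n in id_counts.items() if n > 1),
        "too_short": sorted(set(id_counts) - long_enough),
        "missing": sorted(long_enough - reported),
    }


# Tiny in-memory demo: "a" appears twice, "b" is too short, "c" was not reported.
demo = [">a desc", "ACGT" * 3, ">b", "AC", ">a", "ACGT" * 3, ">c", "ACGT" * 3]
report = audit(demo, reported_ids=["a"], min_len=10)
print(report)
```

On real data, `fasta_lines` could be an open file handle for the input contigs, and `reported_ids` would come from whichever column of final_prediction.csv holds the contig name (the column layout is not shown in this issue). Duplicate or renamed IDs are only one possible explanation, so an empty `duplicate_ids` list does not rule out other causes.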

Below is the stdout of PhaGCN2.0 for this run:

diamond v0.9.14.115 | by Benjamin Buchfink buchfink@gmail.com
Licensed under the GNU AGPL https://www.gnu.org/licenses/agpl.txt
Check http://github.com/bbuchfink/diamond for updates.

CPU threads: 128

Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
Database file: database/ALL_protein.fasta
Opening the database file... [0.01789s]
Loading sequences... [0.590148s]
Masking sequences... [2.51824s]
Writing sequences... [0.209732s]
Loading sequences... [3.3e-05s]
Writing trailer... [0.003901s]
Closing the input file... [0.000243s]
Closing the database file... [0.000204s]
Processed 355277 sequences, 89213859 letters.
Total time = 3.34066s

diamond v0.9.14.115 | by Benjamin Buchfink buchfink@gmail.com
Licensed under the GNU AGPL https://www.gnu.org/licenses/agpl.txt
Check http://github.com/bbuchfink/diamond for updates.

CPU threads: 128

Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)

Target sequences to report alignments for: 25

Temporary directory: database Opening the database... [0.000183s] Opening the input file... [0.000517s] Opening the output file... [0.000259s] Loading query sequences... [0.389211s] Masking queries... [2.51574s] Building query seed set... [0.00069s] Algorithm: Double-indexed Building query histograms... [1.09677s] Allocating buffers... [0.00427s] Loading reference sequences... [0.20779s] Building reference histograms... [1.09783s] Allocating buffers... [0.003992s] Initializing temporary storage... [0.722539s] Processing query chunk 0, reference chunk 0, shape 0, index chunk 0. Building reference index... [0.283269s] Building query index... [0.27484s] Building seed filter... [0.025209s] Searching alignments... [19.8627s] Processing query chunk 0, reference chunk 0, shape 0, index chunk 1. Building reference index... [0.311584s] Building query index... [0.294934s] Building seed filter... [0.024621s] Searching alignments... [16.8565s] Processing query chunk 0, reference chunk 0, shape 0, index chunk 2. Building reference index... [0.382433s] Building query index... [0.317564s] Building seed filter... [0.025139s] Searching alignments... [16.3031s] Processing query chunk 0, reference chunk 0, shape 0, index chunk 3. Building reference index... [0.274689s] Building query index... [0.265093s] Building seed filter... [0.026024s] Searching alignments... [15.9754s] Processing query chunk 0, reference chunk 0, shape 1, index chunk 0. Building reference index... [0.267498s] Building query index... [0.266023s] Building seed filter... [0.025164s] Searching alignments... [14.4003s] Processing query chunk 0, reference chunk 0, shape 1, index chunk 1. Building reference index... [0.298033s] Building query index... [0.292235s] Building seed filter... [0.024391s] Searching alignments... [14.6411s] Processing query chunk 0, reference chunk 0, shape 1, index chunk 2. Building reference index... [0.313034s] Building query index... [0.301072s] Building seed filter... 
[0.024871s] Searching alignments... [14.4521s] Processing query chunk 0, reference chunk 0, shape 1, index chunk 3. Building reference index... [0.262411s] Building query index... [0.256059s] Building seed filter... [0.024758s] Searching alignments... [14.3917s] Processing query chunk 0, reference chunk 0, shape 2, index chunk 0. Building reference index... [0.265218s] Building query index... [0.266968s] Building seed filter... [0.02445s] Searching alignments... [15.2328s] Processing query chunk 0, reference chunk 0, shape 2, index chunk 1. Building reference index... [0.302659s] Building query index... [0.300121s] Building seed filter... [0.026636s] Searching alignments... [15.3422s] Processing query chunk 0, reference chunk 0, shape 2, index chunk 2. Building reference index... [0.392863s] Building query index... [0.326037s] Building seed filter... [0.026308s] Searching alignments... [15.3762s] Processing query chunk 0, reference chunk 0, shape 2, index chunk 3. Building reference index... [0.285002s] Building query index... [0.309236s] Building seed filter... [0.029408s] Searching alignments... [15.3907s] Processing query chunk 0, reference chunk 0, shape 3, index chunk 0. Building reference index... [0.281797s] Building query index... [0.265508s] Building seed filter... [0.025928s] Searching alignments... [15.0185s] Processing query chunk 0, reference chunk 0, shape 3, index chunk 1. Building reference index... [0.305271s] Building query index... [0.297462s] Building seed filter... [0.024591s] Searching alignments... [15.0288s] Processing query chunk 0, reference chunk 0, shape 3, index chunk 2. Building reference index... [0.319434s] Building query index... [0.316996s] Building seed filter... [0.026815s] Searching alignments... [14.8484s] Processing query chunk 0, reference chunk 0, shape 3, index chunk 3. Building reference index... [0.274905s] Building query index... [0.260729s] Building seed filter... [0.025586s] Searching alignments... 
[15.0292s] Processing query chunk 0, reference chunk 0, shape 4, index chunk 0. Building reference index... [0.259619s] Building query index... [0.259s] Building seed filter... [0.024649s] Searching alignments... [14.8923s] Processing query chunk 0, reference chunk 0, shape 4, index chunk 1. Building reference index... [0.298625s] Building query index... [0.292324s] Building seed filter... [0.025175s] Searching alignments... [15.8937s] Processing query chunk 0, reference chunk 0, shape 4, index chunk 2. Building reference index... [0.319582s] Building query index... [0.308277s] Building seed filter... [0.025682s] Searching alignments... [14.6923s] Processing query chunk 0, reference chunk 0, shape 4, index chunk 3. Building reference index... [0.277406s] Building query index... [0.261891s] Building seed filter... [0.026275s] Searching alignments... [14.929s] Processing query chunk 0, reference chunk 0, shape 5, index chunk 0. Building reference index... [0.311551s] Building query index... [0.27793s] Building seed filter... [0.026442s] Searching alignments... [15.1107s] Processing query chunk 0, reference chunk 0, shape 5, index chunk 1. Building reference index... [0.304051s] Building query index... [0.299856s] Building seed filter... [0.025224s] Searching alignments... [14.9284s] Processing query chunk 0, reference chunk 0, shape 5, index chunk 2. Building reference index... [0.320753s] Building query index... [0.307826s] Building seed filter... [0.024642s] Searching alignments... [14.779s] Processing query chunk 0, reference chunk 0, shape 5, index chunk 3. Building reference index... [0.261098s] Building query index... [0.257086s] Building seed filter... [0.026662s] Searching alignments... [14.8119s] Processing query chunk 0, reference chunk 0, shape 6, index chunk 0. Building reference index... [0.262926s] Building query index... [0.263373s] Building seed filter... [0.025155s] Searching alignments... 
[14.5588s] Processing query chunk 0, reference chunk 0, shape 6, index chunk 1. Building reference index... [0.29295s] Building query index... [0.292022s] Building seed filter... [0.025948s] Searching alignments... [14.6158s] Processing query chunk 0, reference chunk 0, shape 6, index chunk 2. Building reference index... [0.311916s] Building query index... [0.308026s] Building seed filter... [0.025351s] Searching alignments... [14.5627s] Processing query chunk 0, reference chunk 0, shape 6, index chunk 3. Building reference index... [0.260355s] Building query index... [0.269366s] Building seed filter... [0.02624s] Searching alignments... [14.3568s] Processing query chunk 0, reference chunk 0, shape 7, index chunk 0. Building reference index... [0.263429s] Building query index... [0.259393s] Building seed filter... [0.023786s] Searching alignments... [14.8437s] Processing query chunk 0, reference chunk 0, shape 7, index chunk 1. Building reference index... [0.297029s] Building query index... [0.291271s] Building seed filter... [0.025577s] Searching alignments... [15.0007s] Processing query chunk 0, reference chunk 0, shape 7, index chunk 2. Building reference index... [0.315557s] Building query index... [0.309132s] Building seed filter... [0.025421s] Searching alignments... [14.7984s] Processing query chunk 0, reference chunk 0, shape 7, index chunk 3. Building reference index... [0.266881s] Building query index... [0.256021s] Building seed filter... [0.024408s] Searching alignments... [14.8613s] Processing query chunk 0, reference chunk 0, shape 8, index chunk 0. Building reference index... [0.262579s] Building query index... [0.257655s] Building seed filter... [0.025664s] Searching alignments... [15.2295s] Processing query chunk 0, reference chunk 0, shape 8, index chunk 1. Building reference index... [0.298764s] Building query index... [0.299266s] Building seed filter... [0.025966s] Searching alignments... 
[15.168s] Processing query chunk 0, reference chunk 0, shape 8, index chunk 2. Building reference index... [0.319456s] Building query index... [0.308729s] Building seed filter... [0.024608s] Searching alignments... [15.2676s] Processing query chunk 0, reference chunk 0, shape 8, index chunk 3. Building reference index... [0.270841s] Building query index... [0.264207s] Building seed filter... [0.024932s] Searching alignments... [15.3816s] Processing query chunk 0, reference chunk 0, shape 9, index chunk 0. Building reference index... [0.263932s] Building query index... [0.262825s] Building seed filter... [0.025301s] Searching alignments... [14.4513s] Processing query chunk 0, reference chunk 0, shape 9, index chunk 1. Building reference index... [0.301304s] Building query index... [0.291002s] Building seed filter... [0.025083s] Searching alignments... [14.5634s] Processing query chunk 0, reference chunk 0, shape 9, index chunk 2. Building reference index... [0.311886s] Building query index... [0.307125s] Building seed filter... [0.024389s] Searching alignments... [14.4054s] Processing query chunk 0, reference chunk 0, shape 9, index chunk 3. Building reference index... [0.26674s] Building query index... [0.256262s] Building seed filter... [0.02536s] Searching alignments... [14.492s] Processing query chunk 0, reference chunk 0, shape 10, index chunk 0. Building reference index... [0.265715s] Building query index... [0.261112s] Building seed filter... [0.025149s] Searching alignments... [15.1314s] Processing query chunk 0, reference chunk 0, shape 10, index chunk 1. Building reference index... [0.305626s] Building query index... [0.294904s] Building seed filter... [0.025342s] Searching alignments... [15.2309s] Processing query chunk 0, reference chunk 0, shape 10, index chunk 2. Building reference index... [0.320872s] Building query index... [0.316694s] Building seed filter... [0.024998s] Searching alignments... 
[15.1713s] Processing query chunk 0, reference chunk 0, shape 10, index chunk 3. Building reference index... [0.260531s] Building query index... [0.258487s] Building seed filter... [0.025103s] Searching alignments... [15.3755s] Processing query chunk 0, reference chunk 0, shape 11, index chunk 0. Building reference index... [0.26193s] Building query index... [0.284297s] Building seed filter... [0.025176s] Searching alignments... [14.6723s] Processing query chunk 0, reference chunk 0, shape 11, index chunk 1. Building reference index... [0.292889s] Building query index... [0.289912s] Building seed filter... [0.024455s] Searching alignments... [14.658s] Processing query chunk 0, reference chunk 0, shape 11, index chunk 2. Building reference index... [0.309579s] Building query index... [0.302704s] Building seed filter... [0.02483s] Searching alignments... [14.5503s] Processing query chunk 0, reference chunk 0, shape 11, index chunk 3. Building reference index... [0.260402s] Building query index... [0.254705s] Building seed filter... [0.022964s] Searching alignments... [14.6408s] Processing query chunk 0, reference chunk 0, shape 12, index chunk 0. Building reference index... [0.264545s] Building query index... [0.260019s] Building seed filter... [0.025533s] Searching alignments... [14.7112s] Processing query chunk 0, reference chunk 0, shape 12, index chunk 1. Building reference index... [0.296742s] Building query index... [0.294477s] Building seed filter... [0.024755s] Searching alignments... [14.8003s] Processing query chunk 0, reference chunk 0, shape 12, index chunk 2. Building reference index... [0.320262s] Building query index... [0.308086s] Building seed filter... [0.025445s] Searching alignments... [14.7036s] Processing query chunk 0, reference chunk 0, shape 12, index chunk 3. Building reference index... [0.26371s] Building query index... [0.254884s] Building seed filter... [0.030946s] Searching alignments... 
[14.6497s] Processing query chunk 0, reference chunk 0, shape 13, index chunk 0. Building reference index... [0.269167s] Building query index... [0.255584s] Building seed filter... [0.026428s] Searching alignments... [14.9061s] Processing query chunk 0, reference chunk 0, shape 13, index chunk 1. Building reference index... [0.300151s] Building query index... [0.30065s] Building seed filter... [0.024835s] Searching alignments... [15.0832s] Processing query chunk 0, reference chunk 0, shape 13, index chunk 2. Building reference index... [0.310774s] Building query index... [0.306496s] Building seed filter... [0.025767s] Searching alignments... [14.9718s] Processing query chunk 0, reference chunk 0, shape 13, index chunk 3. Building reference index... [0.267171s] Building query index... [0.255494s] Building seed filter... [0.025212s] Searching alignments... [14.8243s] Processing query chunk 0, reference chunk 0, shape 14, index chunk 0. Building reference index... [0.264856s] Building query index... [0.266763s] Building seed filter... [0.025114s] Searching alignments... [15.0865s] Processing query chunk 0, reference chunk 0, shape 14, index chunk 1. Building reference index... [0.303294s] Building query index... [0.29382s] Building seed filter... [0.023943s] Searching alignments... [15.2816s] Processing query chunk 0, reference chunk 0, shape 14, index chunk 2. Building reference index... [0.315379s] Building query index... [0.305534s] Building seed filter... [0.024536s] Searching alignments... [15.0456s] Processing query chunk 0, reference chunk 0, shape 14, index chunk 3. Building reference index... [0.256008s] Building query index... [0.251722s] Building seed filter... [0.023423s] Searching alignments... [15.2821s] Processing query chunk 0, reference chunk 0, shape 15, index chunk 0. Building reference index... [0.259813s] Building query index... [0.257348s] Building seed filter... [0.025391s] Searching alignments... 
[15.0535s] Processing query chunk 0, reference chunk 0, shape 15, index chunk 1. Building reference index... [0.299897s] Building query index... [0.287854s] Building seed filter... [0.02488s] Searching alignments... [14.5973s] Processing query chunk 0, reference chunk 0, shape 15, index chunk 2. Building reference index... [0.306047s] Building query index... [0.30063s] Building seed filter... [0.025421s] Searching alignments... [14.6525s] Processing query chunk 0, reference chunk 0, shape 15, index chunk 3. Building reference index... [0.258485s] Building query index... [0.258179s] Building seed filter... [0.024416s] Searching alignments... [14.9748s] Deallocating buffers... [0.000819s] Computing alignments... [134.476s] Deallocating reference... [0.001597s] Loading reference sequences... [3.8e-05s] Deallocating buffers... [0.001419s] Deallocating queries... [0.000802s] Loading query sequences... [3.1e-05s] Closing the input file... [0.00018s] Closing the output file... [0.000192s] Closing the database file... [0.000154s] Total time = 1141.65s Reported 6420544 pairwise alignments, 6431184 HSPs. 354845 queries aligned. rm: cannot remove ‘validation/’: No such file or directory rm: cannot remove ‘stride50_val/’: No such file or directory rm: cannot remove ‘int_val/’: No such file or directory rm: cannot remove ‘filtered_val/’: No such file or directory rm: cannot remove ‘dataset/’: No such file or directory rm: cannot remove ‘split_long_reads_val/’: No such file or directory /public1/home/t6s000092/.conda/envs/phagcn2/lib/python3.6/site-packages/torch/cuda/init.py:52: UserWarning: CUDA initialization: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx (Triggered internally at /opt/conda/conda-bld/pytorch_1607370116979/work/c10/cuda/CUDAFunctions.cpp:100.) 
return torch._C._cuda_getDeviceCount() > 0
Capturing compressed features
Running with cpu
.................................................. 1M
.................................................. 2M
.................................................. 3M
.................................................. 4M
.................................................. 5M
.................................................. 6M
................
[mclIO] writing <out/merged.mci>
.......................................
[mclIO] wrote native interchange 350184x350184 matrix with 8169360 entries to stream <out/merged.mci>
[mclIO] wrote 350184 tab entries to stream <out/merged_mcxload.tab>
[mcxload] tab has 350184 entries
[mclIO] reading <out/merged.mci>
.......................................
[mclIO] read native interchange 350184x350184 matrix with 8169360 entries
[mcl] pid 459784
ite ------------------- chaos time hom(avg,lo,hi) m-ie m-ex i-ex fmv
1 ................... 76.85 4.26 1.08/0.00/7.54 2.52 2.42 2.42 0
2 ................... 77.98 17.33 0.84/0.02/4.14 2.21 0.94 2.28 3
3 ................... 38.20 12.67 0.79/0.08/6.36 1.69 0.74 1.68 1
4 ................... 30.21 5.77 0.78/0.06/5.70 1.44 0.70 1.17 0
5 ................... 21.30 2.58 0.78/0.09/4.63 1.26 0.68 0.80 0
6 ................... 17.64 1.25 0.79/0.13/4.85 1.11 0.70 0.56 0
7 ................... 8.88 0.76 0.80/0.15/3.04 1.04 0.75 0.42 0
8 ................... 7.03 0.55 0.83/0.17/1.83 1.01 0.79 0.33 0
9 ................... 4.78 0.43 0.86/0.16/1.30 1.01 0.80 0.26 0
10 ................... 4.67 0.35 0.90/0.20/1.72 1.00 0.81 0.21 0
11 ................... 4.81 0.29 0.93/0.21/1.16 1.00 0.82 0.17 0
12 ................... 4.74 0.26 0.96/0.20/1.16 1.00 0.83 0.14 0
13 ................... 3.62 0.22 0.97/0.26/1.11 1.00 0.86 0.12 0
14 ................... 5.15 0.20 0.98/0.21/1.09 1.00 0.90 0.11 0
15 ................... 5.19 0.19 0.99/0.27/1.01 1.00 0.94 0.11 0
16 ................... 3.26 0.19 0.99/0.26/1.00 1.00 0.96 0.10 0
17 ................... 3.94 0.17 1.00/0.22/1.00 1.00 0.97 0.10 0
18 ................... 4.27 0.18 1.00/0.39/1.00 1.00 0.98 0.10 0
19 ................... 3.33 0.18 1.00/0.26/1.00 1.00 0.99 0.10 0
20 ................... 3.21 0.17 1.00/0.34/1.00 1.00 0.99 0.10 0
21 ................... 1.71 0.17 1.00/0.39/1.00 1.00 1.00 0.09 0
22 ................... 3.89 0.18 1.00/0.23/1.00 1.00 1.00 0.09 0
23 ................... 1.95 0.17 1.00/0.67/1.00 1.00 1.00 0.09 0
24 ................... 0.75 0.18 1.00/0.55/1.00 1.00 1.00 0.09 0
25 ................... 0.37 0.16 1.00/0.79/1.00 1.00 1.00 0.09 0
26 ................... 0.25 0.17 1.00/0.76/1.00 1.00 1.00 0.09 0
27 ................... 0.15 0.17 1.00/0.85/1.00 1.00 1.00 0.09 0
28 ................... 0.02 0.18 1.00/0.98/1.00 1.00 1.00 0.09 0
29 ................... 0.00 0.17 1.00/1.00/1.00 1.00 1.00 0.09 0
30 ................... 0.00 0.18 1.00/1.00/1.00 1.00 1.00 0.09 0
[mcl] jury pruning marks: <99,98,99>, out of 100
[mcl] jury pruning synopsis: <98.8 or marvelous> (cf -scheme, -do log)
[mcl] output is in out/merged_mcl20.clusters
[mcl] 30656 clusters found
[mcl] output is in out/merged_mcl20.clusters

Please cite: Stijn van Dongen, Graph Clustering by Flow Simulation. PhD thesis, University of Utrecht, May 2000. ( http://www.library.uu.nl/digiarchief/dip/diss/1895620/full.pdf or http://micans.org/mcl/lit/svdthesis.pdf.gz) OR Stijn van Dongen, A cluster algorithm for graphs. Technical Report INS-R0010, National Research Institute for Mathematics and Computer Science in the Netherlands, Amsterdam, May 2000. ( http://www.cwi.nl/ftp/CWIreports/INS/INS-R0010.ps.Z or http://micans.org/mcl/lit/INS-R0010.ps.Z)

/public1/home/t6s000092/.conda/envs/phagcn2/lib/python3.6/site-packages/Bio/Seq.py:2338: BiopythonWarning: Partial codon, len(sequence) not a multiple of three. Explicitly trim the sequence or add trailing N before translation. This may become an error in future.
  BiopythonWarning,
run_KnowledgeGraph.py:410: RuntimeWarning: divide by zero encountered in log10
  sig = min(max_sig, np.nan_to_num(-np.log10(pval) - logT))
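For readers wondering whether the `divide by zero encountered in log10` warning above is related to the missing sequences: the quoted line computes `min(max_sig, -log10(pval) - logT)`. When `pval` underflows to exactly 0, NumPy's `log10` returns `-inf` and emits this warning, `np.nan_to_num` turns the resulting `inf` into a very large finite number, and `min` then caps it at `max_sig`. The sketch below reproduces that arithmetic in plain Python. It is not the PhaGCN2.0 code: `capped_significance` is a hypothetical name, `max_sig=300` is an assumption suggested by `max: 3e+02` in the network summary of this log, and the `log_t` values are made up for illustration.

```python
import math


def capped_significance(pval: float, log_t: float, max_sig: float = 300.0) -> float:
    """Return min(max_sig, -log10(pval) - log_t), capping the pval == 0 case."""
    if pval <= 0.0:
        # math.log10(0.0) raises ValueError, whereas NumPy returns -inf and
        # warns "divide by zero encountered in log10". In both cases the
        # score is unbounded, so the cap applies.
        return max_sig
    return min(max_sig, -math.log10(pval) - log_t)


# An ordinary p-value gives a finite score below the cap.
print(capped_significance(1e-12, log_t=2.0))
# A p-value that underflowed to zero is simply capped.
print(capped_significance(0.0, log_t=2.0))
```

If the real code behaves like this sketch, the warning only marks edges whose score was capped and would not by itself drop any contig. That is an inference from the single line shown in the warning, not something confirmed in this thread.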

---------------------------------Diamond BLASTp---------------------------------
Creating Diamond database and running Diamond...
Creating Diamond database...
Running Diamond...

-------------------------------Protein clustering------------------------------- Loading proteins... Running MCL... Building the cluster and profiles (this may take some time...) Using MCL to generate PCs. Saving files Read 10306 entries from out/pcs_contigs.csv Read 336924 entries (dropped 5239 singletons) from out/Cyber_profiles.csv .......... 0.66% 10000/1510073.0 .......... 1.32% 20000/1510073.0 .......... 1.99% 30000/1510073.0 .......... 2.65% 40000/1510073.0 .......... 3.31% 50000/1510073.0 .......... 3.97% 60000/1510073.0 .......... 4.64% 70000/1510073.0 .......... 5.30% 80000/1510073.0 .......... 5.96% 90000/1510073.0 .......... 6.62% 100000/1510073.0 .......... 7.28% 110000/1510073.0 .......... 7.95% 120000/1510073.0 .......... 8.61% 130000/1510073.0 .......... 9.27% 140000/1510073.0 .......... 9.93% 150000/1510073.0 ..........10.60% 160000/1510073.0 ..........11.26% 170000/1510073.0 ..........11.92% 180000/1510073.0 ..........12.58% 190000/1510073.0 ..........13.24% 200000/1510073.0 ..........13.91% 210000/1510073.0 ..........14.57% 220000/1510073.0 ..........15.23% 230000/1510073.0 ..........15.89% 240000/1510073.0 ..........16.56% 250000/1510073.0 ..........17.22% 260000/1510073.0 ..........17.88% 270000/1510073.0 ..........18.54% 280000/1510073.0 ..........19.20% 290000/1510073.0 ..........19.87% 300000/1510073.0 ..........20.53% 310000/1510073.0 ..........21.19% 320000/1510073.0 ..........21.85% 330000/1510073.0 ..........22.52% 340000/1510073.0 ..........23.18% 350000/1510073.0 ..........23.84% 360000/1510073.0 ..........24.50% 370000/1510073.0 ..........25.16% 380000/1510073.0 ..........25.83% 390000/1510073.0 ..........26.49% 400000/1510073.0 ..........27.15% 410000/1510073.0 ..........27.81% 420000/1510073.0 ..........28.48% 430000/1510073.0 ..........29.14% 440000/1510073.0 ..........29.80% 450000/1510073.0 ..........30.46% 460000/1510073.0 ..........31.12% 470000/1510073.0 ..........31.79% 480000/1510073.0 ..........32.45% 490000/1510073.0 
..........33.11% 500000/1510073.0 ..........33.77% 510000/1510073.0 ..........34.44% 520000/1510073.0 ..........35.10% 530000/1510073.0 ..........35.76% 540000/1510073.0 ..........36.42% 550000/1510073.0 ..........37.08% 560000/1510073.0 ..........37.75% 570000/1510073.0 ..........38.41% 580000/1510073.0 ..........39.07% 590000/1510073.0 ..........39.73% 600000/1510073.0 ..........40.40% 610000/1510073.0 ..........41.06% 620000/1510073.0 ..........41.72% 630000/1510073.0 ..........42.38% 640000/1510073.0 ..........43.04% 650000/1510073.0 ..........43.71% 660000/1510073.0 ..........44.37% 670000/1510073.0 ..........45.03% 680000/1510073.0 ..........45.69% 690000/1510073.0 ..........46.36% 700000/1510073.0 ..........47.02% 710000/1510073.0 ..........47.68% 720000/1510073.0 ..........48.34% 730000/1510073.0 ..........49.00% 740000/1510073.0 ..........49.67% 750000/1510073.0 ..........50.33% 760000/1510073.0 ..........50.99% 770000/1510073.0 ..........51.65% 780000/1510073.0 ..........52.32% 790000/1510073.0 ..........52.98% 800000/1510073.0 ..........53.64% 810000/1510073.0 ..........54.30% 820000/1510073.0 ..........54.96% 830000/1510073.0 ..........55.63% 840000/1510073.0 ..........56.29% 850000/1510073.0 ..........56.95% 860000/1510073.0 ..........57.61% 870000/1510073.0 ..........58.28% 880000/1510073.0 ..........58.94% 890000/1510073.0 ..........59.60% 900000/1510073.0 ..........60.26% 910000/1510073.0 ..........60.92% 920000/1510073.0 ..........61.59% 930000/1510073.0 ..........62.25% 940000/1510073.0 ..........62.91% 950000/1510073.0 ..........63.57% 960000/1510073.0 ..........64.24% 970000/1510073.0 ..........64.90% 980000/1510073.0 ..........65.56% 990000/1510073.0 ..........66.22% 1000000/1510073.0 ..........66.88% 1010000/1510073.0 ..........67.55% 1020000/1510073.0 ..........68.21% 1030000/1510073.0 ..........68.87% 1040000/1510073.0 ..........69.53% 1050000/1510073.0 ..........70.20% 1060000/1510073.0 ..........70.86% 1070000/1510073.0 ..........71.52% 
1080000/1510073.0 ..........72.18% 1090000/1510073.0 ..........72.84% 1100000/1510073.0 ..........73.51% 1110000/1510073.0 ..........74.17% 1120000/1510073.0 ..........74.83% 1130000/1510073.0 ..........75.49% 1140000/1510073.0 ..........76.16% 1150000/1510073.0 ..........76.82% 1160000/1510073.0 ..........77.48% 1170000/1510073.0 ..........78.14% 1180000/1510073.0 ..........78.80% 1190000/1510073.0 ..........79.47% 1200000/1510073.0 ..........80.13% 1210000/1510073.0 ..........80.79% 1220000/1510073.0 ..........81.45% 1230000/1510073.0 ..........82.12% 1240000/1510073.0 ..........82.78% 1250000/1510073.0 ..........83.44% 1260000/1510073.0 ..........84.10% 1270000/1510073.0 ..........84.76% 1280000/1510073.0 ..........85.43% 1290000/1510073.0 ..........86.09% 1300000/1510073.0 ..........86.75% 1310000/1510073.0 ..........87.41% 1320000/1510073.0 ..........88.08% 1330000/1510073.0 ..........88.74% 1340000/1510073.0 ..........89.40% 1350000/1510073.0 ..........90.06% 1360000/1510073.0 ..........90.72% 1370000/1510073.0 ..........91.39% 1380000/1510073.0 ..........92.05% 1390000/1510073.0 ..........92.71% 1400000/1510073.0 ..........93.37% 1410000/1510073.0 ..........94.04% 1420000/1510073.0 ..........94.70% 1430000/1510073.0 ..........95.36% 1440000/1510073.0 ..........96.02% 1450000/1510073.0 ..........96.68% 1460000/1510073.0 ..........97.35% 1470000/1510073.0 ..........98.01% 1480000/1510073.0 ..........98.67% 1490000/1510073.0 .........Hypergeometric contig-similarity network: 10306 contigs, 408800 edges (min:1.0max: 3e+02, threshold was 1) Saving network in file out/network.ntw (408800 lines).

------------------------------Calculating E-edges-------------------------------

------------------------------Calculating P-edges-------------------------------

---------------------------Generating Knowledge graph--------------------------- /public1/home/t6s000092/.conda/envs/phagcn2/lib/python3.6/site-packages/torch/cuda/init.py:52: UserWarning: CUDA initialization: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx (Triggered internally at /opt/conda/conda-bld/pytorch_1607370116979/work/c10/cuda/CUDAFunctions.cpp:100.) return torch._C._cuda_getDeviceCount() > 0 Namespace(database='Caudoviridae', dropout=0, epochs=200, hidden=64, learning_rate=0.01, max_degree=3, model='gcn', weight_decay=0.0005) Running with cpu adj: (8687, 8687) features: (8687, 512) y: (8687,) (8687,) mask: (8687,) (8687,) x : tensor(indices=tensor([[ 0, 0, 0, ..., 8686, 8686, 8686], [ 511, 505, 471, ..., 28, 27, 8]]), values=tensor([0.0015, 0.0201, 0.0184, ..., 0.0420, 0.0200, 0.0180]), size=(8687, 512), nnz=378639, layout=torch.sparse_coo) sp: tensor(indices=tensor([[ 0, 1, 970, ..., 8685, 4635, 8686], [ 0, 0, 0, ..., 8685, 8686, 8686]]), values=tensor([0.2500, 0.3536, 0.0811, ..., 0.2000, 0.0786, 0.5000]), size=(8687, 8687), nnz=417603, layout=torch.sparse_coo) input dim: 512 output dim: 172 num_features_nonzero: 378639 0 21.82663917541504 0.010466468535986562 10 17.960010528564453 0.18710427703837706 20 15.184633255004883 0.42705775940043933 30 12.582104682922363 0.5862514536761856 40 10.426006317138672 0.7521643623207133 50 8.785109519958496 0.8494637550071069 60 7.515796184539795 0.8812508075978809 70 6.425203323364258 0.892363354438558 80 5.596924781799316 0.9126502132058405 90 4.869475364685059 0.91678511435586 100 4.286680221557617 0.9286729551621656 110 3.68833589553833 0.9306111900762373 120 3.26792049407959 0.9348753068871948 130 2.886105537414551 0.9362966791575139 140 2.5721871852874756 0.937588835766895 150 2.289077043533325 0.9384933453934617 160 2.036364793777466 0.9434035405091097 170 1.8060698509216309 0.9458586380669337 180 
1.6724344491958618 0.9457294224059956 190 1.4728317260742188 0.9466339320325623 200 1.3537671566009521 0.953482362062282 210 1.248291254043579 0.9563251066029202 220 1.1189727783203125 0.953482362062282 230 1.014660120010376 0.9563251066029202 240 0.96391761302948 0.956066675281044 250 0.8997896909713745 0.9589094198216824 260 0.833203911781311 0.9582633415169919 270 0.7948960065841675 0.9594262824654348 280 0.7333246469497681 0.9564543222638584 290 0.6908550262451172 0.9617521643623207 300 0.6638907194137573 0.9625274583279494 310 0.6291097402572632 0.9625274583279494 320 0.6043024659156799 0.9613645173795063 330 0.5626105666160583 0.963302752293578 340 0.5590479969978333 0.9669207907998449 350 0.5435287952423096 0.9644656932420209 360 0.5341185331344604 0.9627858896498256 370 0.4978073239326477 0.9636903992763923 380 0.5041185021400452 0.9644656932420209 390 0.5001029372215271 0.9630443209717018 start combine network........ Creating Diamond database... Running Diamond... cp: ‘/public1/home/t6s000092/soft/PhaGCN2.0/database/VMR_based_on_ICTV.csv’ and ‘./database/VMR_based_on_ICTV.csv’ are the same file cp: ‘/public1/home/t6s000092/soft/PhaGCN2.0/database/reference_name_id.csv’ and ‘./database/reference_name_id.csv’ are the same file cp: ‘/public1/home/t6s000092/soft/PhaGCN2.0/database/ALL_genome_profile.csv’ and ‘./database/ALL_genome_profile.csv’ are the same file cp: ‘/public1/home/t6s000092/soft/PhaGCN2.0/database/ALL_protein.fasta’ and ‘./database/ALL_protein.fasta’ are the same file cp: ‘/public1/home/t6s000092/soft/PhaGCN2.0/database/ALL_gene_to_genomes.csv’ and ‘./database/ALL_gene_to_genomes.csv’ are the same file cp: ‘/public1/home/t6s000092/soft/PhaGCN2.0/database/taxonomic_label.csv’ and ‘./database/taxonomic_label.csv’ are the same file diamond v0.9.14.115 | by Benjamin Buchfink buchfink@gmail.com Licensed under the GNU AGPL https://www.gnu.org/licenses/agpl.txt Check http://github.com/bbuchfink/diamond for updates.

CPU threads: 128

Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
Database file: database/ALL_protein.fasta
Opening the database file... [0.01789s]
Loading sequences... [0.590148s]
Masking sequences... [2.51824s]
Writing sequences... [0.209732s]
Loading sequences... [3.3e-05s]
Writing trailer... [0.003901s]
Closing the input file... [0.000243s]
Closing the database file... [0.000204s]
Processed 355277 sequences, 89213859 letters.
Total time = 3.34066s

diamond v0.9.14.115 | by Benjamin Buchfink buchfink@gmail.com
Licensed under the GNU AGPL https://www.gnu.org/licenses/agpl.txt
Check http://github.com/bbuchfink/diamond for updates.

CPU threads: 128

Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)

Target sequences to report alignments for: 25

Temporary directory: database Opening the database... [0.000183s] Opening the input file... [0.000517s] Opening the output file... [0.000259s] Loading query sequences... [0.389211s] Masking queries... [2.51574s] Building query seed set... [0.00069s] Algorithm: Double-indexed Building query histograms... [1.09677s] Allocating buffers... [0.00427s] Loading reference sequences... [0.20779s] Building reference histograms... [1.09783s] Allocating buffers... [0.003992s] Initializing temporary storage... [0.722539s] Processing query chunk 0, reference chunk 0, shape 0, index chunk 0. Building reference index... [0.283269s] Building query index... [0.27484s] Building seed filter... [0.025209s] Searching alignments... [19.8627s] Processing query chunk 0, reference chunk 0, shape 0, index chunk 1. Building reference index... [0.311584s] Building query index... [0.294934s] Building seed filter... [0.024621s] Searching alignments... [16.8565s] Processing query chunk 0, reference chunk 0, shape 0, index chunk 2. Building reference index... [0.382433s] Building query index... [0.317564s] Building seed filter... [0.025139s] Searching alignments... [16.3031s] Processing query chunk 0, reference chunk 0, shape 0, index chunk 3. Building reference index... [0.274689s] Building query index... [0.265093s] Building seed filter... [0.026024s] Searching alignments... [15.9754s] Processing query chunk 0, reference chunk 0, shape 1, index chunk 0. Building reference index... [0.267498s] Building query index... [0.266023s] Building seed filter... [0.025164s] Searching alignments... [14.4003s] Processing query chunk 0, reference chunk 0, shape 1, index chunk 1. Building reference index... [0.298033s] Building query index... [0.292235s] Building seed filter... [0.024391s] Searching alignments... [14.6411s] Processing query chunk 0, reference chunk 0, shape 1, index chunk 2. Building reference index... [0.313034s] Building query index... [0.301072s] Building seed filter... 
[0.024871s] Searching alignments... [14.4521s] Processing query chunk 0, reference chunk 0, shape 1, index chunk 3. Building reference index... [0.262411s] Building query index... [0.256059s] Building seed filter... [0.024758s] Searching alignments... [14.3917s] Processing query chunk 0, reference chunk 0, shape 2, index chunk 0. Building reference index... [0.265218s] Building query index... [0.266968s] Building seed filter... [0.02445s] Searching alignments... [15.2328s] Processing query chunk 0, reference chunk 0, shape 2, index chunk 1. Building reference index... [0.302659s] Building query index... [0.300121s] Building seed filter... [0.026636s] Searching alignments... [15.3422s] Processing query chunk 0, reference chunk 0, shape 2, index chunk 2. Building reference index... [0.392863s] Building query index... [0.326037s] Building seed filter... [0.026308s] Searching alignments... [15.3762s] Processing query chunk 0, reference chunk 0, shape 2, index chunk 3. Building reference index... [0.285002s] Building query index... [0.309236s] Building seed filter... [0.029408s] Searching alignments... [15.3907s] Processing query chunk 0, reference chunk 0, shape 3, index chunk 0. Building reference index... [0.281797s] Building query index... [0.265508s] Building seed filter... [0.025928s] Searching alignments... [15.0185s] Processing query chunk 0, reference chunk 0, shape 3, index chunk 1. Building reference index... [0.305271s] Building query index... [0.297462s] Building seed filter... [0.024591s] Searching alignments... [15.0288s] Processing query chunk 0, reference chunk 0, shape 3, index chunk 2. Building reference index... [0.319434s] Building query index... [0.316996s] Building seed filter... [0.026815s] Searching alignments... [14.8484s] Processing query chunk 0, reference chunk 0, shape 3, index chunk 3. Building reference index... [0.274905s] Building query index... [0.260729s] Building seed filter... [0.025586s] Searching alignments... 
[15.0292s] Processing query chunk 0, reference chunk 0, shape 4, index chunk 0. Building reference index... [0.259619s] Building query index... [0.259s] Building seed filter... [0.024649s] Searching alignments... [14.8923s] Processing query chunk 0, reference chunk 0, shape 4, index chunk 1. Building reference index... [0.298625s] Building query index... [0.292324s] Building seed filter... [0.025175s] Searching alignments... [15.8937s] Processing query chunk 0, reference chunk 0, shape 4, index chunk 2. Building reference index... [0.319582s] Building query index... [0.308277s] Building seed filter... [0.025682s] Searching alignments... [14.6923s] Processing query chunk 0, reference chunk 0, shape 4, index chunk 3. Building reference index... [0.277406s] Building query index... [0.261891s] Building seed filter... [0.026275s] Searching alignments... [14.929s] Processing query chunk 0, reference chunk 0, shape 5, index chunk 0. Building reference index... [0.311551s] Building query index... [0.27793s] Building seed filter... [0.026442s] Searching alignments... [15.1107s] Processing query chunk 0, reference chunk 0, shape 5, index chunk 1. Building reference index... [0.304051s] Building query index... [0.299856s] Building seed filter... [0.025224s] Searching alignments... [14.9284s] Processing query chunk 0, reference chunk 0, shape 5, index chunk 2. Building reference index... [0.320753s] Building query index... [0.307826s] Building seed filter... [0.024642s] Searching alignments... [14.779s] Processing query chunk 0, reference chunk 0, shape 5, index chunk 3. Building reference index... [0.261098s] Building query index... [0.257086s] Building seed filter... [0.026662s] Searching alignments... [14.8119s] Processing query chunk 0, reference chunk 0, shape 6, index chunk 0. Building reference index... [0.262926s] Building query index... [0.263373s] Building seed filter... [0.025155s] Searching alignments... 
[14.5588s] Processing query chunk 0, reference chunk 0, shape 6, index chunk 1. Building reference index... [0.29295s] Building query index... [0.292022s] Building seed filter... [0.025948s] Searching alignments... [14.6158s] Processing query chunk 0, reference chunk 0, shape 6, index chunk 2. Building reference index... [0.311916s] Building query index... [0.308026s] Building seed filter... [0.025351s] Searching alignments... [14.5627s] Processing query chunk 0, reference chunk 0, shape 6, index chunk 3. Building reference index... [0.260355s] Building query index... [0.269366s] Building seed filter... [0.02624s] Searching alignments... [14.3568s] Processing query chunk 0, reference chunk 0, shape 7, index chunk 0. Building reference index... [0.263429s] Building query index... [0.259393s] Building seed filter... [0.023786s] Searching alignments... [14.8437s] Processing query chunk 0, reference chunk 0, shape 7, index chunk 1. Building reference index... [0.297029s] Building query index... [0.291271s] Building seed filter... [0.025577s] Searching alignments... [15.0007s] Processing query chunk 0, reference chunk 0, shape 7, index chunk 2. Building reference index... [0.315557s] Building query index... [0.309132s] Building seed filter... [0.025421s] Searching alignments... [14.7984s] Processing query chunk 0, reference chunk 0, shape 7, index chunk 3. Building reference index... [0.266881s] Building query index... [0.256021s] Building seed filter... [0.024408s] Searching alignments... [14.8613s] Processing query chunk 0, reference chunk 0, shape 8, index chunk 0. Building reference index... [0.262579s] Building query index... [0.257655s] Building seed filter... [0.025664s] Searching alignments... [15.2295s] Processing query chunk 0, reference chunk 0, shape 8, index chunk 1. Building reference index... [0.298764s] Building query index... [0.299266s] Building seed filter... [0.025966s] Searching alignments... 
[15.168s] Processing query chunk 0, reference chunk 0, shape 8, index chunk 2. Building reference index... [0.319456s] Building query index... [0.308729s] Building seed filter... [0.024608s] Searching alignments... [15.2676s] Processing query chunk 0, reference chunk 0, shape 8, index chunk 3. Building reference index... [0.270841s] Building query index... [0.264207s] Building seed filter... [0.024932s] Searching alignments... [15.3816s] Processing query chunk 0, reference chunk 0, shape 9, index chunk 0. Building reference index... [0.263932s] Building query index... [0.262825s] Building seed filter... [0.025301s] Searching alignments... [14.4513s] Processing query chunk 0, reference chunk 0, shape 9, index chunk 1. Building reference index... [0.301304s] Building query index... [0.291002s] Building seed filter... [0.025083s] Searching alignments... [14.5634s] Processing query chunk 0, reference chunk 0, shape 9, index chunk 2. Building reference index... [0.311886s] Building query index... [0.307125s] Building seed filter... [0.024389s] Searching alignments... [14.4054s] Processing query chunk 0, reference chunk 0, shape 9, index chunk 3. Building reference index... [0.26674s] Building query index... [0.256262s] Building seed filter... [0.02536s] Searching alignments... [14.492s] Processing query chunk 0, reference chunk 0, shape 10, index chunk 0. Building reference index... [0.265715s] Building query index... [0.261112s] Building seed filter... [0.025149s] Searching alignments... [15.1314s] Processing query chunk 0, reference chunk 0, shape 10, index chunk 1. Building reference index... [0.305626s] Building query index... [0.294904s] Building seed filter... [0.025342s] Searching alignments... [15.2309s] Processing query chunk 0, reference chunk 0, shape 10, index chunk 2. Building reference index... [0.320872s] Building query index... [0.316694s] Building seed filter... [0.024998s] Searching alignments... 
[15.1713s] Processing query chunk 0, reference chunk 0, shape 10, index chunk 3. Building reference index... [0.260531s] Building query index... [0.258487s] Building seed filter... [0.025103s] Searching alignments... [15.3755s] Processing query chunk 0, reference chunk 0, shape 11, index chunk 0. Building reference index... [0.26193s] Building query index... [0.284297s] Building seed filter... [0.025176s] Searching alignments... [14.6723s] Processing query chunk 0, reference chunk 0, shape 11, index chunk 1. Building reference index... [0.292889s] Building query index... [0.289912s] Building seed filter... [0.024455s] Searching alignments... [14.658s] Processing query chunk 0, reference chunk 0, shape 11, index chunk 2. Building reference index... [0.309579s] Building query index... [0.302704s] Building seed filter... [0.02483s] Searching alignments... [14.5503s] Processing query chunk 0, reference chunk 0, shape 11, index chunk 3. Building reference index... [0.260402s] Building query index... [0.254705s] Building seed filter... [0.022964s] Searching alignments... [14.6408s] Processing query chunk 0, reference chunk 0, shape 12, index chunk 0. Building reference index... [0.264545s] Building query index... [0.260019s] Building seed filter... [0.025533s] Searching alignments... [14.7112s] Processing query chunk 0, reference chunk 0, shape 12, index chunk 1. Building reference index... [0.296742s] Building query index... [0.294477s] Building seed filter... [0.024755s] Searching alignments... [14.8003s] Processing query chunk 0, reference chunk 0, shape 12, index chunk 2. Building reference index... [0.320262s] Building query index... [0.308086s] Building seed filter... [0.025445s] Searching alignments... [14.7036s] Processing query chunk 0, reference chunk 0, shape 12, index chunk 3. Building reference index... [0.26371s] Building query index... [0.254884s] Building seed filter... [0.030946s] Searching alignments... 
[14.6497s] Processing query chunk 0, reference chunk 0, shape 13, index chunk 0. Building reference index... [0.269167s] Building query index... [0.255584s] Building seed filter... [0.026428s] Searching alignments... [14.9061s] Processing query chunk 0, reference chunk 0, shape 13, index chunk 1. Building reference index... [0.300151s] Building query index... [0.30065s] Building seed filter... [0.024835s] Searching alignments... [15.0832s] Processing query chunk 0, reference chunk 0, shape 13, index chunk 2. Building reference index... [0.310774s] Building query index... [0.306496s] Building seed filter... [0.025767s] Searching alignments... [14.9718s] Processing query chunk 0, reference chunk 0, shape 13, index chunk 3. Building reference index... [0.267171s] Building query index... [0.255494s] Building seed filter... [0.025212s] Searching alignments... [14.8243s] Processing query chunk 0, reference chunk 0, shape 14, index chunk 0. Building reference index... [0.264856s] Building query index... [0.266763s] Building seed filter... [0.025114s] Searching alignments... [15.0865s] Processing query chunk 0, reference chunk 0, shape 14, index chunk 1. Building reference index... [0.303294s] Building query index... [0.29382s] Building seed filter... [0.023943s] Searching alignments... [15.2816s] Processing query chunk 0, reference chunk 0, shape 14, index chunk 2. Building reference index... [0.315379s] Building query index... [0.305534s] Building seed filter... [0.024536s] Searching alignments... [15.0456s] Processing query chunk 0, reference chunk 0, shape 14, index chunk 3. Building reference index... [0.256008s] Building query index... [0.251722s] Building seed filter... [0.023423s] Searching alignments... [15.2821s] Processing query chunk 0, reference chunk 0, shape 15, index chunk 0. Building reference index... [0.259813s] Building query index... [0.257348s] Building seed filter... [0.025391s] Searching alignments... 
[15.0535s] Processing query chunk 0, reference chunk 0, shape 15, index chunk 1. Building reference index... [0.299897s] Building query index... [0.287854s] Building seed filter... [0.02488s] Searching alignments... [14.5973s] Processing query chunk 0, reference chunk 0, shape 15, index chunk 2. Building reference index... [0.306047s] Building query index... [0.30063s] Building seed filter... [0.025421s] Searching alignments... [14.6525s] Processing query chunk 0, reference chunk 0, shape 15, index chunk 3. Building reference index... [0.258485s] Building query index... [0.258179s] Building seed filter... [0.024416s] Searching alignments... [14.9748s] Deallocating buffers... [0.000819s] Computing alignments... [134.476s] Deallocating reference... [0.001597s] Loading reference sequences... [3.8e-05s] Deallocating buffers... [0.001419s] Deallocating queries... [0.000802s] Loading query sequences... [3.1e-05s] Closing the input file... [0.00018s] Closing the output file... [0.000192s] Closing the database file... [0.000154s] Total time = 1141.65s Reported 6420544 pairwise alignments, 6431184 HSPs. 354845 queries aligned. rm: cannot remove ‘validation/’: No such file or directory rm: cannot remove ‘stride50_val/’: No such file or directory rm: cannot remove ‘int_val/’: No such file or directory rm: cannot remove ‘filtered_val/’: No such file or directory rm: cannot remove ‘dataset/’: No such file or directory rm: cannot remove ‘split_long_reads_val/’: No such file or directory /public1/home/t6s000092/.conda/envs/phagcn2/lib/python3.6/site-packages/torch/cuda/init.py:52: UserWarning: CUDA initialization: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx (Triggered internally at /opt/conda/conda-bld/pytorch_1607370116979/work/c10/cuda/CUDAFunctions.cpp:100.) 
return torch._C._cuda_getDeviceCount() > 0 Capturing compressed features Running with cpu .................................................. 1M .................................................. 2M .................................................. 3M .................................................. 4M .................................................. 5M .................................................. 6M ................ [mclIO] writing <out/merged.mci> ....................................... [mclIO] wrote native interchange 350184x350184 matrix with 8169360 entries to stream <out/merged.mci> [mclIO] wrote 350184 tab entries to stream <out/merged_mcxload.tab> [mcxload] tab has 350184 entries [mclIO] reading <out/merged.mci> ....................................... [mclIO] read native interchange 350184x350184 matrix with 8169360 entries [mcl] pid 459784 ite ------------------- chaos time hom(avg,lo,hi) m-ie m-ex i-ex fmv 1 ................... 76.85 4.26 1.08/0.00/7.54 2.52 2.42 2.42 0 2 ................... 77.98 17.33 0.84/0.02/4.14 2.21 0.94 2.28 3 3 ................... 38.20 12.67 0.79/0.08/6.36 1.69 0.74 1.68 1 4 ................... 30.21 5.77 0.78/0.06/5.70 1.44 0.70 1.17 0 5 ................... 21.30 2.58 0.78/0.09/4.63 1.26 0.68 0.80 0 6 ................... 17.64 1.25 0.79/0.13/4.85 1.11 0.70 0.56 0 7 ................... 8.88 0.76 0.80/0.15/3.04 1.04 0.75 0.42 0 8 ................... 7.03 0.55 0.83/0.17/1.83 1.01 0.79 0.33 0 9 ................... 4.78 0.43 0.86/0.16/1.30 1.01 0.80 0.26 0 10 ................... 4.67 0.35 0.90/0.20/1.72 1.00 0.81 0.21 0 11 ................... 4.81 0.29 0.93/0.21/1.16 1.00 0.82 0.17 0 12 ................... 4.74 0.26 0.96/0.20/1.16 1.00 0.83 0.14 0 13 ................... 3.62 0.22 0.97/0.26/1.11 1.00 0.86 0.12 0 14 ................... 5.15 0.20 0.98/0.21/1.09 1.00 0.90 0.11 0 15 ................... 5.19 0.19 0.99/0.27/1.01 1.00 0.94 0.11 0 16 ................... 
3.26 0.19 0.99/0.26/1.00 1.00 0.96 0.10 0 17 ................... 3.94 0.17 1.00/0.22/1.00 1.00 0.97 0.10 0 18 ................... 4.27 0.18 1.00/0.39/1.00 1.00 0.98 0.10 0 19 ................... 3.33 0.18 1.00/0.26/1.00 1.00 0.99 0.10 0 20 ................... 3.21 0.17 1.00/0.34/1.00 1.00 0.99 0.10 0 21 ................... 1.71 0.17 1.00/0.39/1.00 1.00 1.00 0.09 0 22 ................... 3.89 0.18 1.00/0.23/1.00 1.00 1.00 0.09 0 23 ................... 1.95 0.17 1.00/0.67/1.00 1.00 1.00 0.09 0 24 ................... 0.75 0.18 1.00/0.55/1.00 1.00 1.00 0.09 0 25 ................... 0.37 0.16 1.00/0.79/1.00 1.00 1.00 0.09 0 26 ................... 0.25 0.17 1.00/0.76/1.00 1.00 1.00 0.09 0 27 ................... 0.15 0.17 1.00/0.85/1.00 1.00 1.00 0.09 0 28 ................... 0.02 0.18 1.00/0.98/1.00 1.00 1.00 0.09 0 29 ................... 0.00 0.17 1.00/1.00/1.00 1.00 1.00 0.09 0 30 ................... 0.00 0.18 1.00/1.00/1.00 1.00 1.00 0.09 0 [mcl] jury pruning marks: <99,98,99>, out of 100 [mcl] jury pruning synopsis: <98.8 or marvelous> (cf -scheme, -do log) [mcl] output is in out/merged_mcl20.clusters [mcl] 30656 clusters found [mcl] output is in out/merged_mcl20.clusters

Please cite: Stijn van Dongen, Graph Clustering by Flow Simulation. PhD thesis, University of Utrecht, May 2000. ( http://www.library.uu.nl/digiarchief/dip/diss/1895620/full.pdf or http://micans.org/mcl/lit/svdthesis.pdf.gz) OR Stijn van Dongen, A cluster algorithm for graphs. Technical Report INS-R0010, National Research Institute for Mathematics and Computer Science in the Netherlands, Amsterdam, May 2000. ( http://www.cwi.nl/ftp/CWIreports/INS/INS-R0010.ps.Z or http://micans.org/mcl/lit/INS-R0010.ps.Z)

/public1/home/t6s000092/.conda/envs/phagcn2/lib/python3.6/site-packages/Bio/Seq.py:2338: BiopythonWarning: Partial codon, len(sequence) not a multiple of three. Explicitly trim the sequence or add trailing N before translation. This may become an error in future. BiopythonWarning, run_KnowledgeGraph.py:410: RuntimeWarning: divide by zero encountered in log10 sig = min(max_sig, np.nan_to_num(-np.log10(pval) - logT))

---------------------------------Diamond BLASTp--------------------------------- Creating Diamond database and running Diamond... Creating Diamond database... Running Diamond...

-------------------------------Protein clustering------------------------------- Loading proteins... Running MCL... Building the cluster and profiles (this may take some time...) Using MCL to generate PCs. Saving files Read 10306 entries from out/pcs_contigs.csv Read 336924 entries (dropped 5239 singletons) from out/Cyber_profiles.csv .......... 0.66% 10000/1510073.0 .......... 1.32% 20000/1510073.0 .......... 1.99% 30000/1510073.0 .......... 2.65% 40000/1510073.0 .......... 3.31% 50000/1510073.0 .......... 3.97% 60000/1510073.0 .......... 4.64% 70000/1510073.0 .......... 5.30% 80000/1510073.0 .......... 5.96% 90000/1510073.0 .......... 6.62% 100000/1510073.0 .......... 7.28% 110000/1510073.0 .......... 7.95% 120000/1510073.0 .......... 8.61% 130000/1510073.0 .......... 9.27% 140000/1510073.0 .......... 9.93% 150000/1510073.0 ..........10.60% 160000/1510073.0 ..........11.26% 170000/1510073.0 ..........11.92% 180000/1510073.0 ..........12.58% 190000/1510073.0 ..........13.24% 200000/1510073.0 ..........13.91% 210000/1510073.0 ..........14.57% 220000/1510073.0 ..........15.23% 230000/1510073.0 ..........15.89% 240000/1510073.0 ..........16.56% 250000/1510073.0 ..........17.22% 260000/1510073.0 ..........17.88% 270000/1510073.0 ..........18.54% 280000/1510073.0 ..........19.20% 290000/1510073.0 ..........19.87% 300000/1510073.0 ..........20.53% 310000/1510073.0 ..........21.19% 320000/1510073.0 ..........21.85% 330000/1510073.0 ..........22.52% 340000/1510073.0 ..........23.18% 350000/1510073.0 ..........23.84% 360000/1510073.0 ..........24.50% 370000/1510073.0 ..........25.16% 380000/1510073.0 ..........25.83% 390000/1510073.0 ..........26.49% 400000/1510073.0 ..........27.15% 410000/1510073.0 ..........27.81% 420000/1510073.0 ..........28.48% 430000/1510073.0 ..........29.14% 440000/1510073.0 ..........29.80% 450000/1510073.0 ..........30.46% 460000/1510073.0 ..........31.12% 470000/1510073.0 ..........31.79% 480000/1510073.0 ..........32.45% 490000/1510073.0 
..........33.11% 500000/1510073.0 ..........33.77% 510000/1510073.0 ..........34.44% 520000/1510073.0 ..........35.10% 530000/1510073.0 ..........35.76% 540000/1510073.0 ..........36.42% 550000/1510073.0 ..........37.08% 560000/1510073.0 ..........37.75% 570000/1510073.0 ..........38.41% 580000/1510073.0 ..........39.07% 590000/1510073.0 ..........39.73% 600000/1510073.0 ..........40.40% 610000/1510073.0 ..........41.06% 620000/1510073.0 ..........41.72% 630000/1510073.0 ..........42.38% 640000/1510073.0 ..........43.04% 650000/1510073.0 ..........43.71% 660000/1510073.0 ..........44.37% 670000/1510073.0 ..........45.03% 680000/1510073.0 ..........45.69% 690000/1510073.0 ..........46.36% 700000/1510073.0 ..........47.02% 710000/1510073.0 ..........47.68% 720000/1510073.0 ..........48.34% 730000/1510073.0 ..........49.00% 740000/1510073.0 ..........49.67% 750000/1510073.0 ..........50.33% 760000/1510073.0 ..........50.99% 770000/1510073.0 ..........51.65% 780000/1510073.0 ..........52.32% 790000/1510073.0 ..........52.98% 800000/1510073.0 ..........53.64% 810000/1510073.0 ..........54.30% 820000/1510073.0 ..........54.96% 830000/1510073.0 ..........55.63% 840000/1510073.0 ..........56.29% 850000/1510073.0 ..........56.95% 860000/1510073.0 ..........57.61% 870000/1510073.0 ..........58.28% 880000/1510073.0 ..........58.94% 890000/1510073.0 ..........59.60% 900000/1510073.0 ..........60.26% 910000/1510073.0 ..........60.92% 920000/1510073.0 ..........61.59% 930000/1510073.0 ..........62.25% 940000/1510073.0 ..........62.91% 950000/1510073.0 ..........63.57% 960000/1510073.0 ..........64.24% 970000/1510073.0 ..........64.90% 980000/1510073.0 ..........65.56% 990000/1510073.0 ..........66.22% 1000000/1510073.0 ..........66.88% 1010000/1510073.0 ..........67.55% 1020000/1510073.0 ..........68.21% 1030000/1510073.0 ..........68.87% 1040000/1510073.0 ..........69.53% 1050000/1510073.0 ..........70.20% 1060000/1510073.0 ..........70.86% 1070000/1510073.0 ..........71.52% 
1080000/1510073.0 ..........72.18% 1090000/1510073.0 ..........72.84% 1100000/1510073.0 ..........73.51% 1110000/1510073.0 ..........74.17% 1120000/1510073.0 ..........74.83% 1130000/1510073.0 ..........75.49% 1140000/1510073.0 ..........76.16% 1150000/1510073.0 ..........76.82% 1160000/1510073.0 ..........77.48% 1170000/1510073.0 ..........78.14% 1180000/1510073.0 ..........78.80% 1190000/1510073.0 ..........79.47% 1200000/1510073.0 ..........80.13% 1210000/1510073.0 ..........80.79% 1220000/1510073.0 ..........81.45% 1230000/1510073.0 ..........82.12% 1240000/1510073.0 ..........82.78% 1250000/1510073.0 ..........83.44% 1260000/1510073.0 ..........84.10% 1270000/1510073.0 ..........84.76% 1280000/1510073.0 ..........85.43% 1290000/1510073.0 ..........86.09% 1300000/1510073.0 ..........86.75% 1310000/1510073.0 ..........87.41% 1320000/1510073.0 ..........88.08% 1330000/1510073.0 ..........88.74% 1340000/1510073.0 ..........89.40% 1350000/1510073.0 ..........90.06% 1360000/1510073.0 ..........90.72% 1370000/1510073.0 ..........91.39% 1380000/1510073.0 ..........92.05% 1390000/1510073.0 ..........92.71% 1400000/1510073.0 ..........93.37% 1410000/1510073.0 ..........94.04% 1420000/1510073.0 ..........94.70% 1430000/1510073.0 ..........95.36% 1440000/1510073.0 ..........96.02% 1450000/1510073.0 ..........96.68% 1460000/1510073.0 ..........97.35% 1470000/1510073.0 ..........98.01% 1480000/1510073.0 ..........98.67% 1490000/1510073.0 .........Hypergeometric contig-similarity network: 10306 contigs, 408800 edges (min:1.0max: 3e+02, threshold was 1) Saving network in file out/network.ntw (408800 lines).

------------------------------Calculating E-edges-------------------------------

------------------------------Calculating P-edges-------------------------------

---------------------------Generating Knowledge graph--------------------------- /public1/home/t6s000092/.conda/envs/phagcn2/lib/python3.6/site-packages/torch/cuda/init.py:52: UserWarning: CUDA initialization: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx (Triggered internally at /opt/conda/conda-bld/pytorch_1607370116979/work/c10/cuda/CUDAFunctions.cpp:100.) return torch._C._cuda_getDeviceCount() > 0 Namespace(database='Caudoviridae', dropout=0, epochs=200, hidden=64, learning_rate=0.01, max_degree=3, model='gcn', weight_decay=0.0005) Running with cpu adj: (8687, 8687) features: (8687, 512) y: (8687,) (8687,) mask: (8687,) (8687,) x : tensor(indices=tensor([[ 0, 0, 0, ..., 8686, 8686, 8686], [ 511, 505, 471, ..., 28, 27, 8]]), values=tensor([0.0015, 0.0201, 0.0184, ..., 0.0420, 0.0200, 0.0180]), size=(8687, 512), nnz=378639, layout=torch.sparse_coo) sp: tensor(indices=tensor([[ 0, 1, 970, ..., 8685, 4635, 8686], [ 0, 0, 0, ..., 8685, 8686, 8686]]), values=tensor([0.2500, 0.3536, 0.0811, ..., 0.2000, 0.0786, 0.5000]), size=(8687, 8687), nnz=417603, layout=torch.sparse_coo) input dim: 512 output dim: 172 num_features_nonzero: 378639 0 21.82663917541504 0.010466468535986562 10 17.960010528564453 0.18710427703837706 20 15.184633255004883 0.42705775940043933 30 12.582104682922363 0.5862514536761856 40 10.426006317138672 0.7521643623207133 50 8.785109519958496 0.8494637550071069 60 7.515796184539795 0.8812508075978809 70 6.425203323364258 0.892363354438558 80 5.596924781799316 0.9126502132058405 90 4.869475364685059 0.91678511435586 100 4.286680221557617 0.9286729551621656 110 3.68833589553833 0.9306111900762373 120 3.26792049407959 0.9348753068871948 130 2.886105537414551 0.9362966791575139 140 2.5721871852874756 0.937588835766895 150 2.289077043533325 0.9384933453934617 160 2.036364793777466 0.9434035405091097 170 1.8060698509216309 0.9458586380669337 180 
1.6724344491958618 0.9457294224059956 190 1.4728317260742188 0.9466339320325623 200 1.3537671566009521 0.953482362062282 210 1.248291254043579 0.9563251066029202 220 1.1189727783203125 0.953482362062282 230 1.014660120010376 0.9563251066029202 240 0.96391761302948 0.956066675281044 250 0.8997896909713745 0.9589094198216824 260 0.833203911781311 0.9582633415169919 270 0.7948960065841675 0.9594262824654348 280 0.7333246469497681 0.9564543222638584 290 0.6908550262451172 0.9617521643623207 300 0.6638907194137573 0.9625274583279494 310 0.6291097402572632 0.9625274583279494 320 0.6043024659156799 0.9613645173795063 330 0.5626105666160583 0.963302752293578 340 0.5590479969978333 0.9669207907998449 350 0.5435287952423096 0.9644656932420209 360 0.5341185331344604 0.9627858896498256 370 0.4978073239326477 0.9636903992763923 380 0.5041185021400452 0.9644656932420209 390 0.5001029372215271 0.9630443209717018 start combine network........ Creating Diamond database... Running Diamond...

yuanwenguang666 commented 2 months ago

Thank you for your questions.

Contigs can be skipped for two reasons. First, the contig is shorter than 1700 bp. Second, the contig contains gaps, that is, runs of N such as "ATGCNNNNNNNTACG". I suggest checking the missing contigs against these two conditions.
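The two conditions from the reply above can be checked on the input file before running the pipeline. The following is only a minimal sketch, assuming the input is a plain FASTA file. The 1700 bp threshold and the N-gap condition come from the reply; the function names `read_fasta` and `find_skip_candidates` are hypothetical helpers written for this example and are not part of PhaGCN2.0, so the result may not match the tool's internal filtering exactly.

```python
import io


def read_fasta(handle):
    """Yield (record_id, sequence) pairs from an open FASTA text handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Use the first whitespace-delimited token of the header as the ID.
            fields = line[1:].split()
            name, chunks = (fields[0] if fields else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def find_skip_candidates(handle, min_len=1700):
    """Return {record_id: [reasons]} for contigs that are short or contain N.

    min_len=1700 is taken from the maintainer's reply; it is an assumption
    that this is the only length rule applied.
    """
    flagged = {}
    for name, seq in read_fasta(handle):
        reasons = []
        if len(seq) < min_len:
            reasons.append("shorter than %d bp" % min_len)
        if "N" in seq.upper():
            reasons.append("contains N")
        if reasons:
            flagged[name] = reasons
    return flagged


# Small in-memory demo: one clean contig, one with a gap, one too short.
demo = io.StringIO(
    ">ok\n" + "ACGT" * 500 + "\n"
    ">gap\n" + "ACGT" * 500 + "NNNN\n"
    ">short\nACGT\n"
)
for contig_id, why in find_skip_candidates(demo).items():
    print(contig_id, "; ".join(why))
```

To use it on real data, open the FASTA file that was passed to `--contigs` and hand the file object to `find_skip_candidates`, then compare the flagged IDs with the contigs missing from `single_contig/`.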