bacpop / PopPUNK

PopPUNK 👨‍🎤 (POPulation Partitioning Using Nucleotide Kmers)
https://www.bacpop.org/poppunk
Apache License 2.0

AttributeError: 'numpy.uint64' object has no attribute 'out_edges' #157

Closed (flass closed this issue 3 years ago)

flass commented 3 years ago

Hi John and Nick,

I think I have a bug for you!

Versions

I am using PopPUNK v2.3.0 with pp-sketchlib 1.6.2, as provided by a conda environment built with:

conda create -n poppunk230 -c defaults -c conda-forge -c bioconda poppunk==2.3.0 pp-sketchlib==1.6.2

Command used and output returned

The command was:

poppunk --fit-model dbscan --ref-db 7kVc --output 7kVc --threads 8 --qc-filter prune --length-range 3000000 5000000 --max-a-dist 1 --plot-fit 5

This came after I ran:

poppunk --create-db --r-files 7kVibrioCholerae_genome_fasta_list.tab --output 7kVc --threads 8  --min-k 15 --max-k 35 --full-db --plot-fit 5   --qc-filter prune --length-range 3000000 5000000 --max-a-dist 1

However, that earlier command was run with PopPUNK v2.2.0 and pp-sketchlib 1.5.1.

Here are the combined stdout and stderr streams:

/lustre/scratch118/infgen/team216/fl4/miniconda3/envs/poppunk230/lib/python3.8/site-packages/graph_tool/draw/cairo_draw.py:1494: RuntimeWarning: Error importing Gtk module: No module named 'gi'; GTK+ drawing will not work.
  warnings.warn(msg, RuntimeWarning)
PopPUNK (POPulation Partitioning Using Nucleotide Kmers)
        (with backend: sketchlib v1.6.2
         sketchlib: /lustre/scratch118/infgen/team216/fl4/miniconda3/envs/poppunk230/lib/python3.8/site-packages/pp_sketchlib.cpython-38-x86_64-linux-gnu.so)

Graph-tools OpenMP parallelisation enabled: with 8 threads
Mode: Fitting dbscan model to reference database

Fit summary:
        Number of clusters      22
        Number of datapoints    100000
        Number of assignments   69710

Scaled component means
        [0.66464609 0.0165393 ]
        [0.36258522 0.86229378]
        [0.05931121 0.4089773 ]
        [0.03545891 0.2598196 ]
        [0.01223866 0.12545162]
        [0.00666925 0.11610567]
        [0.00817923 0.07951489]
        [0.00074281 0.0862769 ]
        [7.44675781e-05 1.15323048e-02]
        [8.16615720e-05 9.30478238e-03]
        [0.00066636 0.06999604]
        [0.00018949 0.0054101 ]
        [0.00018632 0.00332961]
        [0.00011474 0.00099416]
        [1.12715396e-04 1.76445446e-05]
        [0.00011157 0.00054075]
        [8.79290092e-05 2.44285911e-04]
        [0.00068903 0.06519143]
        [0.00075784 0.05114995]
        [0.0005447  0.06054742]
        [0.00053338 0.03605872]
        [0.00080647 0.02198807]

Network summary:
        Components      2539
        Density 0.0143
        Transitivity    0.6566
        Score   0.6472
Traceback (most recent call last):
  File "/lustre/scratch118/infgen/team216/fl4/miniconda3/envs/poppunk230/bin/poppunk", line 10, in <module>
    sys.exit(main())
  File "/lustre/scratch118/infgen/team216/fl4/miniconda3/envs/poppunk230/lib/python3.8/site-packages/PopPUNK/__main__.py", line 498, in main
    extractReferences(genomeNetwork, refList, output, threads = args.threads)
  File "/lustre/scratch118/infgen/team216/fl4/miniconda3/envs/poppunk230/lib/python3.8/site-packages/PopPUNK/network.py", line 228, in extractReferences
    vertex_list, edge_list = gt.shortest_path(G, check[i], check[j])
  File "/lustre/scratch118/infgen/team216/fl4/miniconda3/envs/poppunk230/lib/python3.8/site-packages/graph_tool/topology/__init__.py", line 2153, in shortest_path
    for e in v.in_edges() if g.is_directed() else v.out_edges():
AttributeError: 'numpy.uint64' object has no attribute 'out_edges'

The --create-db run produced the files dated 'Mar 1'; the halted --fit-model run produced the files dated 'Mar 2':

ls -ltr 7kVc
total 1042808
-rw-r--r-- 1 me mygroup          2066 Mar  1 17:09 7kVc_qcreport.txt
-rw-r--r-- 1 me mygroup     821192512 Mar  1 17:09 7kVc.h5
-rw-r--r-- 1 me mygroup         17028 Mar  1 17:12 fit_example_1.pdf
-rw-r--r-- 1 me mygroup         17033 Mar  1 17:12 fit_example_3.pdf
-rw-r--r-- 1 me mygroup         17233 Mar  1 17:12 fit_example_2.pdf
-rw-r--r-- 1 me mygroup         16304 Mar  1 17:12 fit_example_5.pdf
-rw-r--r-- 1 me mygroup         17160 Mar  1 17:12 fit_example_4.pdf
-rw-r--r-- 1 me mygroup        208578 Mar  1 17:12 7kVc.dists.pkl
-rw-r--r-- 1 me mygroup     232227248 Mar  1 17:12 7kVc.dists.npy
-rw-r--r-- 1 me mygroup        129789 Mar  2 17:36 7kVc_dbscan.png
-rw-r--r-- 1 me mygroup      12593393 Mar  2 17:36 7kVc_fit.pkl
-rw-r--r-- 1 me mygroup          2816 Mar  2 17:36 7kVc_fit.npz
-rw-r--r-- 1 me mygroup       1138704 Mar  2 17:37 7kVc_graph.gt
-rw-r--r-- 1 me mygroup        216900 Mar  2 17:37 7kVc_clusters.csv

Describe the bug

The program stopped with an error (see the logs above).

I assume the run is not complete, is it? Can you help, please?

Cheers,

Florent

johnlees commented 3 years ago

Hi Florent, this might be a bug in an older version of graph-tool. Which version do you have? I think you need at least 2.35:

conda install "graph-tool>=2.35"

If that's not it, I can take another look
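
To answer the "which version do you have?" question from inside the conda environment, a small check along these lines may help. It is only a sketch: the helper names `parse_version` and `is_new_enough` are made up for illustration, the 2.35 threshold is simply the one suggested in this thread, and it assumes `graph_tool.__version__` is a string that starts with `major.minor`.

```python
import re

# Minimum graph-tool version suggested in this thread (not an official PopPUNK constant).
MIN_GRAPH_TOOL = (2, 35)


def parse_version(text):
    """Return the leading 'major.minor' of a version string as a tuple of ints."""
    match = re.match(r"\s*(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"cannot parse version string: {text!r}")
    return int(match.group(1)), int(match.group(2))


def is_new_enough(text, minimum=MIN_GRAPH_TOOL):
    """True if the version string is at least `minimum` (compared as integer tuples)."""
    return parse_version(text) >= minimum


if __name__ == "__main__":
    try:
        import graph_tool
    except ImportError:
        print("graph-tool is not installed in this environment")
    else:
        version = graph_tool.__version__
        status = "OK" if is_new_enough(version) else "older than 2.35, consider upgrading"
        print(f"graph-tool {version}: {status}")
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as "2.4" sorting after "2.35".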

flass commented 3 years ago

Hi John,

Yes, I installed graph-tool==2.37 (instead of the v2.29 that was there before) and apparently that solved the issue. Thanks!

Florent
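
For readers who hit the same traceback: the last frame shows graph-tool calling `out_edges()` on a value of type `numpy.uint64`, which suggests that a plain NumPy integer index reached code expecting a vertex object. The sketch below reproduces only that final step with NumPy; it does not use graph-tool or PopPUNK, and the variable name `idx` is made up for illustration.

```python
import numpy as np

# A vertex index stored as a NumPy integer, as in the traceback above.
idx = np.uint64(3)

try:
    # NumPy scalars have no graph methods, so this attribute lookup fails.
    idx.out_edges()
except AttributeError as err:
    message = str(err)

print(message)
```

In this thread the fix was to upgrade graph-tool. On an older version, converting the index to a vertex first (for example with `G.vertex(int(idx))`) might also work, but that is an untested assumption here.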