labgem / PPanGGOLiN

Build a partitioned pangenome graph from microbial genomes
https://ppanggolin.readthedocs.io

PPanGGOLiN hangs when partitioning does not work #114

Closed apcamargo closed 1 year ago

apcamargo commented 1 year ago

I'm evaluating PPanGGOLiN for very short MGEs, and it commonly hangs when evaluating different numbers of partitions (which I assume is because of the low number of genomes or the little variation between them).

2023-05-07 09:15:27 utils.py:l116 INFO  Command: /clusterfs/jgi/groups/science/homes/antoniop.camargo/.micromamba/envs/ppanggolin/bin/ppanggolin partition -c 64 --krange 3 20 --chunk_size 100000 -p TEST/sampled_complete_sequences_ppanggolin_50/sampled_complete_sequences.h5 --verbose 2
2023-05-07 09:15:27 utils.py:l117 INFO  PPanGGOLiN version: 1.2.105
2023-05-07 09:15:27 readBinaries.py:l71 INFO    Getting the current pangenome status
2023-05-07 09:15:27 readBinaries.py:l505 INFO   Reading pangenome annotations...
100%|███████████████████████████████████████████████████| 14954/14954 [00:00<00:00, 443905.46gene/s]
100%|██████████████████████████████████████████████████████| 214/214 [00:00<00:00, 917.56organism/s]
2023-05-07 09:15:27 readBinaries.py:l519 INFO   Reading pangenome gene families...
100%|████████████████████████████████████████████| 14954/14954 [00:00<00:00, 362431.22gene family/s]
100%|████████████████████████████████████████████████| 135/135 [00:00<00:00, 241772.43gene family/s]
2023-05-07 09:15:27 readBinaries.py:l527 INFO   Reading the neighbors graph edges...
100%|███████████████████████████████████████| 14716/14716 [00:00<00:00, 382432.00contig adjacency/s]
2023-05-07 09:15:27 partition.py:l482 INFO  Estimating the optimal number of partitions...
2023-05-07 09:15:27 partition.py:l254 DEBUG Writing nem_file.str nem_file.index nem_file.nei and nem_file.dat files
  0%|                                                      | 0/19 [00:00<?, ?Number of partitions/s]
2023-05-07 09:15:27 partition.py:l55 DEBUG  run_partitioning...
2023-05-07 09:15:27 partition.py:l55 DEBUG  run_partitioning...
2023-05-07 09:15:27 partition.py:l89 DEBUG  Running NEM...
2023-05-07 09:15:27 partition.py:l90 DEBUG  [b'/tmp/tmpix2newx6/eval_partitions/nem_file', 2, b'nem', 0, b'clas', 0.01, b'fuzzy', 10, True, b'bern', b'pk', b'sk_', 2, b'/tmp/tmpix2newx6/eval_partitions/nem_file_init_2.m', b'/tmp/tmpix2newx6/eval_partitions/nem_file_2', 42]
2023-05-07 09:15:27 partition.py:l55 DEBUG  run_partitioning...
2023-05-07 09:15:27 partition.py:l89 DEBUG  Running NEM...
2023-05-07 09:15:27 partition.py:l55 DEBUG  run_partitioning...
2023-05-07 09:15:27 partition.py:l90 DEBUG  [b'/tmp/tmpix2newx6/eval_partitions/nem_file', 3, b'nem', 0, b'clas', 0.01, b'fuzzy', 10, True, b'bern', b'pk', b'sk_', 2, b'/tmp/tmpix2newx6/eval_partitions/nem_file_init_3.m', b'/tmp/tmpix2newx6/eval_partitions/nem_file_3', 42]
2023-05-07 09:15:27 partition.py:l55 DEBUG  run_partitioning...
2023-05-07 09:15:27 partition.py:l89 DEBUG  Running NEM...
2023-05-07 09:15:27 partition.py:l90 DEBUG  [b'/tmp/tmpix2newx6/eval_partitions/nem_file', 4, b'nem', 0, b'clas', 0.01, b'fuzzy', 10, True, b'bern', b'pk', b'sk_', 2, b'/tmp/tmpix2newx6/eval_partitions/nem_file_init_4.m', b'/tmp/tmpix2newx6/eval_partitions/nem_file_4', 42]
2023-05-07 09:15:27 partition.py:l89 DEBUG  Running NEM...
2023-05-07 09:15:27 partition.py:l90 DEBUG  [b'/tmp/tmpix2newx6/eval_partitions/nem_file', 6, b'nem', 0, b'clas', 0.01, b'fuzzy', 10, True, b'bern', b'pk', b'sk_', 2, b'/tmp/tmpix2newx6/eval_partitions/nem_file_init_6.m', b'/tmp/tmpix2newx6/eval_partitions/nem_file_6', 42]
2023-05-07 09:15:27 partition.py:l89 DEBUG  Running NEM...
2023-05-07 09:15:27 partition.py:l55 DEBUG  run_partitioning...
2023-05-07 09:15:27 partition.py:l90 DEBUG  [b'/tmp/tmpix2newx6/eval_partitions/nem_file', 5, b'nem', 0, b'clas', 0.01, b'fuzzy', 10, True, b'bern', b'pk', b'sk_', 2, b'/tmp/tmpix2newx6/eval_partitions/nem_file_init_5.m', b'/tmp/tmpix2newx6/eval_partitions/nem_file_5', 42]
2023-05-07 09:15:27 partition.py:l55 DEBUG  run_partitioning...
2023-05-07 09:15:27 partition.py:l89 DEBUG  Running NEM...
2023-05-07 09:15:27 partition.py:l90 DEBUG  [b'/tmp/tmpix2newx6/eval_partitions/nem_file', 7, b'nem', 0, b'clas', 0.01, b'fuzzy', 10, True, b'bern', b'pk', b'sk_', 2, b'/tmp/tmpix2newx6/eval_partitions/nem_file_init_7.m', b'/tmp/tmpix2newx6/eval_partitions/nem_file_7', 42]
2023-05-07 09:15:27 partition.py:l55 DEBUG  run_partitioning...
2023-05-07 09:15:27 partition.py:l55 DEBUG  run_partitioning...
2023-05-07 09:15:27 partition.py:l55 DEBUG  run_partitioning...
2023-05-07 09:15:27 partition.py:l89 DEBUG  Running NEM...
2023-05-07 09:15:27 partition.py:l90 DEBUG  [b'/tmp/tmpix2newx6/eval_partitions/nem_file', 8, b'nem', 0, b'clas', 0.01, b'fuzzy', 10, True, b'bern', b'pk', b'sk_', 2, b'/tmp/tmpix2newx6/eval_partitions/nem_file_init_8.m', b'/tmp/tmpix2newx6/eval_partitions/nem_file_8', 42]
2023-05-07 09:15:27 partition.py:l55 DEBUG  run_partitioning...
2023-05-07 09:15:27 partition.py:l89 DEBUG  Running NEM...
2023-05-07 09:15:27 partition.py:l89 DEBUG  Running NEM...
2023-05-07 09:15:27 partition.py:l90 DEBUG  [b'/tmp/tmpix2newx6/eval_partitions/nem_file', 11, b'nem', 0, b'clas', 0.01, b'fuzzy', 10, True, b'bern', b'pk', b'sk_', 2, b'/tmp/tmpix2newx6/eval_partitions/nem_file_init_11.m', b'/tmp/tmpix2newx6/eval_partitions/nem_file_11', 42]
2023-05-07 09:15:27 partition.py:l90 DEBUG  [b'/tmp/tmpix2newx6/eval_partitions/nem_file', 9, b'nem', 0, b'clas', 0.01, b'fuzzy', 10, True, b'bern', b'pk', b'sk_', 2, b'/tmp/tmpix2newx6/eval_partitions/nem_file_init_9.m', b'/tmp/tmpix2newx6/eval_partitions/nem_file_9', 42]
2023-05-07 09:15:27 partition.py:l55 DEBUG  run_partitioning...
2023-05-07 09:15:27 partition.py:l89 DEBUG  Running NEM...
2023-05-07 09:15:27 partition.py:l55 DEBUG  run_partitioning...
2023-05-07 09:15:27 partition.py:l90 DEBUG  [b'/tmp/tmpix2newx6/eval_partitions/nem_file', 10, b'nem', 0, b'clas', 0.01, b'fuzzy', 10, True, b'bern', b'pk', b'sk_', 2, b'/tmp/tmpix2newx6/eval_partitions/nem_file_init_10.m', b'/tmp/tmpix2newx6/eval_partitions/nem_file_10', 42]
2023-05-07 09:15:27 partition.py:l55 DEBUG  run_partitioning...
2023-05-07 09:15:27 partition.py:l55 DEBUG  run_partitioning...
2023-05-07 09:15:27 partition.py:l55 DEBUG  run_partitioning...
2023-05-07 09:15:27 partition.py:l89 DEBUG  Running NEM...
2023-05-07 09:15:27 partition.py:l55 DEBUG  run_partitioning...
2023-05-07 09:15:27 partition.py:l90 DEBUG  [b'/tmp/tmpix2newx6/eval_partitions/nem_file', 12, b'nem', 0, b'clas', 0.01, b'fuzzy', 10, True, b'bern', b'pk', b'sk_', 2, b'/tmp/tmpix2newx6/eval_partitions/nem_file_init_12.m', b'/tmp/tmpix2newx6/eval_partitions/nem_file_12', 42]
2023-05-07 09:15:27 partition.py:l89 DEBUG  Running NEM...
2023-05-07 09:15:27 partition.py:l90 DEBUG  [b'/tmp/tmpix2newx6/eval_partitions/nem_file', 13, b'nem', 0, b'clas', 0.01, b'fuzzy', 10, True, b'bern', b'pk', b'sk_', 2, b'/tmp/tmpix2newx6/eval_partitions/nem_file_init_13.m', b'/tmp/tmpix2newx6/eval_partitions/nem_file_13', 42]
2023-05-07 09:15:27 partition.py:l89 DEBUG  Running NEM...
2023-05-07 09:15:27 partition.py:l89 DEBUG  Running NEM...
2023-05-07 09:15:27 partition.py:l90 DEBUG  [b'/tmp/tmpix2newx6/eval_partitions/nem_file', 15, b'nem', 0, b'clas', 0.01, b'fuzzy', 10, True, b'bern', b'pk', b'sk_', 2, b'/tmp/tmpix2newx6/eval_partitions/nem_file_init_15.m', b'/tmp/tmpix2newx6/eval_partitions/nem_file_15', 42]
2023-05-07 09:15:27 partition.py:l90 DEBUG  [b'/tmp/tmpix2newx6/eval_partitions/nem_file', 14, b'nem', 0, b'clas', 0.01, b'fuzzy', 10, True, b'bern', b'pk', b'sk_', 2, b'/tmp/tmpix2newx6/eval_partitions/nem_file_init_14.m', b'/tmp/tmpix2newx6/eval_partitions/nem_file_14', 42]
2023-05-07 09:15:27 partition.py:l89 DEBUG  Running NEM...
2023-05-07 09:15:27 partition.py:l89 DEBUG  Running NEM...
2023-05-07 09:15:27 partition.py:l90 DEBUG  [b'/tmp/tmpix2newx6/eval_partitions/nem_file', 16, b'nem', 0, b'clas', 0.01, b'fuzzy', 10, True, b'bern', b'pk', b'sk_', 2, b'/tmp/tmpix2newx6/eval_partitions/nem_file_init_16.m', b'/tmp/tmpix2newx6/eval_partitions/nem_file_16', 42]
2023-05-07 09:15:27 partition.py:l55 DEBUG  run_partitioning...
2023-05-07 09:15:27 partition.py:l90 DEBUG  [b'/tmp/tmpix2newx6/eval_partitions/nem_file', 17, b'nem', 0, b'clas', 0.01, b'fuzzy', 10, True, b'bern', b'pk', b'sk_', 2, b'/tmp/tmpix2newx6/eval_partitions/nem_file_init_17.m', b'/tmp/tmpix2newx6/eval_partitions/nem_file_17', 42]
2023-05-07 09:15:27 partition.py:l89 DEBUG  Running NEM...
2023-05-07 09:15:27 partition.py:l90 DEBUG  [b'/tmp/tmpix2newx6/eval_partitions/nem_file', 18, b'nem', 0, b'clas', 0.01, b'fuzzy', 10, True, b'bern', b'pk', b'sk_', 2, b'/tmp/tmpix2newx6/eval_partitions/nem_file_init_18.m', b'/tmp/tmpix2newx6/eval_partitions/nem_file_18', 42]
2023-05-07 09:15:27 partition.py:l55 DEBUG  run_partitioning...
2023-05-07 09:15:27 partition.py:l89 DEBUG  Running NEM...
2023-05-07 09:15:27 partition.py:l90 DEBUG  [b'/tmp/tmpix2newx6/eval_partitions/nem_file', 19, b'nem', 0, b'clas', 0.01, b'fuzzy', 10, True, b'bern', b'pk', b'sk_', 2, b'/tmp/tmpix2newx6/eval_partitions/nem_file_init_19.m', b'/tmp/tmpix2newx6/eval_partitions/nem_file_19', 42]
2023-05-07 09:15:27 partition.py:l89 DEBUG  Running NEM...
2023-05-07 09:15:27 partition.py:l90 DEBUG  [b'/tmp/tmpix2newx6/eval_partitions/nem_file', 20, b'nem', 0, b'clas', 0.01, b'fuzzy', 10, True, b'bern', b'pk', b'sk_', 2, b'/tmp/tmpix2newx6/eval_partitions/nem_file_init_20.m', b'/tmp/tmpix2newx6/eval_partitions/nem_file_20', 42]
2023-05-07 09:15:27 partition.py:l104 DEBUG After running NEM...
2023-05-07 09:15:27 partition.py:l113 DEBUG No NEM output file found: /tmp/tmpix2newx6/eval_partitions/nem_file_18.uf
2023-05-07 09:15:27 partition.py:l166 DEBUG partitioning did not work (the number of organisms used is probably too low), see logs here to obtain more details /tmp/tmpix2newx6/eval_partitions/nem_file_18.log
  5%|██▍                                           | 1/19 [00:00<00:02,  7.06Number of partitions/s]
2023-05-07 09:15:27 partition.py:l104 DEBUG After running NEM...
2023-05-07 09:15:27 partition.py:l108 DEBUG Reading NEM results...
2023-05-07 09:15:28 partition.py:l104 DEBUG After running NEM...
2023-05-07 09:15:28 partition.py:l108 DEBUG Reading NEM results...
2023-05-07 09:15:28 partition.py:l104 DEBUG After running NEM...
2023-05-07 09:15:28 partition.py:l108 DEBUG Reading NEM results...
2023-05-07 09:15:28 partition.py:l104 DEBUG After running NEM...
2023-05-07 09:15:28 partition.py:l108 DEBUG Reading NEM results...
 26%|████████████                                  | 5/19 [00:00<00:00, 22.24Number of partitions/s]
2023-05-07 09:15:28 partition.py:l104 DEBUG After running NEM...
2023-05-07 09:15:28 partition.py:l108 DEBUG Reading NEM results...
2023-05-07 09:15:28 partition.py:l104 DEBUG After running NEM...
2023-05-07 09:15:28 partition.py:l113 DEBUG No NEM output file found: /tmp/tmpix2newx6/eval_partitions/nem_file_12.uf
2023-05-07 09:15:28 partition.py:l166 DEBUG partitioning did not work (the number of organisms used is probably too low), see logs here to obtain more details /tmp/tmpix2newx6/eval_partitions/nem_file_12.log
2023-05-07 09:15:28 partition.py:l104 DEBUG After running NEM...
2023-05-07 09:15:28 partition.py:l108 DEBUG Reading NEM results...
2023-05-07 09:15:28 partition.py:l104 DEBUG After running NEM...
2023-05-07 09:15:28 partition.py:l113 DEBUG No NEM output file found: /tmp/tmpix2newx6/eval_partitions/nem_file_13.uf
2023-05-07 09:15:28 partition.py:l166 DEBUG partitioning did not work (the number of organisms used is probably too low), see logs here to obtain more details /tmp/tmpix2newx6/eval_partitions/nem_file_13.log
2023-05-07 09:15:28 partition.py:l104 DEBUG After running NEM...
2023-05-07 09:15:28 partition.py:l113 DEBUG No NEM output file found: /tmp/tmpix2newx6/eval_partitions/nem_file_14.uf
2023-05-07 09:15:28 partition.py:l166 DEBUG partitioning did not work (the number of organisms used is probably too low), see logs here to obtain more details /tmp/tmpix2newx6/eval_partitions/nem_file_14.log
2023-05-07 09:15:28 partition.py:l104 DEBUG After running NEM...
2023-05-07 09:15:28 partition.py:l108 DEBUG Reading NEM results...
2023-05-07 09:15:28 partition.py:l104 DEBUG After running NEM...
2023-05-07 09:15:28 partition.py:l113 DEBUG No NEM output file found: /tmp/tmpix2newx6/eval_partitions/nem_file_15.uf
2023-05-07 09:15:28 partition.py:l166 DEBUG partitioning did not work (the number of organisms used is probably too low), see logs here to obtain more details /tmp/tmpix2newx6/eval_partitions/nem_file_15.log
2023-05-07 09:15:28 partition.py:l104 DEBUG After running NEM...
2023-05-07 09:15:28 partition.py:l113 DEBUG No NEM output file found: /tmp/tmpix2newx6/eval_partitions/nem_file_16.uf
2023-05-07 09:15:28 partition.py:l166 DEBUG partitioning did not work (the number of organisms used is probably too low), see logs here to obtain more details /tmp/tmpix2newx6/eval_partitions/nem_file_16.log
2023-05-07 09:15:28 partition.py:l104 DEBUG After running NEM...
2023-05-07 09:15:28 partition.py:l113 DEBUG No NEM output file found: /tmp/tmpix2newx6/eval_partitions/nem_file_17.uf
2023-05-07 09:15:28 partition.py:l166 DEBUG partitioning did not work (the number of organisms used is probably too low), see logs here to obtain more details /tmp/tmpix2newx6/eval_partitions/nem_file_17.log
2023-05-07 09:15:28 partition.py:l104 DEBUG After running NEM...
2023-05-07 09:15:28 partition.py:l113 DEBUG No NEM output file found: /tmp/tmpix2newx6/eval_partitions/nem_file_19.uf
2023-05-07 09:15:28 partition.py:l166 DEBUG partitioning did not work (the number of organisms used is probably too low), see logs here to obtain more details /tmp/tmpix2newx6/eval_partitions/nem_file_19.log
 79%|███████████████████████████████████▌         | 15/19 [00:00<00:00, 53.35Number of partitions/s]
2023-05-07 09:15:28 partition.py:l104 DEBUG After running NEM...
2023-05-07 09:15:28 partition.py:l113 DEBUG No NEM output file found: /tmp/tmpix2newx6/eval_partitions/nem_file_20.uf
2023-05-07 09:15:28 partition.py:l166 DEBUG partitioning did not work (the number of organisms used is probably too low), see logs here to obtain more details /tmp/tmpix2newx6/eval_partitions/nem_file_20.log
2023-05-07 09:15:28 partition.py:l104 DEBUG After running NEM...
2023-05-07 09:15:28 partition.py:l108 DEBUG Reading NEM results...
2023-05-07 09:15:28 partition.py:l104 DEBUG After running NEM...
2023-05-07 09:15:28 partition.py:l108 DEBUG Reading NEM results...
 95%|██████████████████████████████████████████▋  | 18/19 [00:19<00:00, 53.35Number of partitions/s]

In the NEM log file we can see that this happened because NEM failed:

FATAL ERROR, NEM criteria reach infinite value

The behaviour (hanging when the error happens) makes it difficult to run PPanGGOLiN programmatically across multiple pangenomes. If some of the NEM runs fail, I'd expect the software to evaluate only the ones that were successful.
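
As a rough sketch of what "evaluate only the successful ones" could look like from the outside: the debug log above shows that a failed run leaves no nem_file_<K>.uf file behind, so a wrapper could scan the evaluation directory and keep only the K values whose output exists. The function name and the assumed directory layout are taken from the log for illustration; this is not part of PPanGGOLiN's API.

```python
from pathlib import Path


def successful_k_values(eval_dir, k_min, k_max):
    """Return the K values for which a NEM output file (.uf) exists.

    Assumes the layout seen in the debug log: <eval_dir>/nem_file_<K>.uf.
    This helper is hypothetical and not part of PPanGGOLiN.
    """
    eval_dir = Path(eval_dir)
    return [
        k
        for k in range(k_min, k_max + 1)
        if (eval_dir / f"nem_file_{k}.uf").is_file()
    ]
```

A caller would then choose the best K among the returned values only, and treat an empty list as "partitioning failed for every K".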

Removing identical sequences solved this specific case for me, but I'm not sure this solution is 100% bulletproof.

apcamargo commented 1 year ago

With a single CPU the program exits after a segmentation fault:

2023-05-07 09:45:31 utils.py:l116 INFO  Command: /clusterfs/jgi/groups/science/homes/antoniop.camargo/.micromamba/envs/ppanggolin/bin/ppanggolin partition -c 1 --krange 3 20 --chunk_size 100000 -p TEST/sampled_complete_sequences_ppanggolin_50/sampled_complete_sequences.h5 --verbose 2
2023-05-07 09:45:31 utils.py:l117 INFO  PPanGGOLiN version: 1.2.105
2023-05-07 09:45:31 readBinaries.py:l71 INFO    Getting the current pangenome status
2023-05-07 09:45:31 readBinaries.py:l505 INFO   Reading pangenome annotations...
100%|███████████████████████████████████████████████████| 14954/14954 [00:00<00:00, 439957.23gene/s]
100%|██████████████████████████████████████████████████████| 214/214 [00:00<00:00, 916.01organism/s]
2023-05-07 09:45:31 readBinaries.py:l519 INFO   Reading pangenome gene families...
100%|████████████████████████████████████████████| 14954/14954 [00:00<00:00, 364268.79gene family/s]
100%|████████████████████████████████████████████████| 136/136 [00:00<00:00, 243667.38gene family/s]
2023-05-07 09:45:31 readBinaries.py:l527 INFO   Reading the neighbors graph edges...
100%|███████████████████████████████████████| 14717/14717 [00:00<00:00, 377536.36contig adjacency/s]
2023-05-07 09:45:31 partition.py:l482 INFO  Estimating the optimal number of partitions...
2023-05-07 09:45:31 partition.py:l254 DEBUG Writing nem_file.str nem_file.index nem_file.nei and nem_file.dat files
2023-05-07 09:45:31 partition.py:l55 DEBUG  run_partitioning...
2023-05-07 09:45:31 partition.py:l89 DEBUG  Running NEM...
2023-05-07 09:45:31 partition.py:l90 DEBUG  [b'/tmp/tmpov9ifeah/eval_partitions/nem_file', 2, b'nem', 0, b'clas', 0.01, b'fuzzy', 10, True, b'bern', b'pk', b'sk_', 2, b'/tmp/tmpov9ifeah/eval_partitions/nem_file_init_2.m', b'/tmp/tmpov9ifeah/eval_partitions/nem_file_2', 42]
2023-05-07 09:45:31 partition.py:l104 DEBUG After running NEM...
2023-05-07 09:45:31 partition.py:l108 DEBUG Reading NEM results...
2023-05-07 09:45:31 partition.py:l55 DEBUG  run_partitioning...
2023-05-07 09:45:31 partition.py:l89 DEBUG  Running NEM...
2023-05-07 09:45:31 partition.py:l90 DEBUG  [b'/tmp/tmpov9ifeah/eval_partitions/nem_file', 3, b'nem', 0, b'clas', 0.01, b'fuzzy', 10, True, b'bern', b'pk', b'sk_', 2, b'/tmp/tmpov9ifeah/eval_partitions/nem_file_init_3.m', b'/tmp/tmpov9ifeah/eval_partitions/nem_file_3', 42]
2023-05-07 09:45:31 partition.py:l104 DEBUG After running NEM...
2023-05-07 09:45:31 partition.py:l108 DEBUG Reading NEM results...
2023-05-07 09:45:31 partition.py:l55 DEBUG  run_partitioning...
2023-05-07 09:45:31 partition.py:l89 DEBUG  Running NEM...
2023-05-07 09:45:31 partition.py:l90 DEBUG  [b'/tmp/tmpov9ifeah/eval_partitions/nem_file', 4, b'nem', 0, b'clas', 0.01, b'fuzzy', 10, True, b'bern', b'pk', b'sk_', 2, b'/tmp/tmpov9ifeah/eval_partitions/nem_file_init_4.m', b'/tmp/tmpov9ifeah/eval_partitions/nem_file_4', 42]
2023-05-07 09:45:31 partition.py:l104 DEBUG After running NEM...
2023-05-07 09:45:31 partition.py:l108 DEBUG Reading NEM results...
2023-05-07 09:45:31 partition.py:l55 DEBUG  run_partitioning...
2023-05-07 09:45:31 partition.py:l89 DEBUG  Running NEM...
2023-05-07 09:45:31 partition.py:l90 DEBUG  [b'/tmp/tmpov9ifeah/eval_partitions/nem_file', 5, b'nem', 0, b'clas', 0.01, b'fuzzy', 10, True, b'bern', b'pk', b'sk_', 2, b'/tmp/tmpov9ifeah/eval_partitions/nem_file_init_5.m', b'/tmp/tmpov9ifeah/eval_partitions/nem_file_5', 42]
2023-05-07 09:45:31 partition.py:l104 DEBUG After running NEM...
2023-05-07 09:45:31 partition.py:l108 DEBUG Reading NEM results...
2023-05-07 09:45:31 partition.py:l55 DEBUG  run_partitioning...
2023-05-07 09:45:31 partition.py:l89 DEBUG  Running NEM...
2023-05-07 09:45:31 partition.py:l90 DEBUG  [b'/tmp/tmpov9ifeah/eval_partitions/nem_file', 6, b'nem', 0, b'clas', 0.01, b'fuzzy', 10, True, b'bern', b'pk', b'sk_', 2, b'/tmp/tmpov9ifeah/eval_partitions/nem_file_init_6.m', b'/tmp/tmpov9ifeah/eval_partitions/nem_file_6', 42]
2023-05-07 09:45:31 partition.py:l104 DEBUG After running NEM...
2023-05-07 09:45:31 partition.py:l108 DEBUG Reading NEM results...
2023-05-07 09:45:31 partition.py:l55 DEBUG  run_partitioning...
2023-05-07 09:45:31 partition.py:l89 DEBUG  Running NEM...
2023-05-07 09:45:31 partition.py:l90 DEBUG  [b'/tmp/tmpov9ifeah/eval_partitions/nem_file', 7, b'nem', 0, b'clas', 0.01, b'fuzzy', 10, True, b'bern', b'pk', b'sk_', 2, b'/tmp/tmpov9ifeah/eval_partitions/nem_file_init_7.m', b'/tmp/tmpov9ifeah/eval_partitions/nem_file_7', 42]
2023-05-07 09:45:31 partition.py:l104 DEBUG After running NEM...
2023-05-07 09:45:31 partition.py:l108 DEBUG Reading NEM results...
2023-05-07 09:45:31 partition.py:l55 DEBUG  run_partitioning...
2023-05-07 09:45:31 partition.py:l89 DEBUG  Running NEM...
2023-05-07 09:45:31 partition.py:l90 DEBUG  [b'/tmp/tmpov9ifeah/eval_partitions/nem_file', 8, b'nem', 0, b'clas', 0.01, b'fuzzy', 10, True, b'bern', b'pk', b'sk_', 2, b'/tmp/tmpov9ifeah/eval_partitions/nem_file_init_8.m', b'/tmp/tmpov9ifeah/eval_partitions/nem_file_8', 42]
2023-05-07 09:45:31 partition.py:l104 DEBUG After running NEM...
2023-05-07 09:45:31 partition.py:l108 DEBUG Reading NEM results...
2023-05-07 09:45:31 partition.py:l55 DEBUG  run_partitioning...
2023-05-07 09:45:31 partition.py:l89 DEBUG  Running NEM...
2023-05-07 09:45:31 partition.py:l90 DEBUG  [b'/tmp/tmpov9ifeah/eval_partitions/nem_file', 9, b'nem', 0, b'clas', 0.01, b'fuzzy', 10, True, b'bern', b'pk', b'sk_', 2, b'/tmp/tmpov9ifeah/eval_partitions/nem_file_init_9.m', b'/tmp/tmpov9ifeah/eval_partitions/nem_file_9', 42]
2023-05-07 09:45:31 partition.py:l104 DEBUG After running NEM...
2023-05-07 09:45:31 partition.py:l108 DEBUG Reading NEM results...
2023-05-07 09:45:31 partition.py:l55 DEBUG  run_partitioning...
2023-05-07 09:45:31 partition.py:l89 DEBUG  Running NEM...
2023-05-07 09:45:31 partition.py:l90 DEBUG  [b'/tmp/tmpov9ifeah/eval_partitions/nem_file', 10, b'nem', 0, b'clas', 0.01, b'fuzzy', 10, True, b'bern', b'pk', b'sk_', 2, b'/tmp/tmpov9ifeah/eval_partitions/nem_file_init_10.m', b'/tmp/tmpov9ifeah/eval_partitions/nem_file_10', 42]
2023-05-07 09:45:32 partition.py:l104 DEBUG After running NEM...
2023-05-07 09:45:32 partition.py:l108 DEBUG Reading NEM results...
2023-05-07 09:45:32 partition.py:l55 DEBUG  run_partitioning...
2023-05-07 09:45:32 partition.py:l89 DEBUG  Running NEM...
2023-05-07 09:45:32 partition.py:l90 DEBUG  [b'/tmp/tmpov9ifeah/eval_partitions/nem_file', 11, b'nem', 0, b'clas', 0.01, b'fuzzy', 10, True, b'bern', b'pk', b'sk_', 2, b'/tmp/tmpov9ifeah/eval_partitions/nem_file_init_11.m', b'/tmp/tmpov9ifeah/eval_partitions/nem_file_11', 42]
Segmentation fault (core dumped)

apcamargo commented 1 year ago

It seems that multiprocessing.Pool hanging when one process fails is a known bug in Python. Probably the best way to deal with this is to fix the NEM code to prevent the segmentation fault? I'll try to take a look at the code, but my C knowledge is pretty limited.
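
A possible workaround on the caller's side, sketched under assumptions: run each command in its own child process with a timeout, so that a crash or a hang surfaces as a failed result instead of blocking the whole batch. The name run_isolated is hypothetical, and the two child commands below are stand-ins for a real job; only the standard-library subprocess module is used.

```python
import subprocess
import sys


def run_isolated(argv, timeout=60):
    """Run a command in its own process and report whether it succeeded.

    Returns (ok, detail). A crash such as a segmentation fault shows up as a
    non-zero return code (negative on POSIX when killed by a signal), and a
    stuck process is stopped after `timeout` seconds instead of blocking.
    """
    try:
        proc = subprocess.run(argv, capture_output=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return False, "timeout"
    if proc.returncode != 0:
        return False, f"exit code {proc.returncode}"
    return True, "ok"


if __name__ == "__main__":
    # Stand-ins for a real job: one child exits cleanly, one aborts abruptly.
    good = [sys.executable, "-c", "pass"]
    bad = [sys.executable, "-c", "import os; os.abort()"]
    print(run_isolated(good))
    print(run_isolated(bad))
```

This does not fix the underlying crash; it only keeps a driver script alive so that the remaining pangenomes can still be processed.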

axbazin commented 1 year ago

Hi,

This looks similar to a problem we had a few years ago with NEM reaching infinite values when running PPanGGOLiN on too few genomes. Would it be possible for you to share the h5 file that led to this error so someone from the team can try to look at this and fix it?

Adelme

apcamargo commented 1 year ago

Sure! Can I share this via email?

axbazin commented 1 year ago

If it is small enough to fit in an email, yes, sure! You can send it to me at: adelme dot bazin at gmail dot com

apcamargo commented 1 year ago

Thanks! I've sent it to you.

axbazin commented 1 year ago

Got it, thanks!

ggautreau commented 1 year ago

Hello @apcamargo,

@axbazin sent me your .h5 file, and it appears that your input is not typical for usage in PPanGGOLiN since it represents MGEs (as you said) rather than entire genomes. This results in a substantially lower number of gene families (182) than a typical pangenome.

Nonetheless, PPanGGOLiN can still process your data if you modify some parameters.

You have two options: either include -fd during the partitioning phase (this removes the requirement for identical dispersion across all genomes in each partition of the Bernoulli mixture model) or reduce the maximum K value explored from 20 to 10 by adding -Kmm 2 10. In both cases, the persistent genome will be very similar (around 70 families).
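
For scripted runs, the two options above could be assembled like this. It is a minimal sketch: the helper name partition_command is hypothetical, and it uses only the options quoted in this thread (-p, -c, -fd and -Kmm); other options are left out.

```python
def partition_command(pangenome, cpus=1, free_dispersion=False, krange=None):
    """Build an argv list for `ppanggolin partition`.

    Uses only the options quoted in this thread: -p, -c, -fd and -Kmm.
    This helper is hypothetical and not part of PPanGGOLiN.
    """
    argv = ["ppanggolin", "partition", "-p", str(pangenome), "-c", str(cpus)]
    if free_dispersion:
        argv.append("-fd")
    if krange is not None:
        k_min, k_max = krange
        argv += ["-Kmm", str(k_min), str(k_max)]
    return argv


if __name__ == "__main__":
    # Option 1: free dispersion. Option 2: explore K from 2 to 10 only.
    print(" ".join(partition_command("pangenome.h5", free_dispersion=True)))
    print(" ".join(partition_command("pangenome.h5", krange=(2, 10))))
```

The file name pangenome.h5 is a placeholder for the actual pangenome file.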

Guillaume

apcamargo commented 1 year ago

Thank you, @ggautreau

I hadn't considered using -fd before. Is there any biology underlying the decision to require identical dispersion across all genomes in a partition? Maybe this is the reason the problem was alleviated once I removed identical genomes? I'm just curious about what is causing NEM to fail.

As you mentioned, reducing the K range is a good solution. However, I wonder whether it would be a good idea to make the execution robust to such errors. It is not obvious to the user what is going on behind the scenes, and it is still not clear to me what caused the segmentation fault. Of course, my use case is very niche, so it might not be worth your time to implement these safeguards.

ggautreau commented 1 year ago

During the development phase, I noticed that the increased total parameter count in the NEM model caused by -fd can lead to slight overfitting, and therefore to inconsistent partitioning when the genome set is resampled multiple times. In your situation, it is plausible that the very low number of families is the cause of the problem. Consequently, the NEM algorithm might have struggled to handle empty partitions, especially when the value of K is high.
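
To give a feel for the parameter increase, here is a back-of-the-envelope count of dispersion parameters only. The counting rule is an assumption read from the description of -fd earlier in the thread (one dispersion per partition when it is shared across genomes, one per partition and genome otherwise); it is not taken from the NEM source code, and the function name is hypothetical.

```python
def dispersion_parameter_count(n_partitions, n_genomes, free_dispersion):
    """Count dispersion parameters in a Bernoulli mixture.

    Assumption: one dispersion per partition when it is shared across
    genomes, and one per (partition, genome) pair with free dispersion.
    """
    if free_dispersion:
        return n_partitions * n_genomes
    return n_partitions


if __name__ == "__main__":
    # 214 organisms is the count shown in the log; K=20 is the upper bound tried.
    print(dispersion_parameter_count(20, 214, free_dispersion=False))  # 20
    print(dispersion_parameter_count(20, 214, free_dispersion=True))   # 4280
```

With only a couple of hundred gene families as observations, a model with thousands of free parameters has little data per parameter, which is consistent with the overfitting concern.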

This issue necessitates a thorough inspection of the NEM's C code, specifically for this atypical circumstance. Unfortunately, I'm currently unable to allocate the required time for this task. I will leave this issue unresolved for now, with the intention to include an error message to handle such situations in a future version.

Guillaume.

apcamargo commented 1 year ago

That makes sense. I agree this is not really important, given how 99% of the users will use the tool. I can try to implement some safeguards myself.

Thanks for the support!