larssnip / micropan

R package for microbial pangenomics

Too many genomes? #4

Open alouyakis opened 4 years ago

alouyakis commented 4 years ago

Hi. I'm trying to run 84 genomes through micropan. I'm getting an error that must be related to the number of genomes, but wanted to run it past you. Any ideas for a fix other than reducing the genome number?

> cluster.blast_complete <- bClust(blast.distances, linkage = "complete", threshold=0.7)
bClust:
...constructing graph with 368291 sequences (nodes) and 16368142 distances (edges)
...found 33605 single linkage clusters
...found 3188 incomplete clusters, splitting:
...........Error in hclust(as.dist(dmat), method = "complete") : 
  size cannot be NA nor exceed 65536

larssnip commented 3 years ago

Sorry for my late reply.

I don't think it's the number of genomes. 84 should not be a problem as such, although the memory available to R is always limited by the computer it is running on.

Could the NA be the problem here, for some reason? I would start by inspecting the BLAST distances: are there any NAs?
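
A quick way to run that check, as a minimal sketch in base R. It assumes `blast.distances` is an ordinary data frame; the toy table and its column names (`Query`, `Hit`, `Distance`) are made up for illustration and may not match what your version of micropan returns, so substitute your own object for the stand-in.

```r
# Toy stand-in for the real blast.distances object.
# The column names are illustrative assumptions, not necessarily micropan's.
blast.distances <- data.frame(
  Query    = c("G1_seq1", "G1_seq2", "G2_seq1"),
  Hit      = c("G2_seq1", "G2_seq3", NA),
  Distance = c(0.12, NA, 0.55),
  stringsAsFactors = FALSE
)

# Number of missing values in each column
na.per.column <- colSums(is.na(blast.distances))
print(na.per.column)

# The rows that contain at least one missing value
na.rows <- blast.distances[!complete.cases(blast.distances), ]
print(na.rows)
```

If every count is zero, missing values in the distance table can probably be ruled out, and the other half of the error message (a size above 65536) would be the next thing to look at.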