-
Hi there,
First of all, I want to express my admiration for the excellent work you've done on this project. The effort and dedication are truly commendable.
I have a couple of questions regardin…
-
Hi Ben - I'm running diamond linclust on a massive protein set and it's taking longer than expected based on the data shown in figure S1 of your manuscript.
In that figure, you reported 0.93 hours …
-
## Context
I would like to cluster my dataset, which contains approximately 35,000 protein sequences. I need to group the sequences by superfamily. Thus, I would like to set the numbe…
-
Hi! I have an issue downloading the databases for HPC:
Looks like my install worked but I can't download the databases for HPC:
(/lustre/project/taw/share/conda-envs/hecatomb) [kvigil@cypress2 c…
-
Hi,
I'm trying to use the taxonomy feature and when I do, my output DB seems to be split in many smaller DBs. Is there any way to control this split? I'd like to just turn it off. I have 1 TB of me…
-
Hi there,
I am currently using mmseqs to cluster more than 20 billion protein sequences. I intend to complete the task by running the createdb, clusthash, and linclust modules. However, the createdb modul…
-
Given the lack of scalability of all-vs-all blastn, it would be great to have the option to use `mmseqs linclust` as an alternative.
-
Hi, thanks for making this toolkit! I'm excited to start using it with my data.
I have a set of viral genomes that I would like to cluster. From the wiki and the paper, I understand that linclust b…
-
I'm evaluating PPanGGOLiN on MGE genomes and I noticed that some genomes contained multiple genes within the same cluster, which you wouldn't expect for very compact genomes. Upon further investigati…
-
## Bug report
### Expected behavior and actual behavior
Described briefly in this Gitter comment, [here](https://gitter.im/nextflow-io/nextflow?at=5e4640e5b401eb68a5783830), I have a fairly lar…