hoelzer closed this 3 years ago
The mmseqs2 step is now split into one subworkflow and one process:
Workflow mmseqs2_dbs: The workflow checks whether the query db, target db and target index files already exist. The tar.gz files are then either loaded or built in the corresponding processes.
Process mmseqs2: The process takes the loaded dbs and the indexed target db and runs the search for the hyprots.
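As a rough illustration of the load-or-build decision described above, here is a minimal Python sketch. It is not the pipeline's Nextflow code; the function name `plan_db_steps`, the archive names and the directory layout are all assumptions made for the example.

```python
from pathlib import Path


def plan_db_steps(db_dir, names=("query_db", "target_db", "target_index")):
    """Decide per database whether to load an existing archive or build it.

    The archive names (<name>.tar.gz) are hypothetical; the real workflow
    may name and store its files differently.
    """
    db_dir = Path(db_dir)
    plan = {}
    for name in names:
        archive = db_dir / f"{name}.tar.gz"
        # An existing archive can be loaded; a missing one has to be built.
        plan[name] = "load" if archive.is_file() else "build"
    return plan
```

Calling `plan_db_steps("dbs")` returns a dict mapping each database name to either `"load"` or `"build"`, which mirrors how the subworkflow routes each input to the matching process.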
Why do we actually index the query and the target? :) Because we perform the mmseqs2 alignment in both directions?
In order for mmseqs2 to perform the search both the query and target FASTA files need to be converted into sequence databases.
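To make the order of steps concrete, the sketch below assembles the MMseqs2 calls as argument lists without running them. The subcommands `createdb`, `createindex` and `search` are standard MMseqs2 modules; the database names (`queryDB`, `targetDB`, `resultDB`), the `tmp` directory and the helper `mmseqs_commands` are assumptions for illustration, and the pipeline's own script may use different names and extra options.

```python
def mmseqs_commands(query_fasta, target_fasta, tmp_dir="tmp"):
    """Return the MMseqs2 calls for a basic search as argument lists.

    Database names and tmp_dir are placeholders, not the pipeline's values.
    """
    return [
        # Convert both FASTA files into MMseqs2 sequence databases.
        ["mmseqs", "createdb", query_fasta, "queryDB"],
        ["mmseqs", "createdb", target_fasta, "targetDB"],
        # Precompute the index for the target database only.
        ["mmseqs", "createindex", "targetDB", tmp_dir],
        # Search the query database against the target database.
        ["mmseqs", "search", "queryDB", "targetDB", "resultDB", tmp_dir],
    ]
```

Each list could be passed to `subprocess.run(cmd, check=True)` on a machine where `mmseqs` is installed. Note that only the target receives an index here, while both inputs go through `createdb`.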
Ah, thanks for the info! I had in mind that, as with BLAST, only an indexed db of the target is needed.
@EvaFriederike I am not 100% sure yet how to handle the mmseqs2 step.
Currently, everything happens in one process. Here, the mmseqs.sh script is called, which indexes the db and then performs the run. I think that, because Nextflow generates a new tmp working dir every time, the indexing is run again and again.
I suggest separating this:
mmseqs2.sh