functional-dark-side / agnostos-wf

error in step cluster_compositional_validation #11

Open mherold1 opened 2 years ago

mherold1 commented 2 years ago

Hello,

I have been trying to get the db_creation workflow to run and I am stuck at the step cluster_compositional_validation. The log file (logs/cval_stderr.err) reads:

Invalid database read for database data file=/project/scratch/p200005/agnostos_test/db_creation2/mmseqs_clustering/clu_seqDB, database index=/project/scratch/p200005/agnostos_test/db_creation2/mmseqs_clustering/clu_seqDB.index
Size of data: 2340609
Requested offset: 2584994
Can not open index file /project/scratch/p200005/agnostos_test/db_creation2/compositional_validation/comp_valDB_tmp_0_tmp_1.index!
srun: error: mel0075: task 0: Exited with exit code 1

The value reported as "Requested offset" varied between my test runs. Running mmseqs view on the database clu_seqDB works without error.
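One way to narrow down an "Invalid database read" like the one above is to compare every entry of the index file against the actual size of the data file. The following is only a minimal sketch: it assumes the usual MMseqs2 index layout of three tab-separated fields per line (key, offset, length), and the function name `find_bad_index_entries` is hypothetical, not part of MMseqs2 or agnostos-wf.

```python
from pathlib import Path


def find_bad_index_entries(data_path, index_path):
    """Return (data_size, entries whose byte range lies outside the data file).

    Assumption: each index line holds three tab-separated fields
    (key, offset, length), as in a typical MMseqs2 .index file.
    """
    data_size = Path(data_path).stat().st_size
    bad = []
    with open(index_path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 3:
                continue  # skip blank or malformed lines
            key, offset, length = fields[0], int(fields[1]), int(fields[2])
            # An entry is suspicious if it points past the end of the data file.
            if offset + length > data_size:
                bad.append((key, offset, length))
    return data_size, bad
```

Calling `find_bad_index_entries("clu_seqDB", "clu_seqDB.index")` in the mmseqs_clustering directory would list the entries that point beyond the data file, if any. A non-empty result suggests that the index describes more data than the data file actually contains.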

The output folder compositional_validation contains several tmp index files, most ending in .0 (except comp_valDB_tmp_0_tmp_0.index), but the one named in the log file is missing:

$ ls
SSN
alignments
comp_valDB_tmp_0
comp_valDB_tmp_0_tmp_0.index
comp_valDB_tmp_0_tmp_0.index.0
comp_valDB_tmp_0_tmp_1.index.0
comp_valDB_tmp_0_tmp_10.index.0
....

I tried to look into the script compositional_validation.sh, but I cannot find what could be going wrong at this step. Any help would be appreciated.

Best regards.

ChiaraVanni commented 2 years ago

Hi @mherold1! Which version of MMseqs2 are you using? I could try to reproduce your error if you share the clu_seqDB (or a subset of it if it is too big).

All the best,

Chiara

mherold1 commented 2 years ago

Hi and thanks for the quick reply. I tried to follow the installation script and I'm using the following version of MMseqs2:

MMseqs Version: 2f1db01c5109b07db23dc06df9d232e82b1b4b99-MPI

I attached my mmseqs_clustering directory: mmseqs_clustering.tar.gz

I was using the test dataset: https://ndownloader.figshare.com/files/25473332

Best regards, Malte

ChiaraVanni commented 2 years ago

Hi, I found the problem. No thread count was set for the clu_seqDB creation (now fixed: https://github.com/functional-dark-side/agnostos-wf/blob/b649044d359b9a43a2b4194e6d77661206000549/db_creation/rules/mmseqs_clustering_results.smk#L45). The number of threads determines how many DB files MMseqs creates, and therefore how many files then have to be concatenated into a single DB. In our testing cloud, the maximum number of threads used by MMseqs matched the default number of threads specified in the rule that concatenates the files. In your case it did not, so some of the clu_seqDB files were left out of the final DB.
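The mismatch described above can be checked directly on disk. This is a rough sketch under two assumptions that are not confirmed by the workflow code: that the per-thread split files are named `<db>.0`, `<db>.1`, ... next to the final DB, and that the concatenation step only picks up the splits numbered below its own thread count. The helper names `list_split_files` and `splits_left_out` are hypothetical.

```python
import re
from pathlib import Path


def list_split_files(db_path):
    """Return the sorted numeric suffixes of split files for db_path.

    Assumption: splits are named '<db>.0', '<db>.1', ... in the same folder.
    Files such as '<db>.index.0' are deliberately not counted.
    """
    db = Path(db_path)
    pattern = re.compile(re.escape(db.name) + r"\.(\d+)$")
    suffixes = []
    for entry in db.parent.iterdir():
        match = pattern.match(entry.name)
        if match:
            suffixes.append(int(match.group(1)))
    return sorted(suffixes)


def splits_left_out(db_path, concat_threads):
    """Splits that a concatenation over range(concat_threads) would skip."""
    return [n for n in list_split_files(db_path) if n >= concat_threads]
```

If `splits_left_out("mmseqs_clustering/clu_seqDB", concat_threads)` returns a non-empty list for the thread count used by the concatenation rule, the final DB is probably missing the sequences from those splits.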

To avoid rerunning the entire rule, you can recreate the clu_seqDB:

https://github.com/functional-dark-side/agnostos-wf/blob/b649044d359b9a43a2b4194e6d77661206000549/db_creation/rules/mmseqs_clustering_results.smk#L45

and re-concatenate the files:

https://github.com/functional-dark-side/agnostos-wf/blob/b649044d359b9a43a2b4194e6d77661206000549/db_creation/rules/mmseqs_clustering_results.smk#L96

This will not affect the results and the other rules.

Let me know if it works!

Best regards,

Chiara

mherold1 commented 2 years ago

Thanks!

Changing L45 to:

{params.mmseqs_bin} createseqfiledb {params.seqdb} {params.cludb} {params.cluseqdb} --threads {threads} 2>{log.err}

solved this for me.

Actually, I had a very similar issue before, when the number of threads for the rule mmseqs_clustering (set to 28 in the rule) was not the same as the number of threads in mmseqs_clustering_results (set in config.yaml).

In general, I am a bit confused about how many resources to provide for each step, and about what the hierarchy is when, for certain steps, a value such as the number of threads is defined both in config/cluster.yaml and in the rule itself. I set all the thread parameters and values in the cluster config file back to their defaults, which helped somewhat until the step cluster_classification, which takes very long or fails at this command:

scripts/mmseqs_double_search.sh --search /project/home/p200005/agnostos-wf/bin/mmseqs --mpi_runner 'srun --mpi=pspmi' --ltmp /project/scratch/p200005/tmp --cons /project/scratch/p200005/agnostos_test/db_creation3_default/cluster_classification/refined_not_annotated_cluster_cons.fasta --db_target /project/home/p200005/agnostos-wf/databases/uniref90.db --db_info /project/home/p200005/agnostos-wf/databases/uniref90.proteins.tsv.gz --evalue_filter scripts/evalue_filter.awk --evalue_threshold 0.6 --hypo_threshold 1.0 --hypo_patterns scripts/hypothetical_grep.tsv --grep rg --output /project/scratch/p200005/agnostos_test/db_creation3_default/cluster_classification/noannot_vs_uniref90.tsv --outdir /project/scratch/p200005/agnostos_test/db_creation3_default/cluster_classification --threads 28