steineggerlab / foldseek

Foldseek enables fast and sensitive comparisons of large structure sets.
https://foldseek.com
GNU General Public License v3.0

Easy cluster - Cannot use --lddt-threshold with --sort-by-structure-bits 0 #228

Open ruthalee opened 5 months ago

ruthalee commented 5 months ago

Hi, I am trying to cluster some PDB files with varying LDDT thresholds, but --lddt-threshold gets turned off during the run because of --sort-by-structure-bits 0. I saw in the readme that --sort-by-structure-bits can be disabled, but I don't see how to do that. Can you help with this?

My code: foldseek easy-cluster /ranked_0_pdbs Geo_OmcSTJ tmp -c 0.9 --lddt-threshold 0.9

Notification:
Cannot use --lddt-threshold with --sort-by-structure-bits 0
Disabling --lddt-threshold

Thank you!

martin-steinegger commented 5 months ago

What version do you use? I tried the most recent commit and could not reproduce this issue.

martin-steinegger commented 5 months ago

Could it be that you do not have a Calpha database?

ruthalee commented 5 months ago

I installed foldseek last week using:

conda install -c conda-forge -c bioconda foldseek

foldseek Version: 8.ef4e960

Where would I check for the Calpha database? Thank you!
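For anyone with the same question, here is a minimal sketch of one way to check for a Calpha database. It assumes that the Calpha data lives next to the main database under the same prefix with a `_ca` suffix (plus a `_ca.dbtype` file); that layout is an assumption and is not confirmed anywhere in this thread. `hasCalphaDb` is a hypothetical helper, not a Foldseek function.

```cpp
#include <filesystem>
#include <string>
#include <system_error>

// Hypothetical helper: returns true if a Calpha companion database appears
// to exist next to the given database prefix.
// ASSUMPTION: the companion files are named "<prefix>_ca" and
// "<prefix>_ca.dbtype"; adjust the suffixes if your layout differs.
bool hasCalphaDb(const std::string &dbPrefix) {
    namespace fs = std::filesystem;
    std::error_code ec;  // non-throwing overloads: any error counts as "missing"
    return fs::exists(dbPrefix + "_ca", ec) &&
           fs::exists(dbPrefix + "_ca.dbtype", ec);
}
```

Calling `hasCalphaDb` with the database prefix passed to foldseek would return false if either file is missing, which is the situation the question above is probing for.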

ruthalee commented 5 months ago

@martin-steinegger Hi, the release notes for the version I am using list a bug fix (quoted below). Unless I am reading this wrong, it seems that my 'error' is actually the intended behavior? I ran the command again with --sort-by-structure-bits 1 and --lddt-thr 0.9 and it did what I wanted it to do. The clusters now look a lot like the gene tree.

--lddt-thr and --tmscore-thr are ignored when --sort-by-structure-bits 0 is set (https://github.com/steineggerlab/foldseek/commit/b1b4710c5bb7fc42e3c73974a63153d86d77386a)

bool needTMaligner = (par.tmScoreThr > 0);
bool needLDDT = (par.lddtThr > 0);
if (par.sortByStructureBits) {
    needLDDT = true;
    needTMaligner = true;
} else {
    if (needTMaligner) {
        Debug(Debug::WARNING) << "Cannot use --tmscore-threshold with --sort-by-structure-bits 0\n"
                              << "Disabling --tmscore-threshold\n";
        needTMaligner = false;
    }
    if (needLDDT) {
        Debug(Debug::WARNING) << "Cannot use --lddt-threshold with --sort-by-structure-bits 0\n"
                              << "Disabling --lddt-threshold\n";
        needLDDT = false;
    }
}
bool needCalpha = (needTMaligner || needLDDT);
IndexReader *qcadbr = NULL;

martin-steinegger commented 4 months ago

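Below is what the code currently looks like. It no longer disables the thresholds just because --sort-by-structure-bits 0 is set; they are only disabled when one of the two databases has no Calpha data.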

    bool needTMaligner = (par.tmScoreThr > 0);
    bool needLDDT = (par.lddtThr > 0);
    if (par.sortByStructureBits) {
        needLDDT = true;
        needTMaligner = true;
    } else {
        if (needTMaligner && (db1CaExist == false || db2CaExist == false)) {
            Debug(Debug::WARNING) << "Cannot use --tmscore-threshold with --sort-by-structure-bits 0\n"
                                  << "Disabling --tmscore-threshold\n";
            needTMaligner = false;
        }
        if (needLDDT && (db1CaExist == false || db2CaExist == false)) {
            Debug(Debug::WARNING) << "Cannot use --lddt-threshold with --sort-by-structure-bits 0\n"
                                  << "Disabling --lddt-threshold\n";
            needLDDT = false;
        }
    }
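
To make the behavior of the current code easier to try out, here is a self-contained sketch of the same decision logic. `Params`, `Decision`, and `decide` are hypothetical stand-ins introduced only for illustration, and the warnings are collected in a vector instead of going through Foldseek's `Debug` stream. Only the branching mirrors the snippet above.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the parameter fields used in the snippet above.
struct Params {
    float tmScoreThr = 0.0f;
    float lddtThr = 0.0f;
    bool sortByStructureBits = true;
};

// Hypothetical result type: which computations are needed, plus any warnings.
struct Decision {
    bool needTMaligner = false;
    bool needLDDT = false;
    std::vector<std::string> warnings;
};

// Mirrors the branching of the current code: with --sort-by-structure-bits 0,
// a threshold is only disabled when one of the two databases lacks Calpha data.
Decision decide(const Params &par, bool db1CaExist, bool db2CaExist) {
    Decision d;
    d.needTMaligner = (par.tmScoreThr > 0);
    d.needLDDT = (par.lddtThr > 0);
    const bool caMissing = !db1CaExist || !db2CaExist;
    if (par.sortByStructureBits) {
        // Structure bits need both scores, regardless of the thresholds.
        d.needLDDT = true;
        d.needTMaligner = true;
    } else {
        if (d.needTMaligner && caMissing) {
            d.warnings.push_back("Cannot use --tmscore-threshold with --sort-by-structure-bits 0");
            d.needTMaligner = false;
        }
        if (d.needLDDT && caMissing) {
            d.warnings.push_back("Cannot use --lddt-threshold with --sort-by-structure-bits 0");
            d.needLDDT = false;
        }
    }
    return d;
}
```

With `lddtThr = 0.9` and `sortByStructureBits = false`, `decide(par, true, true)` keeps `needLDDT` set and produces no warning, while `decide(par, true, false)` clears it and records the warning from the original report. This is consistent with the earlier suggestion that a missing Calpha database triggers the message.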