Open bregman3 opened 1 month ago
Could you please post the full command line call and terminal output? Additionally, please post an excerpt of the result file. This doesn't sound like something that should happen without explicitly requesting some parameters (i.e. --alt-ali).
Hello, here is the command line call:

foldseek easy-search --exhaustive-search --max-seqs 10000 5sxy.pdb $BIODB/afdb/afdb aln4 FoldSeek

I realize --max-seqs is not useful here, since the exhaustive search skips the prefilter. My input is a single-chain PDB. I'm a little confused because the output, which I copied and pasted into an Excel sheet so I could highlight the repeated hit, also lists different models for my query protein despite the single input.
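A quick way to check whether a repeated hit is really the same row duplicated, or the same target reported under several query names, is to group the result rows by target. The sketch below assumes the result file is tab-separated with the query name in the first column and the target name in the second; `repeated_targets` is a hypothetical helper, and the row values in the usage note are made up for illustration.

```python
from collections import defaultdict


def repeated_targets(lines):
    """Map each target to the query names that hit it, keeping only
    targets reported under more than one query name.

    Assumes tab-separated rows with query in column 1 and target in
    column 2 (an assumption about the result file layout).
    """
    queries_per_target = defaultdict(set)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed rows
        query, target = fields[0], fields[1]
        queries_per_target[target].add(query)
    return {
        target: sorted(queries)
        for target, queries in queries_per_target.items()
        if len(queries) > 1
    }
```

For example, `repeated_targets(open("aln4"))` would return an empty dict if every target was hit by only one query name. A non-empty result suggests the input was split into several queries rather than the database containing duplicate entries.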
That's an NMR structure. Each model becomes another query, which likely results in exactly the same result list for each query.
NMR structures are a bit of a footgun with foldseek.
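One possible workaround is to reduce the NMR file to its first model before searching, so that only one query is produced. This is a minimal sketch, not a Foldseek feature: it assumes a standard PDB-format file in which each model ends with an `ENDMDL` record, and `first_model_only` is a hypothetical helper name.

```python
def first_model_only(pdb_text):
    """Return PDB text truncated after the first ENDMDL record.

    Header records and the first model are kept; later models and any
    trailing records are dropped. Files without ENDMDL records are
    returned with their content unchanged.
    """
    kept = []
    for line in pdb_text.splitlines():
        kept.append(line)
        if line.startswith("ENDMDL"):
            break  # stop at the end of the first model
    # Terminate the file with an END record if it does not have one.
    if not kept or kept[-1].strip() != "END":
        kept.append("END")
    return "\n".join(kept) + "\n"
```

Usage would look like reading `5sxy.pdb`, passing its contents through `first_model_only`, and writing the result to a new file (for example a hypothetical `5sxy_model1.pdb`) that is then given to foldseek easy-search as the query.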
oh okay thank you so much!
Hello, I used the easy-search function against the AlphaFold database with the exhaustive search option, and I noticed my results were greatly inflated: the same protein ID would be hit multiple times. My query structure was a PDB file containing a single chain. The hit was also a single chain, and I checked that it occurs only once in both the AFDB and the UniProt DB. I also noticed that the afdb file I downloaded (I use the GitHub version) is significantly smaller than what the AFDB website says that database should be. Does anyone else have these issues?