Open lithces opened 7 months ago
Thanks for your attention. Can you share the exact steps you ran to execute the code?
I ran the MSA step as described in the "antibody-antigen complex" section of the README:
python projects/tfold_ag/gen_msa.py --fasta_file=examples/fasta.files/myseq.fasta --output_dir=examples/myseq
In myseq.fasta, I included two antibody chains and one antigen chain.
No error appeared during the MSA step. However, the subsequent predict.py run reported a key error in a dictionary. The problematic key comes from the end of the a3m file generated by the MSA step: 0x00.
Thank you for your interest in our work.
Regarding your question: in myseq.fasta you only need to provide the sequence of the antigen. tFold-Ag does not need to search for an MSA of the antibody, which brings a significant speed advantage.
Additionally, if you know the epitope of the antigen, you can also try to input the epitope information, which will bring a significant performance improvement.
I have the same issue. It seems that mmseqs uses 0x00 to separate different query sequences.
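If that observation is right, one possible workaround is to split the raw a3m output on the NUL byte before passing it on. This is only a minimal sketch and not part of tFold: the function name `split_a3m_on_nul` and the sample bytes are hypothetical, and it assumes each query's alignment is terminated by a single 0x00.

```python
from typing import List


def split_a3m_on_nul(raw: bytes) -> List[str]:
    """Split raw a3m bytes on NUL (0x00) and drop empty chunks.

    Assumption: each query's alignment block ends with one 0x00 byte,
    as observed in this thread. This is not tFold's own code.
    """
    chunks = raw.split(b"\x00")
    # Keep only non-empty blocks and normalise the trailing newline.
    return [c.decode("ascii").strip() + "\n" for c in chunks if c.strip()]


if __name__ == "__main__":
    # Hypothetical content: two query blocks, each terminated by 0x00.
    raw = b">q1\nACDE\n>hit1\nAC-E\n\x00>q2\nFGHI\n\x00"
    for i, block in enumerate(split_a3m_on_nul(raw)):
        print(f"--- block {i} ---")
        print(block, end="")
```

Each returned string could then be written to its own a3m file. Whether predict.py accepts the cleaned file for a given input is not confirmed here.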
tFold-Ag currently cannot handle multi-chain antigen or multi-antibody inputs. The sequence used to construct the MSA must be a single-chain antigen.
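Given that constraint, a quick pre-check of the FASTA file before running gen_msa.py can catch multi-record inputs early. The sketch below is an illustration only: the helper names `count_fasta_records` and `check_single_chain` are hypothetical, the sequences are placeholders, and it simply counts header lines starting with ">".

```python
def count_fasta_records(text: str) -> int:
    """Count FASTA records by counting header lines that start with '>'."""
    return sum(1 for line in text.splitlines() if line.startswith(">"))


def check_single_chain(text: str) -> None:
    """Raise ValueError unless the FASTA text holds exactly one record."""
    n = count_fasta_records(text)
    if n != 1:
        raise ValueError(
            f"expected exactly 1 sequence (single-chain antigen), found {n}"
        )


if __name__ == "__main__":
    # Placeholder sequences, not real antibody or antigen chains.
    multi = ">heavy\nEVQL\n>light\nDIQM\n>antigen\nMKTA\n"
    single = ">antigen\nMKTA\n"
    check_single_chain(single)  # passes silently
    try:
        check_single_chain(multi)
    except ValueError as err:
        print("rejected:", err)
```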
Hi,
I recently tried tfold on an internal sequence, but it failed.
I did some analysis, and the issue is in the MSA step. The generated a3m file contains illegal characters. In my case, it is 0x00 at the end of the a3m file.
I know this is less helpful because I cannot share the sequence. Still, could you give me some hints on how illegal characters could appear during the MSA step?
Thanks, Ruijiang
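For anyone trying to confirm the same symptom, the byte offsets of any 0x00 characters in an a3m file can be listed with a few lines of Python. This is a rough sketch with a hypothetical helper name (`find_nul_offsets`). The sample bytes are made up, and for a real file you would read it in binary mode first.

```python
from typing import List


def find_nul_offsets(raw: bytes) -> List[int]:
    """Return the byte offsets of every NUL (0x00) in the given data."""
    return [i for i, b in enumerate(raw) if b == 0]


if __name__ == "__main__":
    # Made-up a3m content with a trailing 0x00, as described in this issue.
    # For a real file: raw = open(path, "rb").read()
    raw = b">query\nACDE\n\x00"
    offsets = find_nul_offsets(raw)
    print(f"{len(offsets)} NUL byte(s) at offsets {offsets} of {len(raw)} bytes")
```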