In the zero-shot setting, if the "Peptide" column of my input CSV is not sorted and the same peptide appears in non-adjacent rows, the results will be wrong. I think this may be caused by the code below:
'''
for i, j in enumerate(peptides):
    if j not in Z_data:
        Z_data[j] = []
    Z_data[j].append(TCRs[i])
'''
'''
output = pd.DataFrame({'Peptide': peptides, 'CDR3': TCRs, 'Score': starts})
'''
The order of the "Peptide" and "CDR3" rows in the output file differs from the order of the "Peptide" and "CDR3" pairs that the model actually receives, so the scores end up next to the wrong rows.
If I am wrong, please point it out, and I apologize for the noise.
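A minimal sketch of the suspected mismatch. It assumes that the scores are computed by walking `Z_data` group by group, which the snippet above does not show, so the traversal in `grouped_order` is a guess rather than the repository's actual code. The helper names and the peptide/TCR strings are hypothetical placeholders.

```python
def group_by_peptide(peptides, tcrs):
    """Group TCRs by peptide, mirroring the loop quoted above."""
    z_data = {}
    for i, pep in enumerate(peptides):
        if pep not in z_data:
            z_data[pep] = []
        z_data[pep].append(tcrs[i])
    return z_data


def grouped_order(peptides, tcrs):
    """Return (peptide, TCR) pairs in the order the groups are traversed.

    Assumption: scores are produced in this order, one group at a time.
    """
    z_data = group_by_peptide(peptides, tcrs)
    return [(pep, tcr) for pep, group in z_data.items() for tcr in group]


# Placeholder input where the same peptide appears in non-adjacent rows.
peptides = ["PEP_A", "PEP_B", "PEP_A"]
tcrs = ["TCR_1", "TCR_2", "TCR_3"]

input_order = list(zip(peptides, tcrs))
model_order = grouped_order(peptides, tcrs)

print(input_order)  # order used to label the output rows
print(model_order)  # order in which the groups are traversed
```

With this input the two lists differ: the second and third pairs are swapped, so a score list produced in `model_order` would be attached to the wrong rows if it were labeled with `input_order`.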
Hi,
Thanks for your interest in our work. We appreciate you reporting this bug in our code. You are right! We have updated the README to highlight that peptides must be sorted in the input *.csv file.
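As a workaround until the code handles unsorted input, the CSV can be sorted by peptide before it is passed to the model. The following is only a sketch using the Python standard library: the function name `sort_rows_by_peptide` is hypothetical, and it assumes the input has a header row with a "Peptide" column (the "CDR3" column name in the example is taken from the output DataFrame quoted in the issue and may differ in a real input file).

```python
import csv
import io


def sort_rows_by_peptide(csv_text, column="Peptide"):
    """Return CSV text with rows sorted by the given column.

    Python's sort is stable, so rows that share a peptide keep
    their original relative order.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = sorted(reader, key=lambda row: row[column])
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Placeholder data: PEP_A appears in non-adjacent rows.
example = "Peptide,CDR3\nPEP_A,TCR_1\nPEP_B,TCR_2\nPEP_A,TCR_3\n"
sorted_text = sort_rows_by_peptide(example)
print(sorted_text)
```

After sorting, identical peptides sit in adjacent rows, so grouping them by peptide no longer changes the row order.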