Open zunyun-Gong opened 7 months ago
Hi @zunyun-Gong
We do not plan to develop a version that takes in customized background TCRs, but the code that handles the background TCRs is roughly this section:
TCR_neg_df_1k=pd.read_csv(library_dir+'/bg_tcr_library/TCR_output_1k.csv', names=pd.RangeIndex(0, 30,1), header=None, skiprows=1)
TCR_neg_df_10k=pd.read_csv(library_dir+'/bg_tcr_library/TCR_output_10k.csv', names=pd.RangeIndex(0, 30,1), header=None, skiprows=1)
# As of the state of the software this step looks redundant and a waste of memory as it is loading an object that is already in memory but using a new variable name
# TCR_pos_df=pd.read_csv(output_dir+'/TCR_output.csv',index_col=0)
# MHC_antigen_df=pd.read_csv(output_dir+'/MHC_antigen_output.csv',index_col=0)
################ make prediction #################
rank_output=[]
for each_data_index in range(TCR_encoded_matrix.shape[0]):
    tcr_pos=TCR_encoded_matrix.iloc[[each_data_index,]]
    pmhc=HLA_antigen_encoded_matrix.iloc[[each_data_index,]]
    #used the positive pair with 1k negative tcr to form a 1001 data frame for prediction
    TCR_input_df=pd.concat([tcr_pos,TCR_neg_df_1k],axis=0)
    MHC_antigen_input_df= pd.DataFrame(np.repeat(pmhc.values,1001,axis=0))
    prediction=ternary_prediction.predict({'pos_in':TCR_input_df,'hla_antigen_in':MHC_antigen_input_df})
    rank=1-(sorted(prediction.tolist()).index(prediction.tolist()[0])+1)/1000
    #if rank is higher than top 2% use 10k background TCR
    if rank<0.02:
        TCR_input_df=pd.concat([tcr_pos,TCR_neg_df_10k],axis=0)
        MHC_antigen_input_df= pd.DataFrame(np.repeat(pmhc.values,10001,axis=0))
        prediction=ternary_prediction.predict({'pos_in':TCR_input_df,'hla_antigen_in':MHC_antigen_input_df})
        rank=1-(sorted(prediction.tolist()).index(prediction.tolist()[0])+1)/10000
    rank_output.append(rank)
You might want to modify our code to take in your customized TCRs.
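A minimal sketch of what such a modification could look like. It is not part of the repository. It mirrors the rank formula quoted above, but replaces the hard-coded 1000/1001 (and 10000/10001) with the actual size of the background. The names `percentile_rank`, `rank_with_background` and the `predict` callable are hypothetical. In the quoted code, `predict` would correspond to the `ternary_prediction.predict` call, and the rows would be built with `pd.concat` and `np.repeat` instead of plain lists. A customized background would presumably also need the same 30-column encoded format as the bundled `TCR_output_1k.csv`, which is an assumption based on the `names=pd.RangeIndex(0, 30,1)` argument.

```python
def percentile_rank(scores):
    """Rank of scores[0] among all scores.

    Same arithmetic as the quoted code, except that the divisor is the
    number of background entries rather than a fixed 1000 or 10000.
    """
    n_background = len(scores) - 1
    # 1-based position of the pair of interest in ascending order
    # (first match if there are ties, as with list.index in the original).
    position = sorted(scores).index(scores[0]) + 1
    return 1 - position / n_background


def rank_with_background(predict, tcr_pos, pmhc, background):
    """Score one TCR/pMHC pair against a background of arbitrary size.

    predict    -- hypothetical callable: (tcr_rows, pmhc_rows) -> scores
    tcr_pos    -- encoded TCR of interest
    pmhc       -- encoded pMHC paired with that TCR
    background -- sequence of encoded background TCRs (length >= 1)
    """
    # The pair of interest goes first, followed by every background TCR.
    tcr_rows = [tcr_pos] + list(background)
    # The same pMHC is repeated once per TCR row.
    pmhc_rows = [pmhc] * len(tcr_rows)
    scores = list(predict(tcr_rows, pmhc_rows))
    return percentile_rank(scores)
```

If you also want the two-stage behaviour of the quoted loop, you could call `rank_with_background` once with a small background and call it again with a larger one only when the first rank is below 0.02.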
Thanks!
Tao
Thank you for your help, it is quite helpful for my own work.
Thank you for your outstanding work. I plan to use my own background TCRs in my project. I have read through your code and paper, but I do not know how to do it. Could you provide a reference pipeline for it? Thank you very much!