rutha32 closed this issue 8 months ago
The most likely issue is that your V-gene names are not recognized. Do they have allele-level resolution? If not, you can append "*01" to each name for an approximate result. V-gene names must match one of the values in the id column of this file:
https://github.com/kmayerb/tcrdist3/blob/master/tcrdist/db/alphabeta_gammadelta_db.tsv
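To make the "*01" suggestion concrete, here is a minimal sketch of how the names could be patched and checked before building the TCRrep. The helper names (add_default_allele, unrecognized_genes) are hypothetical, the known_ids set is a small hand-written stand-in for the id column of the db file, and "TRAV99" is a made-up name used only to show a miss.

```python
def add_default_allele(gene, default="*01"):
    """Append a default allele suffix when a V-gene name has none."""
    gene = gene.strip()
    return gene if "*" in gene else gene + default


def unrecognized_genes(genes, known_ids):
    """Return the sorted gene names that are absent from the known ids."""
    return sorted(set(genes) - set(known_ids))


# Illustrative stand-in for the id column of alphabeta_gammadelta_db.tsv
known_ids = {"TRAV1-1*01", "TRAV12-1*01", "TRAV12-2*01"}

# "TRAV99" is a made-up name, included only to demonstrate the check
raw = ["TRAV1-1", "TRAV12-1*01", "TRAV99"]
fixed = [add_default_allele(g) for g in raw]

print(fixed)
print(unrecognized_genes(fixed, known_ids))
```

With a pandas DataFrame, the same helper could be applied with df["v_a_gene"] = df["v_a_gene"].map(add_default_allele), and known_ids could be read from the id column of the real db file instead of the stand-in set.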
Alternatively, you can define cdr1_a_aa, cdr2_a_aa, and pmhc_a_aa yourself instead of using TCRdist initialization to infer them; see infer_cdrs = False.
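A rough sketch of the infer_cdrs = False route, assuming the DataFrame already carries the three CDR columns named above plus cdr3_a_aa. The column check (missing_cdr_columns) and the wrapper (build_alpha_rep) are hypothetical helpers, not part of tcrdist3, and whether other TCRrep options need adjusting for your data is not covered here.

```python
# Columns assumed to be needed for the alpha chain when CDRs are supplied
# by hand (taken from the names mentioned in this thread)
REQUIRED_ALPHA_COLUMNS = ["cdr1_a_aa", "cdr2_a_aa", "pmhc_a_aa", "cdr3_a_aa"]


def missing_cdr_columns(columns):
    """List the required alpha-chain columns that are not present."""
    present = set(columns)
    return [c for c in REQUIRED_ALPHA_COLUMNS if c not in present]


def build_alpha_rep(df):
    """Build a TCRrep from user-supplied CDRs (hypothetical wrapper)."""
    from tcrdist.repertoire import TCRrep  # imported lazily; needs tcrdist3

    missing = missing_cdr_columns(df.columns)
    if missing:
        raise ValueError(f"missing CDR columns: {missing}")
    return TCRrep(
        cell_df=df,
        organism="human",
        chains=["alpha"],
        db_file="alphabeta_gammadelta_db.tsv",
        infer_cdrs=False,  # use the CDR columns already in df
    )


# Columns like those in the question: cdr1_aa / cdr2_aa lack the "_a" infix
example_columns = ["count", "v_a_gene", "j_a_gene", "cdr3_a_aa", "cdr1_aa", "cdr2_aa"]
print(missing_cdr_columns(example_columns))
```

Note that columns named cdr1_aa and cdr2_aa would not be picked up under the names given above, so they would need renaming first.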
Can you provide 10 lines of your input data?
Hi, thanks for the reply. I got it working when I added the "*01". I also removed some of the columns and kept only the core columns: count, v_a_gene, j_a_gene and cdr3_a_aa.
tcrdist_alpha_sample.pdf
Hi, tcrdist works fine when I use the sample data (dash.csv), but when I try it with other datasets, I'm getting errors.
These are my columns: Index(['subject', 'epitope', 'count', 'v_a_gene', 'd_call', 'j_a_gene', 'cdr3_a_aa', 'cdr3_a_nucseq', 'junction', 'decombinator_id', 'rev_comp', 'productive', 'sequence_aa', 'cdr1_aa', 'cdr2_aa', 'chain', 'clone_id', 'time'], dtype='object')
This is the error I get: ValueError: zero-size array to reduction operation maximum which has no identity
My code:

    import pandas as pd
    from tcrdist.repertoire import TCRrep

    file_path = r'C:\Users\pythonProject\ResearchProject\alpha_TCR_all_sample_100.csv'
    df = pd.read_csv(file_path)
    df.head()

    tr = TCRrep(cell_df=df, organism='human', chains=['alpha'], db_file='alphabeta_gammadelta_db.tsv')
    pw_alpha = tr.pw_alpha

Thanks