HetzDra / turboGliph

R implementation of GLIPH (Grouping of Lymphocyte Interactions by Paratope Hotspots), an algorithm developed by Glanville et al. to identify specificity groups in the T cell receptor repertoire based on local (motif sharing) and global (Hamming distance) similarities.

Error when structboundaries = FALSE in gliph2 function #5

Open CharlineJnnt opened 1 year ago

CharlineJnnt commented 1 year ago

Hi, I encountered an error when setting the parameter structboundaries = FALSE in the gliph2 function. The call and the resulting error were:

> res_turbogliph <- turboGliph::gliph2(cdr3_sequences = data,
                                     n_cores = 1,
                                     refdb_beta = refdb_beta_data,
                                     v_usage_freq = v_usage_freq_data,
                                     cdr3_length_freq = cdr3_length_freq_data,
                                     result_folder = paste0(directory, "GLIPH2/alpha_beta_bis/"), 
                                     lcminove = 10, 
                                     global_vgene = TRUE, 
                                     lcminp = 0.001, 
                                     all_aa_interchangeable = TRUE, 
                                     motif_length = base::c(2,3,4,5),
                                     motif_distance_cutoff = 3,
                                     structboundaries = F)
Notification: cdr3_sequences is a data frame and the column named "CDR3b" is considered as cdr3 beta sequences.
Notification: Column of reference database named 'CDR3b' is considered as cdr3 sequences.
Notification: Column of reference database named 'TRBV' is considered as V gene information.
Part 1: Searching for local similarities.
181 significantly enriched motifs found in sample set.
Part 1 cpu time: 1.360122 mins

Part 2: Searching for global similarities.
Error in base::seq_len(base::min(base::max(reference_seqs$nchar), base::max(sample_seqs$nchar))) :
  argument must be coercible to non-negative integer
In addition: Warning message:
In base::max(sample_seqs$nchar) :
  no non-missing arguments to max; returning -Inf

Can you explain the origin of this error?
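
In case it helps narrow things down, the two R messages can be reproduced outside the package. This is only a sketch: the two data frames below are hypothetical stand-ins for the internal objects named in the error, and it assumes that sample_seqs ends up with zero rows by the time Part 2 runs, which I have not confirmed in the turboGliph source.

```r
# Hypothetical stand-ins for the objects named in the error message.
# Assumption: sample_seqs has no rows left when Part 2 starts.
reference_seqs <- data.frame(nchar = c(12L, 14L, 15L))
sample_seqs <- data.frame(nchar = integer(0))

# max() of an empty vector warns and returns -Inf
# (the warning is suppressed here to keep the output short).
sample_max <- suppressWarnings(base::max(sample_seqs$nchar))
print(sample_max)

# min() propagates -Inf, and seq_len() rejects a negative or
# non-finite length with an error, which is captured as a string.
upper <- base::min(base::max(reference_seqs$nchar), sample_max)
result <- tryCatch(base::seq_len(upper),
                   error = function(e) conditionMessage(e))
print(result)
```

If that assumption holds, the open question is why no sample sequences remain once structboundaries = FALSE is set.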

I have another question: I've noticed that the default parameters of the gliph2 function differ from those of the original algorithm. Which parameters do I need to change to get the same results as the GLIPH2 executable?

Thank you for your help!