I'm really interested, but I'm struggling to understand how you obtain the variant read count matrix. My guess is that you combined each variant's VAF with the total read count of its amplicon? If so, won't it be biased by non-uniform read coverage across the amplicon? If it was computed with another method, would it be possible to share it?
You also seem to recommend adding the clustering (with or without the mutation matrix?). What is the mutation matrix in Tapestri? Are you referring to the genotype matrix, where for each cell/barcode 0 is wildtype, 1 means one allele is alternate, 2 means both alleles are alternate, and 3 is a missing genotype?
Also, I'm not sure I understand what the command-line argument -k is and what it does. "Maximum number of losses for an SNV" is a bit vague.
Also, would it be possible to provide a conda YAML environment, or even a Singularity/Docker image? It would greatly simplify the use of your tools!
I'm aware this is a lot of questions, so thank you in advance for your time and kindness.
Thank you for your interest in using this tool and sorry for the late response.
Variant read count matrix -- yes, you are correct. This matrix contains the number of variant reads for each mutation in each cell. You can generate it by multiplying the VAF by the total read count (reference reads + variant reads) for each mutation in each cell.
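For illustration, the multiplication described above can be sketched with pandas as follows. This is only a sketch under stated assumptions, not ConDoR's own code: the function name `variant_read_counts`, the toy cell and mutation labels, the assumption that VAF is stored as a fraction in [0, 1] (divide by 100 first if your export uses percentages), and the choice to treat missing entries as zero variant reads are all hypothetical.

```python
import pandas as pd


def variant_read_counts(vaf: pd.DataFrame, depth: pd.DataFrame) -> pd.DataFrame:
    """Estimate variant read counts as round(VAF * total depth).

    Both inputs are cell-by-mutation tables (rows: cells, columns: mutations).
    Assumption: VAF is a fraction in [0, 1], not a percentage.
    """
    # Align on cell and mutation labels so that differently ordered
    # tables cannot silently mix up entries.
    vaf, depth = vaf.align(depth, join="inner")
    counts = (vaf * depth).round()
    # Assumption: a missing VAF or depth is treated as zero variant reads.
    return counts.fillna(0).astype(int)


# Toy data with hypothetical cell and mutation names.
vaf = pd.DataFrame(
    {"mut1": [0.5, 0.0], "mut2": [1.0, 0.25]}, index=["cell1", "cell2"]
)
depth = pd.DataFrame(
    {"mut1": [40, 10], "mut2": [20, 8]}, index=["cell1", "cell2"]
)

variant_counts = variant_read_counts(vaf, depth)
print(variant_counts)
```

Rounding is needed because VAF values are usually stored with limited precision, so the product is not always an exact integer.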
Mutation matrix -- I have defined the mutation matrix in the paper (I updated the README to contain a link to the paper). The mutation matrix has an entry of 0 if the mutation is absent, 1 if it is present, and -1 if there is no information about the mutation in that cell.
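As a rough sketch, one possible way to derive a matrix with this 0/1/-1 encoding from a Tapestri-style genotype matrix (coded 0, 1, 2, 3 as in the question) is shown below. The mapping itself is an assumption and is not confirmed in this thread: heterozygous (1) and homozygous (2) calls are both treated as "present", and missing genotypes (3) as "no information". The function name `genotype_to_mutation` is hypothetical.

```python
import numpy as np


def genotype_to_mutation(genotype) -> np.ndarray:
    """Map genotype codes {0, 1, 2, 3} to mutation entries {0, 1, -1}.

    Assumed mapping (not confirmed by the ConDoR authors):
      0 (wildtype)        -> 0  (absent)
      1 or 2 (het or hom) -> 1  (present)
      3 (missing)         -> -1 (no information)
    """
    genotype = np.asarray(genotype)
    # Start from "no information" so unexpected codes also end up as -1.
    mutation = np.full(genotype.shape, -1, dtype=int)
    mutation[genotype == 0] = 0
    mutation[(genotype == 1) | (genotype == 2)] = 1
    return mutation


# Toy genotype matrix: rows are cells, columns are mutations.
genotype = np.array([[0, 1, 3],
                     [2, 3, 0]])
mutation_matrix = genotype_to_mutation(genotype)
print(mutation_matrix)
```

Please check the definition in the paper before relying on this mapping, in particular if zygosity matters for your analysis.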
conda yaml -- Thank you for the suggestion. I will upload a YAML file for this tool in the next few days. The package requirements for ConDoR are very simple: we only need numpy, pandas, networkx and gurobipy.