phbradley / conga

Clonotype Neighbor Graph Analysis
MIT License

Question: Is it possible to run with BCR sequences #25

Open ghost opened 3 years ago

ghost commented 3 years ago

Hello! Thanks for developing this tool. I've been exploring TCR data with it successfully, and I would like to extend it to BCR data. I found the organism option human_ig, but after running conga.tcrdist.make_10x_clones_file.make_10x_clones_file, I get no ab_counts. I simply followed the Jupyter notebook tutorial. I assume you must have considered the idea, given that you have the organism option for it.

Are there any tips you could provide or is this an unreasonable approach? I appreciate the tool is geared towards TCR. Is there perhaps a kwarg that I could use to adjust similarity cutoffs and any necessary adjustments for BCR? I recognize you've tailored the distance metric for TCR interactions.

After running:

conga.tcrdist.make_10x_clones_file.make_10x_clones_file(bcr_datafile, organism, clones_file)

I get the following (in a Jupyter notebook):

ab_counts: []
old_unpaired_barcodes: 0
old_paired_barcodes: 0
new_stringent_paired_barcodes: 0

There are paired barcodes that have both light and heavy chains (one row for each). The same data was used successfully with the scirpy package.

The BCRs are in a dataframe similar to the one I used for the TCRs, coming from 10X files. Columns are: ['barcode', 'is_cell', 'contig_id', 'high_confidence', 'length', 'chain', 'v_gene', 'd_gene', 'j_gene', 'c_gene', 'full_length', 'productive', 'cdr3', 'cdr3_nt', 'reads', 'umis', 'raw_clonotype_id', 'raw_consensus_id']
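For reference, one way to confirm the heavy/light pairing independently of conga is to count barcodes that carry at least one productive heavy-chain contig and one productive light-chain contig. This is only a minimal sketch using the 'barcode', 'chain' and 'productive' columns listed above. It assumes the chain labels are IGH, IGK and IGL, and the helper name count_paired_barcodes and the toy rows are made up for illustration; it does not reproduce how conga itself counts paired barcodes.

```python
import csv
import io
from collections import defaultdict

# Assumed chain labels for BCR contigs (not confirmed by the thread).
HEAVY = {"IGH"}
LIGHT = {"IGK", "IGL"}


def is_true(value):
    """Treat 'True'/'true' (any case) as true; everything else as false."""
    return str(value).strip().lower() == "true"


def count_paired_barcodes(rows):
    """Return (paired, total) over barcodes with >= 1 productive contig.

    rows: iterable of dicts with 'barcode', 'chain', 'productive' keys,
    e.g. a csv.DictReader over the contig annotations file.
    """
    chains_by_barcode = defaultdict(set)
    for row in rows:
        if is_true(row["productive"]):
            chains_by_barcode[row["barcode"]].add(row["chain"])
    paired = sum(
        1
        for chains in chains_by_barcode.values()
        if chains & HEAVY and chains & LIGHT
    )
    return paired, len(chains_by_barcode)


# Toy rows for illustration only (hypothetical barcodes, not real data).
toy_csv = """barcode,chain,productive
AAAC-1,IGH,True
AAAC-1,IGK,True
AAAG-1,IGH,True
AAAT-1,IGL,False
"""

paired, total = count_paired_barcodes(csv.DictReader(io.StringIO(toy_csv)))
print(paired, total)  # 1 paired barcode out of 2 with a productive contig
```

With a real file, the same function can be fed csv.DictReader(open(path, newline="")). If this reports many paired barcodes while the clones file step reports zero, the input most likely parses fine and the problem lies in how it is being read by that step.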

Any thoughts are appreciated.

Thank you!

phbradley commented 3 years ago

Hi Hugh, Thanks for your interest in conga. We should definitely be able to make this work. Did you already take a look at the BCR example in the readme?

https://github.com/phbradley/conga#human-melanoma-b-cell-dataset

If that approach (i.e., running from the command line) doesn't work either, it may be that the contigs file format is a little different. We've definitely analyzed other BCR datasets successfully, but 10x could always change the format.

Take care, Phil
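A quick way to test the "format is a little different" idea is to compare the column headers of a contigs file that is known to work (for example the TCR file from the same pipeline) against the BCR file. The sketch below is only an illustration: the helper names read_header and compare_headers are hypothetical, the in-memory headers are toy stand-ins for real files, and it says nothing about which columns conga actually requires.

```python
import csv
import io


def read_header(handle):
    """Return the first row of a CSV file-like object as a list of names."""
    return next(csv.reader(handle))


def compare_headers(reference_header, candidate_header):
    """Return (missing, extra) column names relative to the reference."""
    ref, cand = set(reference_header), set(candidate_header)
    return sorted(ref - cand), sorted(cand - ref)


# Toy headers for illustration only; replace with open(path, newline="").
working_file = io.StringIO("barcode,chain,cdr3,cdr3_nt\n")
failing_file = io.StringIO("barcode,chain,cdr3\n")

missing, extra = compare_headers(
    read_header(working_file), read_header(failing_file)
)
print("missing:", missing)  # columns in the working file but not the other
print("extra:", extra)      # columns only in the file under test
```

Any column reported as missing is a reasonable first suspect when the same code path works for one file and not the other.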

ghost commented 3 years ago

Thank you, Phil. Sorry that I missed that section. I was working from the notebook, so I'll try running it from the CLI as in the example.

phbradley commented 3 years ago

OK, sounds good. We still want to make it work from the notebook too; testing from the CLI will be helpful for troubleshooting.