Teichlab / bracer

BraCeR - reconstruction of B cell receptor sequences from single-cell RNAseq data
Other
40 stars 22 forks source link

Custom databases #34

Open scharch opened 5 years ago

scharch commented 5 years ago

I'm trying to use bracer with a custom gene database (which is actually a superset of the standard IMGT database). After the documentation issues in #31, I did include a database and created a gapped alignment that conforms to the IMGT standard, which at least lets me run bracer assemble without a crash. However, my heavy chains are getting lost somewhere in the mix. I can detect them with the standard Human database. Moreover, when I look at the output from the individual steps, I can see roughly the same hits from the Bowtie alignment, Trinity assembly, and constant region BLAST. However, the heavy chains are failing to "pass" the IgBLAST step. It looks like it's having trouble detecting CDR3 and J, so maybe this is a problem with the way I built the combinatorial_recombinomes?

idalind commented 5 years ago

Hi, Could you please send me the references you built as well as the output you got for one cell using your reference as well as using the standard human database so I can look into what could be the issue? My email address is now ida.lindeman@medisin.uio.no

Thanks, Ida

scharch commented 5 years ago

I sent you a link to a tarball in google drive

idalind commented 5 years ago

Hi again, and sorry for the late reply. My first guess is that this could have something to do with the igblast auxiliary file. It looks like you did not provide the auxiliary file when running bracer build (this is not required as of now, but since bracer has been more and more dependent on CDR3 detection this file and not custom scripts, it probably should be at least for species where such data is available. Sorry about that!). Could you please try to copy the auxiliary file from the bracer repository into the corresponding directory in your custom reference directory and let me know if this solves the problem? If so, I will make it more clear in the documentation that the --igblast_aux argument is highly recommended to use.

scharch commented 5 years ago

It helped with the kappas, but not the heavy chain (for this one test cell):

$ head test-Hpseud/filtered_BCR_seqs/filtered_BCRs.txt 
------------------
test2
------------------
BCR_H recombinants: 0/0
BCR_K recombinants: 0/2
BCR_L recombinants: 0/1

#BCR_H#
#BCR_K#

$ head retest_with_auxFile/filtered_BCR_seqs/filtered_BCRs.txt 
------------------
retest_with_auxFile
------------------
BCR_H recombinants: 0/0
BCR_K recombinants: 2/2
BCR_L recombinants: 0/0

#BCR_H#
#BCR_K#