Open mariusmessemaker opened 4 years ago
@mariusmessemaker, could you please tell us whether you use this generated whitelist or let zUMIs autogenerate one itself, and whether the choice affects the final results?
Thanks
Hi, I have a question related to barcode generation, but for Split-seq:
I am analysing experiments with four barcodes. If I choose to provide a whitelist to zUMIs, should I concatenate the barcodes in the 3' -> 5' direction, or R1 first, then R2?
The experimental design is such that reads are barcoded as follows: R1 has UMI+BC1+cDNA, R2 has BC2+GGG+cDNA, and additionally I have two more index files with another barcode each. All barcodes are 8nt, so my cells will be identified by a 32nt barcode.
Does it matter how I concatenate the barcodes? Do I still need to create a barcode list with all possible combinations?
thanks
Hi,
If you choose to provide a whitelist, you would concatenate the barcode pieces in the same order in which you provide the barcode ranges in the YAML file. So if your file1 is R1, start with that barcode, and so on. For reference: https://github.com/sdparekh/zUMIs/wiki/Barcodes#barcode-annotation
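As an illustration of the ordering rule, here is a minimal Python sketch that builds a whitelist by concatenating one barcode from each set in a fixed order. The barcode sequences, the variable names, and the file order (R1, R2, then the two index files) are all hypothetical; the order has to match whatever you actually list in your YAML file. The sketch also assumes that every combination should be written out.

```python
from itertools import product

def build_whitelist(barcode_sets):
    """Concatenate one barcode from each set, in the order the sets are given."""
    return ["".join(combo) for combo in product(*barcode_sets)]

# Hypothetical 8 nt barcodes; the real lists come from the experimental design.
r1_bcs = ["AAAAAAAA", "CCCCCCCC"]   # barcode in R1 (assumed to be file1 in the YAML)
r2_bcs = ["GGGGGGGG", "TTTTTTTT"]   # barcode in R2 (assumed file2)
i1_bcs = ["ACGTACGT"]               # barcode in the first index file (assumed file3)
i2_bcs = ["TGCATGCA", "CATGCATG"]   # barcode in the second index file (assumed file4)

# The list order here must mirror the order of the barcode ranges in the YAML.
whitelist = build_whitelist([r1_bcs, r2_bcs, i1_bcs, i2_bcs])

print(len(whitelist))  # 2 * 2 * 1 * 2 = 8
print(whitelist[0])    # AAAAAAAAGGGGGGGGACGTACGTTGCATGCA
```

Each entry is 32 nt long, matching four 8 nt barcodes per cell.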
Best, Christoph
Hi, thank you Christoph. That makes sense now, since that is the order in which zUMIs reads in the files!
Cheers
I wrote a Python function to generate an inDrops V3 whitelist from the gel_barcode2_list.txt file (mostly adapted from the indrops.py code): https://github.com/indrops/indrops/blob/master/ref/barcode_lists/gel_barcode2_list.txt. I thought some zUMIs users might want to use it to generate their own whitelists to supply to zUMIs. The function arguments are:
- indexes: list of strings that contain the library adapter sequences that were used to generate the libraries (e.g. ['AGAGGATA', 'TACTCCTT']).
- name: string that contains the .txt file name under which your whitelist is saved.
- indexlength: int that specifies the number of bases in your library index (to make sure you don't make manual copy errors).
- numberOflibraries: int that specifies the number of libraries that were generated (i.e. the number of different library indexes that were used; also to make sure you don't make manual copy errors).
The function outputs the Cartesian product of all the R2 BC1s, R3 library indexes, and R4 BC2s as concatenated strings in the order:
The function:
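A minimal sketch of a function along these lines, written from the description above and not the original code. The function name and the barcode_file argument are hypothetical additions; the other four arguments follow the list above. It assumes the concatenation order is BC1, then library index, then BC2, and it uses the barcodes exactly as they appear in the file, so any reverse-complementing your setup may need is not handled here.

```python
from itertools import product

def make_indrops_v3_whitelist(barcode_file, indexes, name, indexlength, numberOflibraries):
    """Write BC1 + library index + BC2 for every combination; return the line count.

    barcode_file is a hypothetical argument: the path to a local copy of
    gel_barcode2_list.txt (one barcode per line).
    """
    # Guard against manual copy errors in the library indexes.
    if len(indexes) != numberOflibraries:
        raise ValueError(
            f"expected {numberOflibraries} library indexes, got {len(indexes)}"
        )
    for idx in indexes:
        if len(idx) != indexlength:
            raise ValueError(f"index {idx!r} is not {indexlength} bases long")

    # Read the gel barcodes, skipping blank lines.
    with open(barcode_file) as fh:
        gel_barcodes = [line.strip() for line in fh if line.strip()]

    # The same gel barcode list serves as both BC1 and BC2 in this sketch.
    n_written = 0
    with open(name, "w") as out:
        for bc1, idx, bc2 in product(gel_barcodes, indexes, gel_barcodes):
            out.write(bc1 + idx + bc2 + "\n")
            n_written += 1
    return n_written
```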
For example, you can use the function as follows:
This function call outputs 384 BC1 x 8 library indexes x 384 BC2 = 1179648 concatenated BC strings. You can also supply an empty library index string, which will yield the Cartesian product of the R2 BC1s and R4 BC2s:
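The counts can be checked with a short, self-contained sketch. The sequences below are placeholders generated only to get lists of the right size (384 gel barcodes, 8 library indexes); they are not the real inDrops barcodes or indexes, and the helper name is hypothetical.

```python
from itertools import islice, product

# Placeholder 8 nt sequences standing in for the real lists (sizes only matter here).
gel_barcodes = ["".join(p) for p in islice(product("ACGT", repeat=8), 384)]
library_indexes = ["".join(p) for p in islice(product("TGCA", repeat=8), 8)]

def count_combinations(bc1s, indexes, bc2s):
    """Count the BC1 x index x BC2 combinations without storing them."""
    return sum(1 for _ in product(bc1s, indexes, bc2s))

with_indexes = count_combinations(gel_barcodes, library_indexes, gel_barcodes)
# A single empty string as the index leaves only the BC1 x BC2 product.
without_indexes = count_combinations(gel_barcodes, [""], gel_barcodes)

print(with_indexes)     # 384 * 8 * 384 = 1179648
print(without_indexes)  # 384 * 384 = 147456
```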