In GitLab by @grst on Jan 24, 2020, 10:10
We use the file filtered_contig_annotations.csv.
# Human B cell chains
(default) sturm@hochvogel Downloads % cut -f6 -d, vdj_v1_hs_pbmc3_b_filtered_contig_annotations.csv | sort | uniq -c
1 chain
929 IGH
624 IGK
506 IGL
# Human T cell chains
(default) sturm@hochvogel Downloads % cut -f6 -d, vdj_v1_hs_pbmc3_t_filtered_contig_annotations.csv | sort | uniq -c
1 chain
46 Multi
4907 TRA
5168 TRB
# Mouse B cell chains
(default) sturm@hochvogel Downloads % cut -f6 -d, vdj_v1_mm_pbmc4_b_filtered_contig_annotations.csv| sort | uniq -c
1 chain
5215 IGH
5475 IGK
2573 IGL
# Mouse T cell chains
(default) sturm@hochvogel Downloads % cut -f6 -d, vdj_v1_mm_pbmc4_t_filtered_contig_annotations.csv| sort | uniq -c
1 chain
7 Multi
761 TRA
1301 TRB
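Note that the `cut -f6` counts above depend on the chain column being the sixth field, and they count the header line as well (the `1 chain` rows). A minimal Python sketch that reads the column by name instead could look like this. The column name `chain` is taken from the header shown in the output; the function name `count_chains` and the toy input are purely illustrative and not part of any existing code.

```python
import csv
import io
from collections import Counter


def count_chains(handle, column="chain"):
    """Count the values of one named CSV column; the header row is not counted."""
    return Counter(row[column] for row in csv.DictReader(handle))


# Made-up toy input, NOT taken from the 10x files above.
TOY = "barcode,chain\nAAA-1,TRA\nAAA-1,TRB\nBBB-1,TRB\n"

if __name__ == "__main__":
    for chain, n in sorted(count_chains(io.StringIO(TOY)).items()):
        print(n, chain)
```

For a real file, an open file handle (e.g. `open(path, newline="")`) would be passed instead of the `io.StringIO` wrapper.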
There are indeed a bunch of barcodes that have more than 4 chains, e.g.
TTTACTGTCACCAGGC-1 True TTTACTGTCACCAGGC-1_contig_1 True 648 TRB TRBV19 None TRBJ2-1 TRBC2 True True CASSISTDWGNEQFF
TTTACTGTCACCAGGC-1 True TTTACTGTCACCAGGC-1_contig_2 True 511 TRA TRAV23/DV6 None TRAJ58 TRAC True True CAASQETSGSRLTF
TTTACTGTCACCAGGC-1 True TTTACTGTCACCAGGC-1_contig_3 True 521 TRB TRBV6-5 None TRBJ2-1 TRBC2 True True CASSYRTGSSYNEQFF
TTTACTGTCACCAGGC-1 True TTTACTGTCACCAGGC-1_contig_4 True 659 TRA TRAV8-6 None TRAJ6 TRAC True True CAVNPGGSYIPTF
TTTACTGTCACCAGGC-1 True TTTACTGTCACCAGGC-1_contig_5 True 427 TRB TRBV7-8 None TRBJ2-1 TRBC2 True False CQQLRKTSYNEQFF
TTTACTGTCACCAGGC-1 True TTTACTGTCACCAGGC-1_contig_6 True 463 TRA TRAV13-1 None TRAJ4 TRAC True False CSKFLFSGGYNKLIF
But for those I checked, there are only four that are productive. For now, I think it's fine to just use the productive chains and emit a warning that there might be more.
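A rough sketch of that idea, keeping only productive chains and warning when a barcode has more than four contigs, is shown below. It assumes the CSV has columns named `barcode` and `productive` and that `productive` holds the literal string `True`; the function name `productive_chains` and the threshold constant are hypothetical. The example input is a reduced version of the barcode listed above, with only the columns needed here.

```python
import csv
import io
import warnings
from collections import defaultdict

MAX_CHAINS = 4  # assumption: at most two alpha and two beta chains per cell


def productive_chains(handle, max_chains=MAX_CHAINS):
    """Return {barcode: [productive contig rows]}; warn about crowded barcodes."""
    productive = defaultdict(list)
    n_contigs = defaultdict(int)
    for row in csv.DictReader(handle):
        n_contigs[row["barcode"]] += 1
        # Assumption: productive contigs are marked with the string "True".
        if row["productive"] == "True":
            productive[row["barcode"]].append(row)
    for barcode, n in n_contigs.items():
        if n > max_chains:
            kept = len(productive.get(barcode, []))
            warnings.warn(
                f"{barcode}: {n} contigs in total, {kept} productive; "
                "non-productive contigs were dropped"
            )
    return dict(productive)


# Reduced example based on the barcode quoted above.
EXAMPLE = """barcode,contig_id,chain,productive,cdr3
TTTACTGTCACCAGGC-1,TTTACTGTCACCAGGC-1_contig_1,TRB,True,CASSISTDWGNEQFF
TTTACTGTCACCAGGC-1,TTTACTGTCACCAGGC-1_contig_2,TRA,True,CAASQETSGSRLTF
TTTACTGTCACCAGGC-1,TTTACTGTCACCAGGC-1_contig_3,TRB,True,CASSYRTGSSYNEQFF
TTTACTGTCACCAGGC-1,TTTACTGTCACCAGGC-1_contig_4,TRA,True,CAVNPGGSYIPTF
TTTACTGTCACCAGGC-1,TTTACTGTCACCAGGC-1_contig_5,TRB,False,CQQLRKTSYNEQFF
TTTACTGTCACCAGGC-1,TTTACTGTCACCAGGC-1_contig_6,TRA,False,CSKFLFSGGYNKLIF
"""

if __name__ == "__main__":
    cells = productive_chains(io.StringIO(EXAMPLE))
    for barcode, rows in cells.items():
        print(barcode, [(r["chain"], r["cdr3"]) for r in rows])
```

The sketch does not truncate anything itself: if a barcode ever had more than four productive chains, all of them would be returned, and the warning would still fire because the total contig count exceeds the threshold.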
In GitLab by @grst on Jan 24, 2020, 10:55
Files to use:
recombinants.txt: Output of tracer summary. Contains CDR3 sequences, V and J gene of the filtered chains.
<CELL>/filtered_TCR_seqs/filtered_TCRs.txt: Contains V gene, J gene, CDR3 seqs and TPM of the filtered chains. Seems to be the only way to get the TPMs unfortunately.
For now, let's go for TraCeR alpha/beta only.
In GitLab by @szabogtamas on Jan 24, 2020, 13:04
Totally agree: even four chains is a lot and we can safely assume that there shouldn't be more than four productive chains in a cell. Since we only have datasets with alpha/beta, I would also leave gamma/delta for now. We can include them later.
In GitLab by @grst on Feb 14, 2020, 16:18
alpha/beta is fine for now. But this needs to be documented.
In GitLab by @grst on Mar 27, 2020, 11:31
assigned to @szabogtamas
The original issue could not be created. This is a dummy issue replacing the original one. It contains everything except the original issue description. If the GitLab repository still exists, visit the following link to view the original issue:
TODO