UCSF-DSCOLAB / data_processing_pipelines

A repository to store the existing pipelines to process the various CoLabs datasets

sc_seq pipeline should pull in or organize more TCR/BCR info #68

Open dtm2451 opened 5 months ago

dtm2451 commented 5 months ago

Per recent discussions, there is more data we want to use from the TCR and BCR libraries than just the clonotype_id and cdr3 amino acid sequences that are currently pulled in.

(The updated function would build this dataframe by ';'-joining the data from the multiple lines in the all_contig_annotations.csv file that have matching 'barcode' + 'raw_contig_annotation' values.)
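As a rough illustration of the ';'-join described above, here is a minimal sketch in plain Python. It is not the pipeline's actual function: `collapse_contigs`, the sample rows, and the choice of value columns are all hypothetical, and the key column names are copied from the text above as written, so they should be swapped for whatever the real all_contig_annotations.csv uses.

```python
import csv
import io
from collections import defaultdict


def collapse_contigs(rows, key_cols, sep=";"):
    """Collapse rows that share the same key columns into a single row.

    Every non-key column becomes the `sep`-joined string of its values,
    in the order the rows appeared in the input.
    """
    grouped = defaultdict(list)
    for row in rows:
        key = tuple(row[col] for col in key_cols)
        grouped[key].append(row)

    collapsed = []
    for key, members in grouped.items():
        out = dict(zip(key_cols, key))
        value_cols = [col for col in members[0] if col not in key_cols]
        for col in value_cols:
            out[col] = sep.join(member[col] for member in members)
        collapsed.append(out)
    return collapsed


# Hypothetical stand-in for a few lines of all_contig_annotations.csv.
# Column names other than 'barcode' are assumptions for illustration only.
SAMPLE_CSV = """barcode,raw_contig_annotation,chain,v_gene,cdr3
AAAC-1,clonotype1,TRA,TRAV1,CAVRDF
AAAC-1,clonotype1,TRB,TRBV2,CASSLGF
TTTG-1,clonotype2,TRB,TRBV5,CASSQDF
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
collapsed = collapse_contigs(rows, key_cols=("barcode", "raw_contig_annotation"))

for row in collapsed:
    print(row)
```

With this shape, each cell barcode ends up on one line, and a cell with both a TRA and a TRB contig carries values like `TRA;TRB` in each joined column, which keeps the per-chain values aligned by position.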

Questions:

  1. All columns in the latter suggestion?
  2. Should this get output to a metadata file, or should ALL of these columns be pulled into the Seurat object?
  3. If a metadata file, where? automated_processing/(T|B)CR_contigs.csv?