Teichlab / tracer

TraCeR - reconstruction of T cell receptor sequences from single-cell RNAseq data

Implement `build` module for custom reference files #11

Closed (mstubb closed 7 years ago)

mstubb commented 8 years ago

The build module should take sets of V and J (and C) sequences and generate the appropriate resource files from them.

Things to think about for the build module:

  1. C sequences are currently hard-coded. These should be moved out into resource files and loaded as appropriate for the species in use.
  2. iNKT sequences are similarly hard-coded. These should also be split out and made optional, for cases where the expected sequences are not known.
  3. The resources directory structure should change so that it looks something like:
species ----> imgt_seqs
        |
        |---> synthetic_genomes
        |
        |---> igblast_dbs
        |
        |---> const_seqs
        |
        |---> invariant_cells
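
A minimal sketch of how a build step might scaffold this layout for a new species. The function name `make_species_layout` and the `resources_root` argument are hypothetical and not part of TraCeR; only the subdirectory names come from the proposal above.

```python
from pathlib import Path

# Subdirectory names taken from the layout proposed above.
RESOURCE_SUBDIRS = (
    "imgt_seqs",
    "synthetic_genomes",
    "igblast_dbs",
    "const_seqs",
    "invariant_cells",
)


def make_species_layout(resources_root, species):
    """Create the per-species folders and return their paths keyed by name.

    Hypothetical helper: existing folders are left untouched.
    """
    species_dir = Path(resources_root) / species
    layout = {}
    for name in RESOURCE_SUBDIRS:
        subdir = species_dir / name
        subdir.mkdir(parents=True, exist_ok=True)
        layout[name] = subdir
    return layout
```

Returning the paths as a dictionary would let later build steps write their output without rebuilding the path strings by hand.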

TraCeR can then check that a particular species exists before loading the appropriate data.
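
One way the species check could look, assuming each species is simply a folder directly under the resources root. The names `available_species` and `get_species_dir` are illustrative assumptions, not existing TraCeR functions.

```python
from pathlib import Path


def available_species(resources_root):
    """Return the names of all species folders under the resources root."""
    root = Path(resources_root)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_dir())


def get_species_dir(resources_root, species):
    """Return the species folder, or raise if no resources exist for it."""
    species_dir = Path(resources_root) / species
    if not species_dir.is_dir():
        known = ", ".join(available_species(resources_root)) or "none"
        raise ValueError(
            f"No resources found for species '{species}'. Available: {known}"
        )
    return species_dir
```

Listing the known species in the error message should make it easier to spot a misspelled species name or a build step that was never run.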

The invariant_cells folder could contain sequences for any kind of invariant cell, so that it could also handle MAIT cells and other invariant cell types.
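
A rough sketch of how optional invariant-cell sequences might be loaded. It assumes one FASTA file per cell type with a `.fa` extension (for example `iNKT.fa` or `MAIT.fa`); that naming scheme and the function names are assumptions for illustration only. A small hand-written FASTA parser is used to keep the example dependency-free.

```python
from pathlib import Path


def read_fasta(path):
    """Parse a FASTA file into a {record_id: sequence} dictionary."""
    records = {}
    name = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Use the first whitespace-delimited token as the record id.
                fields = line[1:].split()
                name = fields[0] if fields else ""
                records[name] = ""
            elif name is not None:
                records[name] += line
    return records


def load_invariant_cells(species_dir):
    """Map each cell type (file stem) to its expected sequences.

    Returns an empty dict when the folder is absent, so that
    invariant-cell detection stays optional.
    """
    folder = Path(species_dir) / "invariant_cells"
    if not folder.is_dir():
        return {}
    return {f.stem: read_fasta(f) for f in sorted(folder.glob("*.fa"))}
```

With this shape, adding support for another invariant cell type would only require dropping a new FASTA file into the folder.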