matsengrp / vampire

🧛 Deep generative models for TCR sequences 🧛
Apache License 2.0

Better tooling for developing merged training sets and specifying testing sets. #74

Status: Closed (matsen closed this issue 5 years ago)

matsen commented 5 years ago

Idea: a util subcommand that randomly splits the data sets into train and test groups, and writes out

  1. a train file with the merged training set
  2. a file with a list of the data sets to be used for test.

We don't want to merge the test data sets: each test repertoire remains its own data set.
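
A minimal sketch of what such a split could look like, using only the standard library. This is not the implementation from the closing commit: the function names `split_repertoires` and `write_split`, the `test_fraction` and `seed` parameters, and the assumption that every repertoire file is a text file sharing the same one-line header are all hypothetical.

```python
import random
from pathlib import Path


def split_repertoires(paths, test_fraction=0.2, seed=0):
    """Randomly partition repertoire file paths into (train, test) lists.

    Hypothetical helper: at least one file always lands on each side.
    """
    paths = sorted(str(p) for p in paths)  # sort first so the seed is reproducible
    if len(paths) < 2:
        raise ValueError("need at least two data sets to split")
    rng = random.Random(seed)
    rng.shuffle(paths)
    n_test = min(len(paths) - 1, max(1, round(len(paths) * test_fraction)))
    return sorted(paths[n_test:]), sorted(paths[:n_test])


def write_split(train_paths, test_paths, merged_train_path, test_list_path):
    """Merge the training files into one file and list the test files.

    Assumes (not confirmed by the issue) that each file starts with the
    same header line. Test files are only listed, never merged.
    """
    header = None
    with open(merged_train_path, "w") as out:
        for path in train_paths:
            lines = Path(path).read_text().splitlines()
            if not lines:
                continue  # skip empty files
            if header is None:
                header = lines[0]
                out.write(header + "\n")
            elif lines[0] != header:
                raise ValueError(f"header mismatch in {path}")
            for line in lines[1:]:
                out.write(line + "\n")
    # One test data set path per line; each repertoire stays separate.
    Path(test_list_path).write_text("".join(p + "\n" for p in test_paths))
```

Usage would be along the lines of `train, test = split_repertoires(paths)` followed by `write_split(train, test, "train.csv", "test_list.txt")`, where the output file names are placeholders. Fixing the seed keeps the split reproducible across runs.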

matsen commented 5 years ago

Closed by https://github.com/matsengrp/vampire/commit/c01186cd55e9b27a5abda2c997fde25b1275f627