[suggestion] Remove --stranded and --has-control flags; infer these from input_data.json

This is just a suggestion, but I wanted to log it somewhere we have a formal channel to discuss it. Currently, in the input_data.json file, the user must provide control tracks and must also indicate the strandedness of the data. She must then also pass --stranded and --has-control to a whole slew of scripts. I propose that we rely on input_data.json alone to infer whether the data is stranded and whether it has controls. This would also give us a clean way to support datasets that mix stranded and unstranded tasks, and tasks with and without controls.

For example, `{ "task_nanog_plus" : {"strand" : 0, "task_id" : 0, "signal" : [...], "peaks" : [...], "control" : [...]}, "task_nanog_minus" : {"strand" : 1, "task_id" : 0, "signal" : [...], "peaks" : [...], "control" : [...]}, "task_mnase" : {"strand" : 0, "task_id" : 1, "signal" : [...], "peaks" : [...]} }` would be a valid input, with the control omitted for task_mnase.

In the case of mixed strandedness, the code would construct a model with the appropriate number of outputs (2*n_stranded_inputs + n_unstranded_inputs). Then either (1) the model would expect only the control tracks listed in the input, or (2) the model would expect a control track for every output, and the code would supply bias tracks full of zeroes wherever the user has not provided one.
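To make the proposal concrete, here is a rough sketch of how the inference could work. It is not code from basepairmodels: the function names `infer_layout` and `zero_bias_track`, the file names, and the rule "a task_id is stranded when it has entries for both strand 0 and strand 1" are all assumptions of mine for illustration.

```python
from collections import defaultdict


def infer_layout(input_data):
    """Infer strandedness and control presence from the parsed input_data.json.

    Assumption: entries sharing a task_id describe one dataset, and a dataset
    is stranded when it has entries for both strand 0 and strand 1.
    """
    strands_by_task = defaultdict(set)
    has_control = {}
    for name, entry in input_data.items():
        strands_by_task[entry["task_id"]].add(entry["strand"])
        # A missing or empty "control" list means the task is uncontrolled.
        has_control[name] = bool(entry.get("control"))

    n_stranded = sum(1 for s in strands_by_task.values() if s == {0, 1})
    n_unstranded = len(strands_by_task) - n_stranded
    return {
        "n_stranded": n_stranded,
        "n_unstranded": n_unstranded,
        # Output count from the proposal: two per stranded input, one otherwise.
        "n_outputs": 2 * n_stranded + n_unstranded,
        "has_control": has_control,
    }


def zero_bias_track(length):
    """Option (2): stand-in control of all zeroes for an uncontrolled task."""
    return [0.0] * length


# Hypothetical input mirroring the example above; file names are made up.
example_input = {
    "task_nanog_plus": {"strand": 0, "task_id": 0, "signal": ["nanog_plus.bw"],
                        "peaks": ["nanog.bed"], "control": ["ctl_plus.bw"]},
    "task_nanog_minus": {"strand": 1, "task_id": 0, "signal": ["nanog_minus.bw"],
                         "peaks": ["nanog.bed"], "control": ["ctl_minus.bw"]},
    "task_mnase": {"strand": 0, "task_id": 1, "signal": ["mnase.bw"],
                   "peaks": ["mnase.bed"]},
}

layout = infer_layout(example_input)
```

With something like this, each script could call the inference once instead of taking --stranded and --has-control, and option (2) would only need `zero_bias_track` for tasks where `has_control` is false.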

I need this sort of functionality because I'm mixing and matching all sorts of data types, some stranded, some controlled, and some both stranded and controlled.

Your thoughts on this proposal?

kundajelab / basepairmodels #14