The current implementation calls find_imbalance.r once per sample, which is quite inefficient.
R handles large datasets efficiently, so find_imbalance.r could instead process all of the samples in a single call, distinguishing them by a column that contains the sample name. This would require writing the input arguments to a file for the script to read.
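As a rough sketch of the caller side of this proposal, the per-sample arguments could be written to one tab-separated file with a sample-name column, and the script invoked once on that file. Everything named here is an assumption for illustration: the caller being Python, the helper names `write_combined_input`, `build_command`, and `run_find_imbalance`, the `sample` column name, and find_imbalance.r accepting the file path as its only argument. None of these come from the existing code.

```python
import csv
import subprocess
from pathlib import Path


def write_combined_input(per_sample_rows, out_path, sample_column="sample"):
    """Write every sample's rows to one TSV, tagging each row with its sample name.

    per_sample_rows maps a sample name to a list of dicts (one dict per row).
    The field names inside those dicts are hypothetical placeholders.
    """
    # Build the header: the sample column first, then every other field
    # in first-seen order.
    fieldnames = [sample_column]
    for rows in per_sample_rows.values():
        for row in rows:
            for key in row:
                if key not in fieldnames:
                    fieldnames.append(key)

    with open(out_path, "w", newline="") as handle:
        writer = csv.DictWriter(
            handle,
            fieldnames=fieldnames,
            delimiter="\t",
            lineterminator="\n",
            restval="NA",  # fields missing from a row are written as NA
        )
        writer.writeheader()
        for sample, rows in per_sample_rows.items():
            for row in rows:
                # The sample name is set last so it always wins over any
                # same-named key in the row.
                writer.writerow({**row, sample_column: sample})
    return Path(out_path)


def build_command(input_path, script="find_imbalance.r"):
    # Assumed command line: the script takes the combined file as its only argument.
    return ["Rscript", script, str(input_path)]


def run_find_imbalance(per_sample_rows, out_path):
    # One R invocation for all samples instead of one per sample.
    input_path = write_combined_input(per_sample_rows, out_path)
    subprocess.run(build_command(input_path), check=True)
```

On the R side, the script would then read this file once and group by the sample column; how it does so depends on the existing find_imbalance.r logic, which is not shown here.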