YeoLab / flotilla

Reproducible machine learning analysis of gene expression and alternative splicing data
http://yeolab.github.io/flotilla/docs
BSD 3-Clause "New" or "Revised" License

Problems with combat #345

Open olgabot opened 7 years ago

olgabot commented 7 years ago

When I try to run COMBAT on a categorical dataset, I get the error below. Here's the command:

corrected = combat(expression_filtered, case_filtered, pd.DataFrame())
print(corrected.shape)
corrected.head()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-35-3e829ff8900f> in <module>()
----> 1 corrected = combat(expression_filtered, case_filtered, pd.DataFrame())
      2 print(corrected.shape)
      3 corrected.head()

/home/obotvinnik/workspace-git/flotilla/flotilla/external/combat.py in combat(data, batch, model, numerical_covariates)
     98 
     99     sys.stderr.write("Standardizing Data across genes.\n")
--> 100     B_hat = np.dot(np.dot(la.inv(np.dot(design.T, design)), design.T), data.T)
    101     grand_mean = np.dot((n_batches / n_array).T, B_hat[:n_batch,:])
    102     var_pooled = np.dot(((data - np.dot(design, B_hat).T)**2),

TypeError: can't multiply sequence by non-int of type 'unicode'

> /home/obotvinnik/workspace-git/flotilla/flotilla/external/combat.py(100)combat()
     98 
     99     sys.stderr.write("Standardizing Data across genes.\n")
--> 100     B_hat = np.dot(np.dot(la.inv(np.dot(design.T, design)), design.T), data.T)
    101     grand_mean = np.dot((n_batches / n_array).T, B_hat[:n_batch,:])
    102     var_pooled = np.dot(((data - np.dot(design, B_hat).T)**2),
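
The message `can't multiply sequence by non-int of type 'unicode'` means a string was multiplied by another string, so one plausible reading (an assumption, not confirmed in this report) is that a string-valued column reached one of the `np.dot` calls. A minimal sketch of how one might check for that is below. The helper `non_numeric_columns` and the small `design` frame are hypothetical stand-ins, not part of flotilla or combat.py.

```python
import numpy as np
import pandas as pd


def non_numeric_columns(frame):
    """Return the names of columns whose dtype is not numeric."""
    return [col for col in frame.columns
            if not pd.api.types.is_numeric_dtype(frame[col])]


# Hypothetical stand-in for a design matrix with one string-valued column.
design = pd.DataFrame({
    "batch_1": [1.0, 0.0, 1.0],
    "phenotype": ["case", "control", "case"],
})

# List the columns that would break a numeric matrix product.
bad = non_numeric_columns(design)
print(bad)

# One common way to make such columns usable is one-hot encoding;
# whether that is appropriate here depends on the analysis (assumption).
encoded = pd.get_dummies(design, columns=bad, dtype=float)

# With every column numeric, the Gram matrix design.T @ design can be formed.
gram = np.dot(encoded.values.T, encoded.values)
print(gram.shape)
```

Running the same check on the inputs passed to `combat` would show whether any of them carries string values before the call reaches line 100.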

This combat code was originally just copied from combat.py, and I suggest removing it because it's not something I want to maintain.