mortazavilab / PyWGCNA

PyWGCNA is a Python package designed to do Weighted Gene Correlation Network analysis (WGCNA)
https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btad415/7218311
MIT License
209 stars, 48 forks

getDatTraits method might not be suitable for continuous traits. #86

Closed lwtan90 closed 8 months ago

lwtan90 commented 8 months ago

Hi,

If I follow the tutorial correctly, the following method is not useful when the sample info contains continuous values (such as height or weight): it converts every column to strings and creates a separate indicator column for each unique value.

def getDatTraits(self, metaData):
        data = self.datExpr.obs.copy()[metaData]
        datTraits = pd.DataFrame(index=data.index)
        for i in range(data.shape[1]):
            data.iloc[:, i] = data.iloc[:, i].astype(str)
            if len(np.unique(data.iloc[:, i])) == 2:
                datTraits[data.columns[i]] = data.iloc[:, i]
                org = np.unique(data.iloc[:, i]).tolist()
                rep = list(range(len(org)))
                datTraits.replace(to_replace=org, value=rep,
                                  inplace=True)
            elif len(np.unique(data.iloc[:, i])) > 2:
                for name in np.unique(data.iloc[:, i]):
                    datTraits[name] = data.iloc[:, i]
                    org = np.unique(data.iloc[:, i])
                    rep = np.repeat(0, len(org))
                    rep[np.where(org == name)] = 1
                    org = org.tolist()
                    rep = rep.tolist()
                    datTraits.replace(to_replace=org, value=rep, inplace=True)

        return datTraits
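To make the behavior concrete, here is a minimal standalone sketch that mimics the encoding logic quoted above on a plain pandas DataFrame. The helper name `encode_like_get_dat_traits` and the toy metadata are made up for illustration and are not part of PyWGCNA; the sketch only reproduces the string conversion and per-value indicator columns, not the exact `replace` calls.

```python
import numpy as np
import pandas as pd


def encode_like_get_dat_traits(meta: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper that mirrors the encoding logic quoted above."""
    traits = pd.DataFrame(index=meta.index)
    for col in meta.columns:
        # Every column is cast to strings first, so numbers lose their ordering.
        values = meta[col].astype(str)
        levels = np.unique(values)
        if len(levels) == 2:
            # Two levels: a single 0/1 column named after the metadata column.
            traits[col] = (values == levels[1]).astype(int)
        elif len(levels) > 2:
            # More than two levels: one indicator column per distinct value.
            for level in levels:
                traits[level] = (values == level).astype(int)
    return traits


# Toy sample info (assumed for illustration): one binary and one continuous trait.
meta = pd.DataFrame(
    {"sex": ["F", "M", "F", "M"], "height": [160.5, 172.0, 181.3, 168.9]},
    index=["s1", "s2", "s3", "s4"],
)
encoded = encode_like_get_dat_traits(meta)
print(encoded.columns.tolist())
```

With four samples and four distinct heights, the continuous column turns into four indicator columns, one per measured value, which is the problem I am describing.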

Is there any other function to correlate modules with continuous traits? Thank you.

Wilson

nargesr commented 8 months ago

Hi @lwtan90

You should be able to find your answer by looking at issue #55.

Please don't hesitate to reopen this issue if you still have further questions.
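For readers who land here without access to that issue, one generic way to handle continuous traits is to leave numeric columns untouched and correlate them with the module eigengenes directly. The sketch below is an assumption and not necessarily what issue #55 recommends: the function name `correlate_modules_with_traits`, the `eigengenes` table (samples by modules), and the toy values are all hypothetical, and it uses only pandas rather than any PyWGCNA API.

```python
import pandas as pd


def correlate_modules_with_traits(
    eigengenes: pd.DataFrame, traits: pd.DataFrame
) -> pd.DataFrame:
    """Pearson correlation of each module eigengene with each numeric trait.

    Hypothetical helper: rows of both inputs are samples sharing one index.
    """
    # Keep continuous (numeric) traits as they are; skip categorical columns.
    numeric = traits.select_dtypes(include="number")
    # Align trait rows to the sample order of the eigengene table.
    numeric = numeric.loc[eigengenes.index]
    result = pd.DataFrame(
        index=eigengenes.columns, columns=numeric.columns, dtype=float
    )
    for module in eigengenes.columns:
        for trait in numeric.columns:
            result.loc[module, trait] = eigengenes[module].corr(numeric[trait])
    return result


# Toy inputs (assumed for illustration only).
samples = ["s1", "s2", "s3", "s4"]
traits = pd.DataFrame(
    {"height": [160.5, 172.0, 181.3, 168.9], "sex": ["F", "M", "F", "M"]},
    index=samples,
)
eigengenes = pd.DataFrame(
    {
        # ME_a is an exact linear function of height, so its correlation is 1.
        "ME_a": [2.0 * h - 300.0 for h in traits["height"]],
        "ME_b": [0.1, -0.2, 0.3, -0.2],
    },
    index=samples,
)
module_trait_corr = correlate_modules_with_traits(eigengenes, traits)
print(module_trait_corr)
```

Categorical columns such as `sex` are skipped here and would still need 0/1 or indicator encoding before being correlated.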