Closed joewandy closed 4 years ago
Just wondering whether the activity levels are calculated correctly.
At this point, measurement_df has already been standardized (0 mean and unit variance). Should it be divided by np.sqrt(len(row_ids)) before the SVD?
pathway_data = measurement_df.loc[row_ids] / np.sqrt(len(row_ids)) # DF selected from peak IDs.
https://github.com/glasgowcompbio/PALS/blob/master/pals/PLAGE.py#L276
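A quick numerical check of the scaling question, as a sketch on random data rather than the actual PALS code: dividing a matrix by a positive constant divides its singular values by that constant but leaves the singular vectors unchanged (up to sign). The array shape and the names `X`, `n_rows` and `n_samples` are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_rows, n_samples = 5, 8  # hypothetical: 5 peaks in a pathway, 8 samples
X = rng.standard_normal((n_rows, n_samples))

# SVD with and without the 1/sqrt(n_rows) scaling
_, d_raw, c_raw = np.linalg.svd(X, full_matrices=False)
_, d_scaled, c_scaled = np.linalg.svd(X / np.sqrt(n_rows), full_matrices=False)

# The singular values shrink by a factor of sqrt(n_rows)
values_match = np.allclose(d_scaled, d_raw / np.sqrt(n_rows))

# The first right singular vector is the same up to an overall sign
sign = np.sign(c_raw[0] @ c_scaled[0])
vectors_match = np.allclose(c_raw[0], sign * c_scaled[0])

print(values_match, vectors_match)
```

So, at least in this toy setting, the division only affects anything downstream that uses the singular values themselves, not `c[0]` alone.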
The PLAGE paper defines it as follows:
'the activity level of a pathway in a given sample j is taken as the coefficient c_j for the first metagene.'
This is c[0] in our code, but below it is also multiplied by the corresponding eigenvalue d[0]. Is this correct?
pw_act_list.extend(list(c[0] * d[0]))
https://github.com/glasgowcompbio/PALS/blob/master/pals/PLAGE.py#L283
Makes no change to results, but removed from code to make things clearer. Commit 118199150ce342f6c9c115a91a937bb32df5d7eb.
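One way to see why a change like this could leave results untouched, sketched on random data and assuming `d` and `c` come from `np.linalg.svd` (which returns singular values as its second output): `d[0]` is a single positive number, so `c[0] * d[0]` is just a rescaled copy of `c[0]`. Any scale-invariant statistic computed per pathway, such as a two-sample t statistic, is then the same either way. The group split and the helper `welch_t` are hypothetical and are not taken from PALS.

```python
import numpy as np

def welch_t(a, b):
    """Two-sample t statistic with unequal variances (hypothetical helper)."""
    var_a = a.var(ddof=1) / len(a)
    var_b = b.var(ddof=1) / len(b)
    return (a.mean() - b.mean()) / np.sqrt(var_a + var_b)

rng = np.random.default_rng(1)
pathway = rng.standard_normal((6, 8))  # made-up data: 6 peaks, 8 samples

_, d, c = np.linalg.svd(pathway, full_matrices=False)

plain = c[0]            # coefficients of the first metagene
weighted = c[0] * d[0]  # the same coefficients times the first singular value

# Assumed design: first 4 samples are cases, last 4 are controls
t_plain = welch_t(plain[:4], plain[4:])
t_weighted = welch_t(weighted[:4], weighted[4:])

same_statistic = np.isclose(t_plain, t_weighted)
print(same_statistic)
```

The absolute activity values do differ by the factor `d[0]`, so the choice still matters if activity levels are compared across pathways rather than within one.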