lanagarmire / lilikoi2

GNU General Public License v3.0

lilikoi.PDSfun() designating 0.0 and 1.0 values for all pathways #5

Open RPevsner opened 1 year ago

RPevsner commented 1 year ago

Hi,

I was able to run the example data successfully by following the user guide, but I get strange output from the PDSfun() step when running my own dataset (attached). In PDSmatrix, the values for each pathway broadly increase from the left-most column to the right-most, with 0.0 consistently in the left-most position and 1.0 in the right-most. This pattern appears regardless of the sample order in the input CSV.

I first noticed this oddity in PDSmatrix when lilikoi.featuresSelection() reported more than 90% of all pathways as significant with a threshold of 0.9.
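One way to check the pattern described above is to summarise each pathway row of the matrix: its minimum, its maximum, and whether the values never decrease from left to right. The sketch below is only an illustration on a made-up matrix (`toy_pds` and the helper `summarise_pds_rows` are hypothetical names, not part of lilikoi). The same helper could be applied to the real PDSmatrix, assuming it is a numeric matrix with pathways as rows and samples as columns.

```r
# Toy stand-in for PDSmatrix: rows are pathways, columns are samples (made-up values)
toy_pds <- rbind(
  pathway_A = c(0.00, 0.20, 0.40, 0.60, 0.80, 1.00),  # spans 0..1 and only increases
  pathway_B = c(0.42, 0.10, 0.77, 0.31, 0.58, 0.25)   # no such pattern
)
colnames(toy_pds) <- paste0("S", 1:6)

# Hypothetical helper: per-pathway minimum, maximum, and a flag that is TRUE
# when the scores never decrease from the first column to the last
summarise_pds_rows <- function(pds) {
  data.frame(
    min = apply(pds, 1, min),
    max = apply(pds, 1, max),
    monotone = apply(pds, 1, function(x) all(diff(x) >= 0)),
    row.names = rownames(pds)
  )
}

row_summary <- summarise_pds_rows(toy_pds)
print(row_summary)

# Share of pathways showing the full 0-to-1 monotone pattern
mean(row_summary$min == 0 & row_summary$max == 1 & row_summary$monotone)
```

If nearly every row of the real matrix is flagged, the scores are likely tracking column position rather than biology.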

I don't believe the issue lies in the input data, as analysis with other tools (MetaboAnalyst and omu) ran smoothly and reported a few significant pathways.

Many thanks for your help


loaded_data <- lilikoi.Loaddata("./input_data/working_data.csv")

dataSet <- loaded_data$dataSet
Metadata <- loaded_data$Metadata
Metadata$Label <- as.factor(Metadata$Label)

pathway_table <- lilikoi.MetaTOpathway('name')

Metabolite_table <- pathway_table$table
PDSmatrix <- lilikoi.PDSfun(Metabolite_table)

selected_pathways_Weka <- lilikoi.featuresSelection(PDSmatrix, threshold = 0.9, method = "gain")
selected_pathways_Weka


PDSmatrix.csv working_data.csv

lanagarmire commented 1 year ago

Hi RPevsner, in the newest version we had to replace pathifier with pathTracer because of a license conflict. You can either (1) download the archived version of the old lilikoi, or (2) use the pathifier package directly in place of lilikoi.PDSfun, as follows:

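library(pathifier)
library(dplyr)  # provides %>%, filter() and select()

qvec <- Metabolite_table
Metadata$Label <- as.factor(Metadata$Label)
# Code the two label levels as 0/1 (the first factor level becomes 0)
phe <- as.numeric(Metadata$Label) - 1

# Keep metabolites that mapped to a pathway, with their query names and HMDB IDs
newData1 <- qvec %>% filter(pathway != 'NA') %>% select(Query, HMDB)
# Pull the matching metabolite columns from Metadata and rename them by HMDB ID
newData <- Metadata[, t(newData1['Query'])]
colnames(newData) <- t(newData1['HMDB'])
# Transpose so that metabolites are rows and samples are columns
newData <- t(newData)

PDS <- quantify_pathways_deregulation(as.matrix(newData), row.names(newData), lilikoi:::metabolites.list,
                                      lilikoi:::pathway.list, as.logical(phe), attempts = 5, min_exp = 0, min_std = 0)

# Collect the per-pathway scores into a matrix with one row per pathway
qpdmat <- matrix(as.data.frame(PDS$scores), nrow=length(names(PDS$scores)), byrow=TRUE)
colnames(qpdmat) <- colnames(newData)
rownames(qpdmat) <- names(PDS$scores)
mode(qpdmat) <- "numeric"
PDSmatrix <- qpdmat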
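
A note on the label coding in the snippet above: `as.numeric()` on a factor returns the level index, so subtracting 1 maps the first level to 0 (FALSE) and the second to 1 (TRUE). Assuming the fifth argument is pathifier's `normals` vector (TRUE marking the reference samples), it is worth confirming which group ends up TRUE. The labels below are made up for illustration and are not taken from the attached data.

```r
# Made-up two-group labels; factor levels are sorted alphabetically by default
label <- factor(c("Cancer", "Normal", "Normal", "Cancer"))

phe <- as.numeric(label) - 1   # level index minus 1: Cancer -> 0, Normal -> 1
normals <- as.logical(phe)     # 0 -> FALSE, 1 -> TRUE

# Cross-tabulate to see which label each logical value corresponds to
print(table(label, normals))
```

If the intended reference group comes out FALSE, the level order can be set explicitly with `factor(x, levels = ...)` before the conversion.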