egeulgen / pathfindR

pathfindR: Enrichment Analysis Utilizing Active Subnetworks
https://egeulgen.github.io/pathfindR/

Error in pathfindR with finding gene set #153

Closed richardcoca closed 1 year ago

richardcoca commented 1 year ago

Describe the bug A clear and concise description of what the bug is.

To Reproduce Steps to reproduce the behavior:

  1. Prepare input as:

library(pathfindR)

# create input data frame
input_df <- # imported Excel file with three columns
output_df <- run_pathfindR(input_df, 0.005)

  2. See error

Error in pathfindR::fetch_gene_set(gene_sets = gene_sets, min_gset_size = min_gset_size, : gene_sets should be one of “KEGG”, “Reactome”, “BioCarta”, “GO-All”, “GO-BP”, “GO-CC”, “GO-MF”, “cell_markers”, “mmu_KEGG”, “Custom”

Expected behavior I didn't get a pathway generated.

Desktop (please complete the following information):

R Session Information: R version 4.0.3 (2020-10-10) Platform: x86_64-apple-darwin17.0 (64-bit) Running under: macOS Big Sur 10.16

Matrix products: default LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Context If I rewrite the call to specify the gene set, I get a non-numeric p-value error instead.

egeulgen commented 1 year ago

This occurred because the second argument of run_pathfindR() is gene_sets, so the unnamed 0.005 was matched to gene_sets by position. Please try run_pathfindR(input_df, p_val_threshold = 0.005) and it should work.
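The positional matching described in this reply can be illustrated with a small stand-in. `toy_run()` is a hypothetical function, not part of pathfindR; it only mirrors what the reply states (`gene_sets` is the second argument, and `p_val_threshold` can be passed by name). The default values are placeholders.

```r
# Hypothetical stand-in: only the argument order and names follow the reply above.
toy_run <- function(input, gene_sets = "KEGG", p_val_threshold = 0.05) {
  list(gene_sets = gene_sets, p_val_threshold = p_val_threshold)
}

# Made-up one-row input, for illustration only.
input_df <- data.frame(gene = "A", lfc = 1.2, p = 0.001)

# An unnamed second value is matched by position, so it lands in gene_sets.
positional <- toy_run(input_df, 0.005)

# Naming the argument sends the value where it was intended.
named <- toy_run(input_df, p_val_threshold = 0.005)

print(positional$gene_sets)
print(named$p_val_threshold)
```

In the positional call the threshold never reaches `p_val_threshold`, which is consistent with the error message complaining about `gene_sets`.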

richardcoca commented 1 year ago

Hi,

I seem to get an error saying all p-values must be numeric immediately after. Is this referring to the p-values in my data frame or the inputted p-value threshold? (They're in scientific notation.)

R script:

library(pathfindR)
?fetch_gene_set

# create input data frame
install.packages("readxl")
library("readxl")

# This is the spreadsheet from the mass-spec core
nek9 <- read_excel("/Users/richardcoca/Library/Containers/com.microsoft.Excel/Data/Desktop/nek9_apms_result (3).xlsx")

# Let's look at our hek293t cell proteomics data set by making it a separate data frame
library("tidyverse")
hek293 <- select(nek9, Right, padj_chisq_RPE, lfc_293)

output_df <- run_pathfindR(hek293, p_val_threshold = 0.005)
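On the scientific-notation question, a minimal base-R sketch with made-up values may help: scientific notation is only a display format for numeric values, whereas a column imported as text is not numeric regardless of how it looks. The vectors `p_num` and `p_chr` are illustrative and not taken from the spreadsheet.

```r
# Made-up values, for illustration only.
p_num <- c(1e-05, 0.003, 2.5e-10)     # numeric, merely printed in scientific notation
p_chr <- c("1e-05", "0.003", "n/a")   # character, e.g. a column imported as text

print(is.numeric(p_num))
print(is.numeric(p_chr))

# as.numeric() parses scientific notation; entries it cannot parse become NA
# (R normally warns about this, the warning is suppressed here).
p_conv <- suppressWarnings(as.numeric(p_chr))
print(p_conv)
```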



egeulgen commented 1 year ago

The error refers to the p-values in the data frame. I think the column order is wrong. It should be gene name, lfc, p value.
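A minimal sketch of the reordering, using base R. The column names come from the script earlier in the thread; the rows are made up for illustration.

```r
# Toy data: column names from the thread, values invented.
hek293 <- data.frame(
  Right          = c("GENE1", "GENE2", "GENE3"),
  padj_chisq_RPE = c(1e-04, 0.02, 0.5),
  lfc_293        = c(2.1, -1.3, 0.2),
  stringsAsFactors = FALSE
)

# Reorder to: gene name, log fold change, p value.
hek293_ordered <- hek293[, c("Right", "lfc_293", "padj_chisq_RPE")]

str(hek293_ordered)
```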

richardcoca commented 1 year ago

I swapped the order and made sure to convert p-values that were in scientific notation to regular number formatting, but still get the error.

R script:

install.packages("pak") # if you have not installed "pak"
pak::pkg_install("pathfindR")

library(pathfindR)

# create input data frame
install.packages("readxl")
library("readxl")

# This is the spreadsheet from the mass-spec core
nek9 <- read_excel("/Users/richardcoca/Library/Containers/com.microsoft.Excel/Data/Desktop/nek9_apms_result (3).xlsx")

# Let's look at our hek293t cell proteomics data set by making it a separate data frame
library("tidyverse")
hek293 <- select(nek9, Gene_symbol, logFC, FDR_p)

# Remove non-numeric p values
hek293n <- hek293[!is.na(as.numeric(hek293$FDR_p)),]

# Remove na
hek293df <- na.omit(hek293n)

# Check to make sure values are numeric
sapply(hek293df, class)

output_df <- run_pathfindR(hek293df, p_val_threshold = 0.005)
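One detail of this script is worth separating out: `as.numeric()` is used only inside the row filter, so rows are dropped but the `FDR_p` column keeps whatever type it had on import. The sketch below shows the difference on made-up data in which the column is character. (In this thread the console reports the column as numeric already, so this may not be the cause here.)

```r
# Made-up data with a p-value column imported as text.
df <- data.frame(
  Gene_symbol = c("A", "B", "C"),
  logFC       = c(1.5, -0.7, 0.3),
  FDR_p       = c("1e-04", "0.02", "ND"),
  stringsAsFactors = FALSE
)

# Filtering on as.numeric() drops the unparseable row
# but leaves the column itself as character.
filtered <- df[!is.na(suppressWarnings(as.numeric(df$FDR_p))), ]
print(class(filtered$FDR_p))

# Assigning the converted values back changes the column type.
converted <- filtered
converted$FDR_p <- as.numeric(converted$FDR_p)
print(class(converted$FDR_p))
```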

Console

> nek9<-read_excel("/Users/richardcoca/Library/Containers/com.microsoft.Excel/Data/Desktop/nek9_apms_result (3).xlsx")
> # Let's look at our hek293t cell proteomics data set by making it a separate data frame
> library("tidyverse")
> hek293<-select(nek9, Gene_symbol, logFC, FDR_p)
> #Remove non-numeric p values
> hek293n <- hek293[!is.na(as.numeric(hek293$FDR_p)),]
> # Remove na
> hek293df<-na.omit(hek293n)
> # Check to make sure values are numeric
> sapply(hek293df,class)
Gene_symbol       logFC       FDR_p
"character"   "numeric"   "numeric"
> output_df <- run_pathfindR(hek293df, p_val_threshold = 0.005)
There is already a directory named "pathfindR_Results".
Writing the result to "pathfindR_Results(22)" not to overwrite any previous results.

Testing input

Error in pathfindR::input_testing(input, p_val_threshold) :
  p values must all be numeric
>
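The console shows `FDR_p` reported as numeric, yet the check still fails. One possible explanation, offered as an assumption and not confirmed anywhere in this thread: `read_excel()` returns a tibble, and indexing a tibble with `[, 3]` gives a one-column table, not a vector, so an `is.numeric()` test on that result would be `FALSE`. The base-R sketch below imitates tibble-style indexing with `drop = FALSE` on made-up data. Whether pathfindR's check is actually written this way is not established here.

```r
# Made-up data with a genuinely numeric p-value column.
df <- data.frame(
  Gene_symbol = c("A", "B"),
  logFC       = c(1.5, -0.7),
  FDR_p       = c(1e-04, 0.02),
  stringsAsFactors = FALSE
)

# Plain data frame: single-column indexing drops to a vector.
as_vector <- df[, 3]
print(is.numeric(as_vector))

# drop = FALSE imitates tibble-style indexing, which keeps a one-column table.
as_table <- df[, 3, drop = FALSE]
print(is.numeric(as_table))
```

If this is what is happening, converting the input with `as.data.frame()` before calling `run_pathfindR()` might be worth trying.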

