Closed Rohit-Satyam closed 2 months ago
Kindly note that this issue is limited to the current release, which you pushed 4 days ago, and is resolved by downgrading the packages to pathfindR.data_2.1.0 and pathfindR_2.3.1. The old function was faster as well.
Besides, I observed that pathfindR is discarding many genes in my analysis. We know from experience that the PPI data in STRING for Plasmodium is sparse and is mostly based on coexpression (also see this discussion). As a result, most of the interactions are filtered out even when I use a lower combined_score cut-off of 400, and nearly 43% of my genes are discarded (see log below) when I run pathfindR.
So do these genes that are not found in PIN undergo enrichment analysis or are discarded?
Number of genes in input after p-value filtering: 1412
pathfindR cannot handle p values < 1e-13. These were changed to 1e-13
Could not find any interactions for 604 (42.78%) genes in the PIN
Final number of genes in input: 808
So is pathfindR not useful for cases/organisms where the interaction data is not well established?
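The figures in the log above can be checked with a few lines of base R. The second half is only a toy illustration of how genes absent from a PIN would be separated from the rest: the gene names and the edge list are made up, and this is a sketch of the idea, not pathfindR's actual implementation.

```r
# Figures copied from the log above.
n_after_p_filter <- 1412
n_not_in_pin <- 604

# Genes remaining once those without any interaction are set aside.
n_final <- n_after_p_filter - n_not_in_pin
pct_missing <- round(100 * n_not_in_pin / n_after_p_filter, 2)

# Toy example with hypothetical identifiers (not real data).
input_genes <- c("geneA", "geneB", "geneC", "geneD")
pin_edges <- data.frame(
  from = c("geneA", "geneB"),
  to = c("geneB", "geneE"),
  stringsAsFactors = FALSE
)

# Every gene that appears at either end of an edge is "in the PIN".
pin_genes <- unique(c(pin_edges$from, pin_edges$to))
kept <- intersect(input_genes, pin_genes)
dropped <- setdiff(input_genes, pin_genes)
```

Here `n_final` and `pct_missing` reproduce the 808 genes and 42.78% reported in the log, while `kept` and `dropped` show the split for the toy input.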
Hey @Rohit-Satyam.
Regarding the bug: I might have introduced it when I updated the relevant function in the last release. I will investigate and try to resolve it.
Regarding your second comment, pathfindR is a tool for active-subnetwork search followed by enrichment. As such, it has the limitation that it will not perform as well when the protein interaction network contains few interactions, e.g. in your case for Plasmodium. However, the results should nonetheless be reliable.
This is a case of over-engineering a function: it first fetches the KEGG IDs for pathway genes from KEGG, then tries to convert the KEGG IDs to gene names using other data from KEGG. The conversion is not its responsibility. Therefore, I will remove the conversion functionality and return KEGG IDs (as in the previous behavior), so that users can convert the identifiers with a more appropriate tool (e.g. biomart) if they wish. I'll update here once the implementation is finished.
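For anyone who wants to do the conversion on their side, here is a minimal base R sketch. It assumes you already have a lookup table from the returned identifiers to the identifiers you want (for example, exported from an annotation tool). The helper `convert_gene_sets`, the pathway names and all IDs below are hypothetical placeholders, not part of pathfindR.

```r
# Hypothetical gene sets keyed by pathway; identifiers are placeholders.
gene_sets <- list(
  pathway_1 = c("id_1", "id_2", "id_3"),
  pathway_2 = c("id_2", "id_4")
)

# Hypothetical lookup table: names are source IDs, values are target IDs.
id_map <- c(id_1 = "GENE_A", id_2 = "GENE_B", id_3 = "GENE_C")

# Map every gene set through the lookup table, keeping the original
# identifier wherever no mapping is available.
convert_gene_sets <- function(gene_sets, id_map) {
  lapply(gene_sets, function(ids) {
    mapped <- unname(id_map[ids])
    ifelse(is.na(mapped), ids, mapped)
  })
}

converted <- convert_gene_sets(gene_sets, id_map)
```

Keeping unmapped identifiers rather than dropping them is a design choice in this sketch; filter them out instead if you only want successfully converted genes.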
The fixed function is now available in the development version of pathfindR. You can install it via:
install.packages("devtools") # if you have not installed "devtools"
devtools::install_github("egeulgen/pathfindR")
I will try to release a patch version (i.e. pathfindR 2.4.1) on CRAN soon.
Describe the bug When I run the following code, I get a list of pathways with gene descriptions for P. falciparum rather than PF3D7 IDs. This organism doesn't have proper gene symbols, so PF3D7 Ensembl IDs are mostly used.
To Reproduce
Expected behavior The expected behavior is a list of lists containing the Ensembl gene IDs.