ipb-halle / MetFragR

R package for MetFrag

Error retrieving candidates #21

Closed cmc493 closed 5 years ago

cmc493 commented 6 years ago

Hello,

I have been working with MetFragR without issue for the past few months. However, my code recently stopped working and now fails with the following error message (using the PubChem database):

Error: Could not open URL connection! https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/listkey/2863889546745629568/cids/TXT
Error retrieving candidates
java.lang.Exception
    at de.ipbhalle.metfraglib.functions.HelperFunctions.getInputStreamFromURL(HelperFunctions.java:66)
    at de.ipbhalle.metfraglib.database.OnlinePubChemDatabase.getCandidateIdentifiers(OnlinePubChemDatabase.java:134)
    at de.ipbhalle.metfraglib.database.OnlineExtendedPubChemDatabase.getCandidateIdentifiers(OnlineExtendedPubChemDatabase.java:49)
    at de.ipbhalle.metfraglib.process.CombinedMetFragProcess.retrieveCompounds(CombinedMetFragProcess.java:77)
    at de.ipbhalle.metfrag.r.MetfRag.runMetFrag(MetfRag.java:123)

When following the URL in a web browser, the following message appears, which suggests a server error but the fault is listed as "Client":

Status: 500
Code: PUGREST.ServerError
Message: GetOperationStatus failed
Detail: GetOperationStatus fault: Client, message: Unsupported request type

When using the ChemSpider database, the following error occurs:

Error retrieving candidates
java.lang.Exception
    at de.ipbhalle.metfraglib.database.OnlineChemSpiderDatabase.getCandidateIdentifiers(OnlineChemSpiderDatabase.java:121)
    at de.ipbhalle.metfraglib.database.AbstractDatabase.getCandidateIdentifiers(AbstractDatabase.java:29)
    at de.ipbhalle.metfraglib.process.CombinedMetFragProcess.retrieveCompounds(CombinedMetFragProcess.java:77)
    at de.ipbhalle.metfrag.r.MetfRag.runMetFrag(MetfRag.java:123)
730499 [main] ERROR de.ipbhalle.metfraglib.database.OnlineChemSpiderDatabase - Error: Could not perform database query. This could be caused by a temporal database timeout. Try again later.

Do you know if this is indeed a server side issue?

Thank you for your time, Corey

c-ruttkies commented 6 years ago

Hi Corey, did you use a valid ChemSpider token? If yes, it could really be related to a temporary timeout of the ChemSpider server. Has it ever worked for one of your queries? Best regards, Christoph

c-ruttkies commented 6 years ago

Could you send me the settings object (without ChemSpider token)?

cmc493 commented 6 years ago

Thank you for the fast reply!

I have been using the PubChem database. Since it stopped working, I obtained a ChemSpider token, but that did not work either. Here is the example code that gives me the error. If I include the PrecursorCompoundIDs, the code runs through with the PubChem database, but not with ChemSpider. The error occurs when it tries to retrieve the candidates online.

#
# load the package, then define the settings object
#
library(metfRag)
settingsObject<-list()
#
# set database parameters to select candidates
#
settingsObject[["DatabaseSearchRelativeMassDeviation"]]<-5.0
settingsObject[["FragmentPeakMatchAbsoluteMassDeviation"]]<-0.001
settingsObject[["FragmentPeakMatchRelativeMassDeviation"]]<-5.0

settingsObject[["MetFragDatabaseType"]]<-"PubChem"
#settingsObject[["MetFragDatabaseType"]]<-"ExtendedPubChem"
#settingsObject[["MetFragDatabaseType"]]<-"ChemSpider"

#
# the more information about the precursor is available,
# the more precise the candidate selection
#
settingsObject[["NeutralPrecursorMass"]]<-253.966126
settingsObject[["NeutralPrecursorMolecularFormula"]]<-"C7H5Cl2FN2O3"
#settingsObject[["PrecursorCompoundIDs"]]<-c("50465", "57010914", "56974741", "88419651", "23354334")
#
# pre and post-processing filter
#
# define filters to filter unconnected compounds (e.g. salts)
settingsObject[["MetFragPreProcessingCandidateFilter"]]<-c("UnconnectedCompoundFilter","IsotopeFilter")
settingsObject[["MetFragPostProcessingCandidateFilter"]]<-c("InChIKeyFilter")
#
# define the peaklist as 2-dimensional matrix
#
settingsObject[["PeakList"]]<-matrix(c(
90.97445, 681,
106.94476, 274,
110.02750, 110,
115.98965, 95,
117.98540, 384,
124.93547, 613,
124.99015, 146,
125.99793, 207,
133.95592, 777,
143.98846, 478,
144.99625, 352,
146.00410, 999,
151.94641, 962,
160.96668, 387,
163.00682, 782,
172.99055, 17,
178.95724, 678,
178.97725, 391,
180.97293, 999,
196.96778, 720,
208.96780, 999,
236.96245, 999,
254.97312, 999), ncol=2, byrow=TRUE)
#
# run MetFrag
#
scored.candidates<-run.metfrag(settingsObject)
#
# scored.candidates is a data.frame with scores and candidate properties
#

schymane commented 6 years ago

The compound IDs are probably specific to PubChem so they probably wouldn’t work with ChemSpider (or give strange results if they do work). You can get ChemSpider IDs by entering that formula into ChemSpider and just picking some candidate IDs, for instance. What happens if you try to reproduce exactly that query with the MetFragBeta web interface? Does it run? I tested a random query using PubChem quickly earlier and it seemed OK… https://msbi.ipb-halle.de/MetFragBeta/

cmc493 commented 6 years ago

I also tried using the web interface and it works perfectly. This is what prompted me to post here since the issue doesn't seem to be related to the server-side of things, but I was still unsure of what was going wrong.

schymane commented 6 years ago

I am also not sure what is going wrong. Do the test MetFrag functions in the ReSOLUTION package work? https://github.com/schymane/ReSOLUTION

See MetFragConfig and runMetFrag – examples are given in the functions (sorry I have to leave for now so can’t try myself)

cmc493 commented 6 years ago

The MetFrag functions in the ReSOLUTION package work. However, when searching for candidates using a chemical formula, the same error as before occurs. I tested the original MetFragR code and it runs through when searching with a NeutralPrecursorMass but not when searching with a NeutralPrecursorMolecularFormula.
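
One thing that can be checked offline is whether the two settings in the posted example agree with each other. The sketch below recomputes the monoisotopic mass of C7H5Cl2FN2O3 from standard isotope masses and derives the search window implied by a DatabaseSearchRelativeMassDeviation of 5 ppm. The helpers parse_formula and neutral_mass are hypothetical names written for this illustration and are not part of MetFragR; the parser only handles plain formulas without brackets or charges.

```r
# Monoisotopic masses (Da) of the most abundant isotope of each element.
monoisotopic <- c(C = 12.000000, H = 1.00782503, N = 14.00307401,
                  O = 15.99491462, F = 18.99840316, Cl = 34.96885268)

# Hypothetical helper: split a plain formula such as "C7H5Cl2FN2O3"
# into element counts. Brackets, charges and isotope labels are not handled.
parse_formula <- function(formula) {
  tokens <- regmatches(formula, gregexpr("[A-Z][a-z]?[0-9]*", formula))[[1]]
  elements <- sub("[0-9]+$", "", tokens)
  counts <- sub("^[A-Za-z]+", "", tokens)
  counts[counts == ""] <- "1"          # an element without a number occurs once
  tapply(as.numeric(counts), elements, sum)
}

# Hypothetical helper: sum the isotope masses weighted by the element counts.
neutral_mass <- function(formula) {
  counts <- parse_formula(formula)
  unknown <- setdiff(names(counts), names(monoisotopic))
  if (length(unknown) > 0) {
    stop("no mass tabulated for: ", paste(unknown, collapse = ", "))
  }
  sum(monoisotopic[names(counts)] * as.numeric(counts))
}

mass <- neutral_mass("C7H5Cl2FN2O3")
ppm <- 5.0                              # DatabaseSearchRelativeMassDeviation
window <- mass * ppm / 1e6              # absolute half-width of the search window

cat(sprintf("mass = %.6f Da, window = +/- %.5f Da at %g ppm\n", mass, window, ppm))
```

The computed mass agrees with the NeutralPrecursorMass of 253.966126 in the example to within rounding, so an inconsistency between the two settings is unlikely to explain why only the formula search failed.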

c-ruttkies commented 6 years ago

I ran your posted example and it worked nicely:

0 [main] INFO de.ipbhalle.metfraglib.database.OnlinePubChemDatabase - Fetching candidates from PubChem
8357 [main] INFO de.ipbhalle.metfraglib.process.CombinedMetFragProcess - Got 4 candidate(s)
9243 [pool-1-thread-1] INFO de.ipbhalle.metfraglib.process.ProcessingStatus - 30 %
9264 [pool-1-thread-1] INFO de.ipbhalle.metfraglib.process.ProcessingStatus - 50 %
9275 [pool-1-thread-1] INFO de.ipbhalle.metfraglib.process.ProcessingStatus - 80 %
9298 [pool-1-thread-1] INFO de.ipbhalle.metfraglib.process.ProcessingStatus - 100 %
9366 [main] INFO de.ipbhalle.metfraglib.process.CombinedMetFragProcess - 0 candidate(s) were discarded before processing due to pre-filtering
9366 [main] INFO de.ipbhalle.metfraglib.process.CombinedMetFragProcess - 0 candidate(s) discarded during processing due to errors
9366 [main] INFO de.ipbhalle.metfraglib.process.CombinedMetFragProcess - 0 candidate(s) discarded after processing due to post-filtering
9366 [main] INFO de.ipbhalle.metfraglib.process.CombinedMetFragProcess - Stored 4 candidate(s)

Could it be related to your internet connection? Do you maybe need to set proxy settings? Maybe you could test that query again from another network?

Best regards, Christoph
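
For readers who do sit behind a proxy, the suggestion above could look roughly like the following. This is a sketch under assumptions: the stack traces in this thread show the request being made from Java code, so it assumes the package starts its JVM through rJava, in which case the proxy has to be passed as JVM system properties before the JVM is initialised. The host proxy.example.org and port 8080 are placeholders, not values from this thread.

```r
# Assumption: MetFragR runs MetFrag in a JVM started via rJava, so the
# HTTPS requests to PubChem/ChemSpider are made by Java, not by R itself.
# JVM options only take effect if set BEFORE the JVM starts, i.e. before
# the package is loaded in a fresh R session.
options(java.parameters = c(
  "-Dhttps.proxyHost=proxy.example.org",  # placeholder proxy host
  "-Dhttps.proxyPort=8080"                # placeholder proxy port
))

# Load the package only after the options are set:
# library(metfRag)
```

If the JVM is already running in the session, changing java.parameters has no effect, so R needs to be restarted first.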

cmc493 commented 6 years ago

Hello,

I got into work this morning and ran the code again without changing anything, and it worked perfectly! I guess it was indeed a server-side issue. Thank you both for taking the time to assist me.

Best, Corey
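
Since the failure turned out to be transient, a simple retry wrapper is one way to make scripted runs more tolerant of such outages. The sketch below is generic R and not part of MetFragR; with_retry is a hypothetical helper. It only helps if the failing call signals an R error, and this thread does not confirm whether run.metfrag does that or merely logs the Java exception, so treat that as an assumption to verify.

```r
# Hypothetical helper: call fun() up to `attempts` times, pausing
# `wait_seconds` between failed attempts. Returns the first successful
# result, or raises an error if every attempt fails.
with_retry <- function(fun, attempts = 3, wait_seconds = 60) {
  for (i in seq_len(attempts)) {
    result <- tryCatch(fun(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    message(sprintf("attempt %d of %d failed: %s",
                    i, attempts, conditionMessage(result)))
    if (i < attempts) Sys.sleep(wait_seconds)
  }
  stop("all attempts failed; last error: ", conditionMessage(result))
}

# Usage with the example from this thread (needs the package and network
# access, so it is left commented out here):
# scored.candidates <- with_retry(function() run.metfrag(settingsObject))
```

A fixed pause keeps the sketch short; for longer outages, increasing the wait between attempts would put less load on the remote service.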

sneumann commented 5 years ago

Great to see this solved then. Yours, Steffen