Closed jvwong closed 3 years ago
@jvwong Is the evidence code that you mention like the eco in the following?
<bp:RelationshipXref rdf:ID="RelationshipXref_3595accc-f8ba-485c-abae-5b241ba43b55http___www_humanmetabolism_org__relationshipxref_1089218069">
<bp:comment rdf:datatype = "http://www.w3.org/2001/XMLSchema#string">REPLACED http://pathwaycommons.org/pc12/RelationshipXref_3595accc-f8ba-485c-abae-5b241ba43b55http___www_humanmetabolism_org__relationshipxref_1089218069</bp:comment>
<bp:id rdf:datatype = "http://www.w3.org/2001/XMLSchema#string">ECO:0000000</bp:id>
<bp:db rdf:datatype = "http://www.w3.org/2001/XMLSchema#string">evidence code ontology</bp:db>
</bp:RelationshipXref>
Not sure what you are asking. The evidence codes for PhosphoSitePlus are from the MI ontology:
<bp:EvidenceCodeVocabulary rdf:ID="EvidenceCodeVocabulary_8e1e80bfab79984903b7e3804345048e">
<bp:xref rdf:resource="#UnificationXref_molecular_interactions_ontology_MI_0421" />
<bp:term rdf:datatype = "http://www.w3.org/2001/XMLSchema#string">identification by antibody</bp:term>
</bp:EvidenceCodeVocabulary>
...
<bp:EvidenceCodeVocabulary rdf:ID="EvidenceCodeVocabulary_71fe7be44e80879bb8505ec68b171f5c">
<bp:xref rdf:resource="#UnificationXref_molecular_interactions_ontology_MI_0113" />
<bp:term rdf:datatype = "http://www.w3.org/2001/XMLSchema#string">western blot</bp:term>
</bp:EvidenceCodeVocabulary>
Okay, actually it was obvious. I was actually looking for something similar but for some reason I got confused later. Now, I see that.
As reflected in the BioPAX section of the factoid binary interactions document, to represent a factoid interaction we sometimes create a BioPAX interaction and another one that is the controlled of that interaction. Therefore, I selected the BioPAX interactions that are not the controlled of another BioPAX interaction as my root interactions, then looked for patterns starting from these roots. I also apply the evidence/evidence code filtering to these root interactions.
The problem is that in the PhosphoSitePlus BioPAX file some interactions are associated with evidence, but all of them are the controlled of some other interaction. In other words, none of the root interactions has any evidence, so no interaction passes the filter.
Does my approach of starting from root interactions sound right? If so, should I also check the evidence of the controlled interaction when the root interaction has no evidence?
A lot of candidates probably are controller-controlled. Have you revisited the 'Factoid binary interaction types' document? That describes the conversion rules. You'd just have to go in the opposite direction.
PSP puts all the PublicationXref and Evidence attributes on the controlled interaction.
A lot of candidates probably are controller-controlled. Have you revisited the 'Factoid binary interaction types' document? That describes the conversion rules. You'd just have to go in the opposite direction.
Yes, I did so.
PSP puts all the PublicationXref and Evidence attributes on the controlled interaction.
Okay, I was thinking that they would be in the controller. Then, I can check the controlled interaction.
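The root selection and the fallback to the controlled interaction's evidence discussed above could look roughly like the sketch below. It is only an illustration: the object shape (`id`, `evidence`, `controlled`) and the function names are hypothetical and do not come from the factoid code or from a BioPAX library.

```javascript
// Hypothetical in-memory shape: { id, evidence: [...], controlled: <id or null> }.
// A real BioPAX model would come from a parser; these objects are for illustration only.
function findRootInteractions(interactions) {
  // Any interaction that appears as the "controlled" of another one is not a root.
  const controlledIds = new Set(
    interactions.map(i => i.controlled).filter(id => id != null)
  );
  return interactions.filter(i => !controlledIds.has(i.id));
}

function evidenceFor(root, byId) {
  // Prefer evidence on the root; otherwise fall back to its controlled interaction.
  if (root.evidence.length > 0) return root.evidence;
  const controlled = root.controlled != null ? byId.get(root.controlled) : undefined;
  return controlled ? controlled.evidence : [];
}

function rootsWithEvidence(interactions) {
  const byId = new Map(interactions.map(i => [i.id, i]));
  return findRootInteractions(interactions)
    .filter(root => evidenceFor(root, byId).length > 0);
}

const sample = [
  { id: 'Catalysis_1', evidence: [], controlled: 'BiochemicalReaction_1' },
  { id: 'BiochemicalReaction_1', evidence: ['MI:0113'], controlled: null },
  { id: 'MolecularInteraction_1', evidence: [], controlled: null }
];
console.log(rootsWithEvidence(sample).map(i => i.id));
```

In this sample the Catalysis root passes because its controlled reaction carries the evidence, while a root with neither its own evidence nor a controlled interaction is dropped.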
PSP puts all the PublicationXref and Evidence attributes on the controlled interaction.
Considering the factoid binary interaction types, there is a type, MolecularInteraction, which cannot have a controlled. In this case, if I eliminate the interactions that do not have evidence, then I will always be skipping MolecularInteraction. Also, since you mentioned that the PublicationXref is also stored on the controlled interaction, we will be skipping it anyway. Should I delete the case that handles MolecularInteraction, since it will never work in practice?
The important criteria are:
I've tried to configure the 'master.factoid.baderlab.org' instance to the best of my knowledge but you should double check:
https://github.com/BaderLab/sysadmin/blob/master/websites/factoid.md#master-instance-settings
Looks OK
@metincansiper I am trying to test this locally. Are there any specific setup requirements that you used to test this?
This is mine:
factoid branch: unstable
factoid-converters branch: master
grounding-search branch: master
URL: https://www.pathwaycommons.org/archives/PC2/v12/PathwayCommons12.psp.BIOPAX.owl.gz
I was getting some weird errors (creating a bunch of 'secret' tables) so just need to know if there are any details.
There are two errors I'm seeing upon POST to /api/document:
In this case the app throws a bunch of Reql errors and seems to try to create multiple secret tables.
info: Updating document-level related papers for doc 77246bc4-58a0-45b5-9cb7-dc27912ebd15
info: POST /api/document 200 3208.757 ms - 36332
error: Error getRelPprsForDoc: HTTPStatusError: Too Many Requests (429)
info: Updating network-level related papers for doc 77246bc4-58a0-45b5-9cb7-dc27912ebd15
error: Aggregate get failed
error: FetchError: invalid json response body at http://localhost:3011/get reason: Unexpected end of JSON input
at /.../factoid/node_modules/node-fetch/lib/index.js:272:32
at runMicrotasks (<anonymous>)
at processTicksAndRejections (internal/process/task_queues.js:97:5)
at async Promise.all (index 0)
....
@jvwong I see a similar error. Something may be broken; I am looking into it.
@jvwong I found the problem causing the error FetchError: invalid json response body at http://localhost:3011/get reason: Unexpected end of JSON input. The grounding search get service returns an error for some options. The one I detected is {"id":"P29678","namespace":"uniprot"}. It works well for the options {"id":"16818","namespace":"ncbi"}. I do not know if it is a problem specific to uniprot entities.
Could there be an issue with the grounding search service? I used the following code to try things out independently of the factoid code:
const fetch = require('node-fetch');

// working:
const opts = {"id":"16818","namespace":"ncbi"};
// error:
// const opts = {"id":"P29678","namespace":"uniprot"};
const url = 'http://localhost:3002/get';

fetch( url, {
  method: 'POST',
  body: JSON.stringify(opts),
  headers: {
    'Content-Type': 'application/json'
  }
} )
  .then( res => res.json() )
  .then( console.log );
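One way to avoid the 'Unexpected end of JSON input' failure on the caller's side is to read the response as text and only parse it when there is something to parse. The sketch below assumes the failing requests come back with an empty or non-OK response, which the thread does not confirm; `getGrounding` and its return shape are hypothetical names, and the fetch implementation is passed in so the function works with node-fetch or a stub.

```javascript
// fetchFn is injected so the sketch runs with node-fetch, a global fetch, or a test stub.
async function getGrounding(fetchFn, url, opts) {
  const res = await fetchFn(url, {
    method: 'POST',
    body: JSON.stringify(opts),
    headers: { 'Content-Type': 'application/json' }
  });
  // Read the body as text first: calling res.json() on an empty body throws
  // "Unexpected end of JSON input".
  const text = await res.text();
  if (!res.ok || text.length === 0) {
    // Report the failing options instead of throwing, so the caller can skip the entity.
    return { ok: false, status: res.status, opts };
  }
  return { ok: true, status: res.status, data: JSON.parse(text) };
}
```

With node-fetch this would be called as `getGrounding(require('node-fetch'), 'http://localhost:3002/get', opts)`, and entries that come back with `ok: false` can be logged and skipped rather than aborting the whole conversion.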
The UniProt accession P29678 points to an organism, 'rabbit' (https://www.uniprot.org/uniprot/P29678), which is not among the supported organisms. The stuff from PSP BioPAX is all human, so it looks like some mapping went wrong?
@jvwong since it has the uniprot namespace, it cannot be the result of the mapping; it must be the input to the mapping. Therefore, I suspect it actually comes from the PSP file in some way. However, the file cannot be downloaded now. Is that related to the network issue that you mentioned?
Yeah, all our VMs are totally messed up, including PC. I'll let you know when it's back up, but it might be a few days.
The UniProt accession P29678 points to an organism, 'rabbit' (https://www.uniprot.org/uniprot/P29678), which is not among the supported organisms. The stuff from PSP BioPAX is all human, so it looks like some mapping went wrong?
@jvwong I found it in PSP Biopax file:
<bp:UnificationXref rdf:ID="UnificationXref_uniprot_knowledgebase_P29678">
<bp:id rdf:datatype = "http://www.w3.org/2001/XMLSchema#string">P29678</bp:id>
<bp:db rdf:datatype = "http://www.w3.org/2001/XMLSchema#string">uniprot knowledgebase</bp:db>
</bp:UnificationXref>
OK I see an example of this: https://apps.pathwaycommons.org/pathways?uri=http://pathwaycommons.org/pc12/Catalysis_61738de4a156b1f6b48691e258df24bb
@jvwong then should we just skip the non human genes?
We do support some non-human species (below).
const SORTED_MAIN_ORGANISMS = [
  new Organism(2697049, 'SARS-CoV-2'),
  new Organism(227984, 'SARS-CoV'),
  new Organism(9606, 'Homo sapiens'),
  new Organism(10090, 'Mus musculus'),
  new Organism(ROOT_STRAINS.SCERVISIAE, 'Saccharomyces cervisiae', SCERVISIAE_STRAIN_IDS),
  new Organism(7227, 'Drosophila melanogaster'),
  new Organism(ROOT_STRAINS.ECOLI, 'Escherichia coli', ECOLI_STRAIN_IDS),
  new Organism(6239, 'Caenorhabditis elegans'),
  new Organism(3702, 'Arabidopsis thaliana'),
  new Organism(10116, 'Rattus norvegicus'),
  new Organism(7955, 'Danio rerio')
];
Okay, I will filter for those organisms, assuming that grounding search works fine with all of them.
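A minimal sketch of such an organism filter is below. The taxon IDs are copied from the SORTED_MAIN_ORGANISMS list quoted in this thread; the yeast and E. coli entries are omitted because their ROOT_STRAINS values are not shown here. The entity shape (`id`, `namespace`, `organism`) and the name `filterSupported` are assumptions for illustration, not the code in the supplement_db_filter branch.

```javascript
// Taxon IDs taken from the SORTED_MAIN_ORGANISMS list in this thread.
// Yeast and E. coli are left out: their ROOT_STRAINS values are not shown here.
const SUPPORTED_TAXON_IDS = new Set([
  2697049, 227984, 9606, 10090, 7227, 6239, 3702, 10116, 7955
]);

// Hypothetical entity shape: { id, namespace, organism }, organism being an NCBI taxon ID.
function filterSupported(entities) {
  return entities.filter(e => SUPPORTED_TAXON_IDS.has(e.organism));
}

const entities = [
  { id: 'P06213', namespace: 'uniprot', organism: 9606 }, // human
  { id: 'P29678', namespace: 'uniprot', organism: 9986 }, // rabbit
  { id: 'P00517', namespace: 'uniprot', organism: 9913 }  // bovine
];
console.log(filterSupported(entities).map(e => e.id));
```

Here only the human entry survives; the rabbit and bovine accessions that caused the grounding errors are dropped before any lookup is attempted.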
@jvwong I did some work to eliminate the unsupported organisms (I have not committed it yet). However, grounding search is still not working for some supported organisms. One example I found is this human gene: https://www.uniprot.org/uniprot/P06213 ({"id":"P06213","namespace":"uniprot"}). The search for ncbi is also failing in some cases. The cases I checked were human genes: https://www.ncbi.nlm.nih.gov/gene/1956 and https://www.ncbi.nlm.nih.gov/gene/4610.
What are you trying to search for again?
https://grounding.baderlab.org doesn't seem like it's working and https://master.grounding.baderlab.org/ looks like it's down. Maybe there's another issue with the cluster
What are you trying to search for again?
I am getting the ncbi grounding for the genes coming from the PSP file. First I map the xref to an ncbi id, then I call the grounding search for the mapped ncbi id. If the original xref is from uniprot, I do the uniprot-to-ncbi mapping using the grounding search as well. The problem is that grounding search is failing for some parameters, such as {"id":"P06213","namespace":"uniprot"}, {"id":"1956","namespace":"ncbi"} and {"id":"4610","namespace":"ncbi"}, all of which represent human genes.
https://grounding.baderlab.org doesn't seem like it's working and https://master.grounding.baderlab.org/ looks like it's down. Maybe there's another issue with the cluster
I am also seeing that https://master.grounding.baderlab.org/ is frequently down. Therefore, I have been running grounding search on my localhost.
It's timing out when trying to download ncbi/uniprot/chebi data. I'll put something up ASAP.
BTW the code updates I made to filter out the unsupported organisms are in this branch (https://github.com/PathwayCommons/factoid/tree/supplement_db_filter). I did not open a PR since the filtering alone does not look sufficient for now, as I mentioned earlier:
I made some work to eliminate the unsupported organisms (I did not commit it yet). However, grounding search is still not working for some supported organisms. One example I found is this gene with human organism: https://www.uniprot.org/uniprot/P06213 ({"id":"P06213","namespace":"uniprot"}). Also, the search for ncbi is also failing in some cases. The cases I checked were having human organism: (https://www.ncbi.nlm.nih.gov/gene/1956 and https://www.ncbi.nlm.nih.gov/gene/4610).
I can't reproduce your 'grounding search' errors. These all work fine for me.
Yes, they worked for me too. I remember them not working in the past, but maybe I just did something wrong then. I do not know.
I tried this PhosphoSitePlus BioPAX conversion with your new branch and a local instance of everything (grounding, converter, index, db). INDRA can't handle all the requests and ends up complaining about too many requests (429), then 'Unavailable', then Internal Server Error.
@jvwong @maxkfranz the speed was not so important for this service. Am I right about that?
No, speed doesn't matter at all. Maybe there's a way to filter the documents a little more as well.
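Since speed is not a concern, one option for the 429 responses is to issue the requests one at a time and back off when the server says 'Too Many Requests'. The sketch below is a generic pattern, not the factoid implementation: `runSequentially` is a hypothetical helper, and it assumes the rejected error exposes the HTTP code as a `status` field.

```javascript
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

// Runs tasks one at a time; retries a task when it rejects with status 429.
// Each task is a function returning a promise. Errors are assumed to carry a `status` field.
async function runSequentially(tasks, { retries = 3, baseDelayMs = 1000 } = {}) {
  const results = [];
  for (const task of tasks) {
    let attempt = 0;
    for (;;) {
      try {
        results.push(await task());
        break;
      } catch (err) {
        // Give up on non-429 errors or when the retry budget is spent.
        if (err.status !== 429 || attempt >= retries) throw err;
        // Exponential backoff: baseDelayMs, then 2x, 4x, ...
        await sleep(baseDelayMs * 2 ** attempt);
        attempt += 1;
      }
    }
  }
  return results;
}
```

Each document-creation request would be wrapped as a task, for example `() => postDocument(doc)` with `postDocument` standing in for whatever function performs the POST.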
@jvwong I am looking into Too Many Requests (429), but while doing that I again came across some cases where the grounding service is failing. One of them is {"id":"P00517","namespace":"uniprot"}. The cases I reported as not working in the past work for me now, but I see new cases not working.
P00517
That's unsupported organism: Bos taurus (Bovine) https://www.uniprot.org/uniprot/P00517
Yes, sorry for the confusion. I checked and saw that I had mistakenly undone the code I added for organism filtering, and I did not check the organism since I was relying on that filtering.
I created #940 referencing this issue.
Description
Q: What is the name of the feature?
A: Add to Factoid database documents, created from an external (existing) database
Q: What does this feature enable the user to do?
A: Access more (expert-)curated interactions supported by literature via Factoid
Q: What are the applicable constraints, e.g. compatibility or performance?
Q: How does this feature affect each class of user (persona)?
There are a few benefits to introducing external data into the Factoid database:
a) 'Frame-out' the Factoid app with useful, quality, curated data
b) A lot of the data is for older articles that wouldn't otherwise be curated
c) Can serve as pre-made Factoids that we can ask the authors to 'verify'
d) Begins the idea of unifying PC and Factoid (search/import)
Specification
Details
My first thought is to use PhosphositePlus (PSP) PC BioPAX
bp:Evidence with bp:evidenceCode one of
Factoid documents are one-to-one with articles, and interactions are added to an article/document. So in this case, interactions would have to be grouped under the same PMID. Not sure how many PSP interactions fall under the same PMID, and I wouldn't merge genes/interactions in output
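Grouping interactions under the same PMID, so that each group can become one Factoid document, could be sketched as follows. The interaction shape (`id`, `pmid`) and the name `groupByPmid` are hypothetical; in practice the PMID would be read from each interaction's PublicationXref.

```javascript
// Hypothetical interaction shape: { id, pmid }, with pmid taken from a PublicationXref.
function groupByPmid(interactions) {
  const groups = new Map();
  for (const interaction of interactions) {
    if (!groups.has(interaction.pmid)) groups.set(interaction.pmid, []);
    groups.get(interaction.pmid).push(interaction);
  }
  return groups; // Map of pmid -> interactions for one article/document
}

const grouped = groupByPmid([
  { id: 'Catalysis_a', pmid: '111' },
  { id: 'Catalysis_b', pmid: '222' },
  { id: 'Catalysis_c', pmid: '111' }
]);
console.log([...grouped.keys()]);
```

No merging of genes or interactions is attempted here; each interaction is kept as-is within its article's group.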
doc.fromJson(obj)
id: new uuid
elements: new uuid
secret (might be generated for you)
elements field so you can specify the proteins and interactions (elements should just be in the json format)
doc.fromJson()
Elements using the element API
elements json https://github.com/PathwayCommons/factoid/blob/d145d5735e34e719acd8ed380786e9f6bec79a86/src/server/routes/api/document/index.js#L1239-L1265
Notes
Document
)