RTXteam / RTX

Software repo for Team Expander Agent (Oregon State U., Institute for Systems Biology, and Penn State U.)
https://arax.ncats.io/
MIT License

Some protein nodes are missing descriptions #20

Closed dkoslicki closed 6 years ago

dkoslicki commented 6 years ago

E.g., P08563

dkoslicki commented 6 years ago

Found another such node:

match (n) where n.name="DOID:10923" return n

saramsey commented 6 years ago

Confirmed this bug in the current dev KG (in the container rtxsteve on rtxdev.saramsey.org).

saramsey commented 6 years ago

Confirmed fixed in http://rtxsteve.saramsey.org:7474

[Screenshot, 2018-03-29, 2:56:14 pm]

saramsey commented 6 years ago

Working on copying the Neo4j database to the less-expensive "rtxdev" EC2 instance now....

saramsey commented 6 years ago

Updated database has been pushed to http://rtxdev.saramsey.org:7674

[Screenshot, 2018-03-29, 3:02:02 pm]

saramsey commented 6 years ago

All drugs now have descriptions:

[Screenshot, 2018-03-29, 3:06:30 pm]

saramsey commented 6 years ago

Looks like 303 protein nodes lack descriptions. It is not clear from a search of the codebase where these are coming from. More investigation needed.

[Screenshot, 2018-03-29, 3:09:52 pm]

dkoslicki commented 6 years ago

Note also that some of the target edges are missing probabilities:

match p=(n:pharos_drug)-[t:targets]-(:uniprot_protein) return t.probability limit 10

t.probability
--
null
null
null
0.03231776208
0.00615026285
7.3254e-7
1
0.00000596881
0.00000803035
0.00000190193

saramsey commented 6 years ago

Can you give me a pair of ChEMBL ID and Uniprot ID?


Stephen Ramsey Assistant Professor, Oregon State University


dkoslicki commented 6 years ago

MATCH p=(s:pharos_drug)-[r:targets]->(t:uniprot_protein) where r.probability is Null return s.name, t.name, r.probability limit 10

s.name | t.name | r.probability
-- | -- | --
"CHEMBL1166" | "Q9H244" | null
"CHEMBL1166" | "Q9BQB6" | null
"CHEMBL1166" | "P03952" | null
"CHEMBL1166" | "P38435" | null
"CHEMBL1201244" | "P02708" | null
"CHEMBL1201244" | "P11230" | null
"CHEMBL1201244" | "Q07001" | null
"CHEMBL1201244" | "P08913" | null
"CHEMBL602" | "P13569" | null
"CHEMBL602" | "P37231" | null
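
To see which drugs account for the missing values, the ten rows above can be tallied per drug. This is only a sketch: the rows are copied by hand from the query output, and `count_missing_by_drug` is a hypothetical helper name, not something from the RTX code base.

```python
from collections import Counter

# Rows copied from the query result above:
# (drug ChEMBL ID, protein UniProt ID, edge probability).
rows = [
    ("CHEMBL1166", "Q9H244", None),
    ("CHEMBL1166", "Q9BQB6", None),
    ("CHEMBL1166", "P03952", None),
    ("CHEMBL1166", "P38435", None),
    ("CHEMBL1201244", "P02708", None),
    ("CHEMBL1201244", "P11230", None),
    ("CHEMBL1201244", "Q07001", None),
    ("CHEMBL1201244", "P08913", None),
    ("CHEMBL602", "P13569", None),
    ("CHEMBL602", "P37231", None),
]


def count_missing_by_drug(rows):
    """Count edges whose probability is missing, keyed by drug ID."""
    return Counter(drug for drug, _protein, prob in rows if prob is None)


missing = count_missing_by_drug(rows)
for drug, n in missing.most_common():
    print(drug, n)
```

In this sample, the ten null edges belong to only three drugs, which suggests the gaps may cluster by drug rather than being spread evenly.
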

dkoslicki commented 6 years ago

@saramsey Closing the loop on this issue: some of the probabilities are still not populated:

match p=(n:chemical_substance)-[t:directly_interacts_with]-(m:protein) where not exists(t.probability) return t.probability, n.description, m.description limit 100

E.g., the edge connecting triclofos and GABRE.

If this is due to Pharos not having the info, then it's fine and we can close this issue. I just wanted to make sure it wasn't a problem with the orangeboard construction itself.

saramsey commented 6 years ago

Yes, this is a Pharos issue (see screen cap). Since we attempt to get drug-target relationships from ChEMBL first and only then from Pharos, I'm not sure what else can be done here.

[Screen capture attachment]
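
The lookup order described above (ChEMBL first, Pharos as a fallback) might be sketched roughly as follows. Everything here is an assumption for illustration: the two dictionaries stand in for the real knowledge sources, `drug_a` and `protein_x` are made-up placeholders, `lookup_targets` is a hypothetical name, and the sketch assumes that only the ChEMBL path supplies a probability. It is not the actual orangeboard construction code.

```python
from typing import Dict, List, Optional

# Hypothetical stand-ins for the two knowledge sources (not real API calls).
# Assumption: the ChEMBL path returns a probability per target.
CHEMBL_TARGETS: Dict[str, Dict[str, float]] = {
    "drug_a": {"protein_x": 0.9},  # made-up placeholder entry
}
# Assumption: the Pharos path lists targets without a probability.
PHAROS_TARGETS: Dict[str, List[str]] = {
    "triclofos": ["GABRE"],  # the edge mentioned in this thread
}


def lookup_targets(drug: str) -> Dict[str, Optional[float]]:
    """Return {target: probability}, trying ChEMBL before Pharos."""
    chembl_hits = CHEMBL_TARGETS.get(drug)
    if chembl_hits:
        return dict(chembl_hits)
    # Fallback: Pharos targets carry no probability, so the edge gets None.
    return {target: None for target in PHAROS_TARGETS.get(drug, [])}


print(lookup_targets("drug_a"))
print(lookup_targets("triclofos"))
```

Under these assumptions, any edge that comes only from the fallback source ends up with a null probability, which matches what the query in this thread reports.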



dkoslicki commented 6 years ago

Fair enough: let's chalk it up to a shortcoming of the KSs. I guess we can close this issue then.