Closed dkoslicki closed 4 months ago
@chunyuma let me know if you don't have the bandwidth to work on this and I can assign someone else
Hi @dkoslicki, I can work on this, but it seems relevant to more than just overlay(action=fisher_exact_test). I might need to ask @amykglen for suggestions.
Hi @amykglen, based on @dkoslicki's suggestion, do you think inferring the node type from the preferred category could be done within expand()? Would this affect anything? If it would, I can call NodeSynonymizer to infer the preferred node type within fisher_exact_test instead.
that seems fine to me, @chunyuma. can't think of any problems it could create.
Ah, I didn't know if this involved expand or not. Maybe that was one of the attributes that got dropped in #1359? If not, hopefully it would be a simple attribute lookup.
nope, category wasn't dropped in #1375/#1359.
@amykglen, I remember asking you a similar question before. You said that if the user didn't set the node type, expand would not automatically assign one, right? I don't remember the reason for this. If you still don't agree that we should automatically assign a node type to a query node that lacks one, I can infer it only within fisher_exact_test, so that it does not affect the query graph. What do you think? Thanks!
ah, I see. that sounds familiar, but I think the system has changed so much it's probably not relevant anymore.
we could make ARAX_expander.py add the preferred category to the QG. but should this alter the message.query_graph (that all other modules use)? or only the copy of the QG that expand uses? if the former, does it make sense for that to happen in expand, or should it happen upstream? (like in add_qnode()?)
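To make the two options concrete, here is a minimal sketch of the difference between filling in the category on a private copy and altering the shared query graph. It uses plain dicts as a stand-in for the QG; the function name, the `in_place` flag, and the `infer_category` callback are all hypothetical and are not the actual ARAX code.

```python
import copy


def fill_missing_categories(query_graph, infer_category, in_place=False):
    """Return a query graph whose category-less qnodes have been filled in.

    query_graph is a plain-dict stand-in for the QG (an assumption).
    infer_category is a hypothetical callback: (qnode_key, qnode) -> category.
    """
    # Work on a deep copy unless the caller explicitly wants the shared
    # query graph (the one every other module sees) to be altered.
    qg = query_graph if in_place else copy.deepcopy(query_graph)
    for qnode_key, qnode in qg["nodes"].items():
        if not qnode.get("category"):
            qnode["category"] = infer_category(qnode_key, qnode)
    return qg


# Illustrative data only; EXAMPLE:1 is a made-up identifier.
shared_qg = {"nodes": {"n0": {"id": "EXAMPLE:1", "category": None}}, "edges": {}}
local_qg = fill_missing_categories(shared_qg, lambda key, qnode: "biolink:NamedThing")
```

With the default `in_place=False`, `shared_qg` is left untouched and only `local_qg` carries the inferred category, which corresponds to the "only the copy" option. Passing `in_place=True` corresponds to altering the query graph that all modules use.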
I think the safe way to add the preferred category is to infer it only within fisher_exact_test, if other modules don't need this information. If other modules also need it, perhaps adding it in an upstream module might be better. I'm not sure.
I kinda like the idea of having it in add_qnode, so everything downstream will know as much info as possible (will help some KP's that require categories specified). A while ago, I thought about this for add_qnode by name as well: before we had the Synonym tab, I wished there was a way to get the curie and category from the name so other KP's could use it.
@amykglen with the new intelligent KP selection in expand, would adding the category (and possibly curie) let other KP's be invoked? I recall some of them requiring this info.
yeah, I think that it would - I think in the past it wasn't necessarily a good thing to infer the category for qnodes because the 'preferred category' wasn't as trustworthy and many KPs were sensitive to category, so you might end up getting no results by inferring it, but I think now it would be a good thing to do.
You can add_qnode() with a name: add_qnode(name=acetaminophen) works. I think there was/is category assigning code in add_qnode() but it was disabled for the reason @amykglen describes. There is also an ongoing discussion in TRAPI about what to do when both an id and a category are supplied (and the category is not the same as the KG expects). The outcome is not clear yet.
Do we have a decision on this issue? Should we assign the node type in add_qnode or in fisher_exact_test?
It seems like this issue is related to issue #1817. In issue #1817, our conclusion was to assign the NamedThing category to any qnode that is not given a specific node type. So I will follow the approach from issue #1817 to fix this issue as well.
This issue has been resolved. The FET now sets any node without a specified node type to 'biolink:NamedThing' and uses that to calculate the FET p-value. This change needs further testing in production. Once it works, I think we can close this issue.
@chunyuma - I think the ideal solution would be to use the 'preferred' category for a given curie according to the Node Synonymizer, rather than just assigning NamedThing (for pinned query nodes). I think it makes sense to do that within FET rather than add_qnode since I'm not aware of other modules that need that information at this point.
Thanks @amykglen. I changed it to use the 'preferred' category when the Node Synonymizer returns one, and to assign NamedThing otherwise.
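A rough sketch of that fallback, for reference. This is not the actual FET code: `resolve_category` and the `lookup_preferred` callback are hypothetical names, and the lookup is only a stand-in for however the Node Synonymizer is queried. The curie and category in the demo table are illustrative.

```python
from typing import Callable, Optional

NAMED_THING = "biolink:NamedThing"


def resolve_category(
    curie: Optional[str],
    declared_category: Optional[str],
    lookup_preferred: Callable[[str], Optional[str]],
) -> str:
    """Pick the category to use for a qnode in the Fisher's exact test.

    lookup_preferred is a hypothetical stand-in for a synonymizer query:
    it maps a curie to its preferred category, or returns None.
    """
    # 1. Respect whatever the user specified on the qnode.
    if declared_category:
        return declared_category
    # 2. For a pinned qnode, try the preferred category for its curie.
    preferred = lookup_preferred(curie) if curie else None
    # 3. Otherwise fall back to the most general category.
    return preferred or NAMED_THING


# Illustrative lookup table standing in for the synonymizer (assumption).
fake_preferred = {"EXAMPLE:1": "biolink:Drug"}

known = resolve_category("EXAMPLE:1", None, fake_preferred.get)
unknown = resolve_category("EXAMPLE:2", None, fake_preferred.get)
declared = resolve_category("EXAMPLE:1", "biolink:ChemicalSubstance", fake_preferred.get)
```

Here `known` takes the preferred category from the lookup, `unknown` falls back to NamedThing, and `declared` keeps the category the user set.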
Closing this as it has been completed.
For this query:
I get the following error:
2021-04-21T21:00:21.421021 INFO: Performing Fisher's Exact Test to add p-value to edge attribute of virtual edge
2021-04-21T21:00:21.843015 ERROR: [UnknownError] Subject node with qnode key n0 was set to None in Query Graph. Please specify the node type
Even though the category wasn't specified, it really should infer it from the preferred category.