Closed cmungall closed 8 years ago
Current plan:
We will do a first pass of the complex creator where these will be class expressions. E.g.
'protein complex' and has-part some P1 and ... and .. has-part some Pn
We will later switch these to individuals, but this will require some changes to the folding code.
Initially, we'll model some of this as a model-level workbench that allows commands to be sent back.
The client will create an expression that looks like this. The GO class will be fixed. The members can be any molecular entity.
{
  'type': 'intersection',
  'expressions': [
    {
      'type': 'class',
      'id': 'GO:0032991'
    },
    {
      "type": "svf",
      "property": {
        'type': "property",
        'id': "BFO:0000051"
      },
      "filler": {
        "type": "class",
        "id": "UniProtKB:P0000001"
      }
    },
    {
      "type": "svf",
      "property": {
        'type': "property",
        'id': "BFO:0000051"
      },
      "filler": {
        "type": "class",
        "id": "UniProtKB:P0000002"
      }
    },
    {
      "type": "svf",
      "property": {
        'type': "property",
        'id': "BFO:0000051"
      },
      "filler": {
        "type": "class",
        "id": "UniProtKB:P0000099"
      }
    }
  ]
}
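As a rough illustration of what the client side has to assemble, here is a minimal Python sketch that builds the same structure from a list of member IDs. The function name `complex_expression` and its parameters are hypothetical (they are not part of the class-expression library); only the dictionary layout and the two IDs are taken from the example above.

```python
import json

HAS_PART = "BFO:0000051"        # property ID used in the example expression
PROTEIN_COMPLEX = "GO:0032991"  # the fixed GO class from the example expression


def complex_expression(member_ids, complex_class=PROTEIN_COMPLEX):
    """Build an intersection: the complex class plus one has-part svf per member.

    Hypothetical helper for illustration only.
    """
    if not member_ids:
        raise ValueError("a complex needs at least one member")
    parts = [
        {
            "type": "svf",
            "property": {"type": "property", "id": HAS_PART},
            "filler": {"type": "class", "id": member},
        }
        for member in member_ids
    ]
    return {
        "type": "intersection",
        "expressions": [{"type": "class", "id": complex_class}] + parts,
    }


if __name__ == "__main__":
    expr = complex_expression(["UniProtKB:P0000001", "UniProtKB:P0000002"])
    # json.dumps always emits double-quoted strings, so the output is valid JSON.
    print(json.dumps(expr, indent=2))
```

Keeping the GO class as a default argument mirrors the plan that the class is fixed while the members vary.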
The mechanism is already easy to do with the class-expression library (https://github.com/berkeleybop/class-expression). Really, all that's needed here is the form (including adding N parts and clearing). Not a particularly tall order (although I'll probably grab a newer framework to do it).
This should now be publicly available. I'm having a little trouble filtering for the bioentities over CHEBI, but it might just be my browser; will check later.
Select a model, pull down workbenches, select macromolecular model creator.
Can we make field 1 have a default value of "protein complex"? This is what's used in the majority of cases.
Something is not right with field 2. It should act just like the enabled_by field. I can't seem to select anything starting with "abcb" right now. I suspect it's not using the noctua-golr solr instance.
Weirdly, there are some Shh genes, but not all.
OWL:
Individual: <http://model.geneontology.org/5662325600000018/5662325600000019>
Annotations:
<http://geneontology.org/lego/hint/layout/x> "75"^^xsd:string,
<http://purl.org/dc/elements/1.1/contributor> "http://orcid.org/0000-0002-6601-2165"^^xsd:string,
<http://purl.org/dc/elements/1.1/date> "2015-12-07"^^xsd:string,
<http://geneontology.org/lego/hint/layout/y> "75"^^xsd:string
Types:
<http://identifiers.org/uniprot/Q15465>
and <http://purl.obolibrary.org/obo/GO_0043234>
and <http://www.informatics.jax.org/accession/MGI:MGI:98297>
Two independent issues:
The class expression comes out as 'protein-complex and P1 and P2'. It should be 'protein-complex and (has-part some P1) and (has-part some P2)'.
We can work with 2 for now, but we need to hook up the complex creator to the correct solr.
However, we can work with the above for demo purposes for now.
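To make the difference between the two forms concrete, here is a small Python sketch that renders an expression dictionary as a Manchester-style string. The renderer `to_manchester`, the helpers `cls` and `has_part`, and the `labels` mapping are all hypothetical names introduced for illustration; the dictionary layout follows the JSON spec earlier in this ticket.

```python
def cls(identifier):
    """Hypothetical helper: a named class expression."""
    return {"type": "class", "id": identifier}


def has_part(identifier, prop="BFO:0000051"):
    """Hypothetical helper: a 'has-part some X' restriction (svf)."""
    return {
        "type": "svf",
        "property": {"type": "property", "id": prop},
        "filler": cls(identifier),
    }


def to_manchester(expr, labels=None):
    """Render a class-expression dict as a Manchester-style string."""
    labels = labels or {}

    def name(identifier):
        # Fall back to the raw ID when no label is known.
        return labels.get(identifier, identifier)

    kind = expr["type"]
    if kind == "class":
        return name(expr["id"])
    if kind == "svf":
        filler = to_manchester(expr["filler"], labels)
        if expr["filler"]["type"] == "intersection":
            filler = "(" + filler + ")"
        return "({} some {})".format(name(expr["property"]["id"]), filler)
    if kind == "intersection":
        return " and ".join(to_manchester(e, labels) for e in expr["expressions"])
    raise ValueError("unsupported expression type: {!r}".format(kind))


if __name__ == "__main__":
    labels = {"GO:0043234": "protein-complex", "BFO:0000051": "has-part"}
    wrong = {"type": "intersection",
             "expressions": [cls("GO:0043234"), cls("P1"), cls("P2")]}
    right = {"type": "intersection",
             "expressions": [cls("GO:0043234"), has_part("P1"), has_part("P2")]}
    print(to_manchester(wrong, labels))
    print(to_manchester(right, labels))
```

The first form intersects the complex class with the protein classes themselves, which says the thing is simultaneously a complex and each protein; the second wraps each member in a has-part restriction.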
For your first comment, I will look at getting a default value there (and fixing the spinner).
Your second comment is actually answered by https://github.com/geneontology/noctua/pull/239#event-484568713 , which I'll look at switching now.
I don't understand the "older version of noctua-golr" comment.
The "has-part" should be committed now; will test and roll out in a bit.
By the way, where are you getting that JSON from? If it's code we have control over, it's not technically correct:
{
  "type": "intersection",
  "expressions": [
    {
      "type": "class",
      "id": "GO:0032991"
    },
    {
      "type": "svf",
      "property": {
        "type": "property",
        "id": "BFO:0000051"
      },
      "filler": {
        "type": "class",
        "id": "UniProtKB:P0000001"
      }
    },
    {
      "type": "svf",
      "property": {
        "type": "property",
        "id": "BFO:0000051"
      },
      "filler": {
        "type": "class",
        "id": "UniProtKB:P0000002"
      }
    },
    {
      "type": "svf",
      "property": {
        "type": "property",
        "id": "BFO:0000051"
      },
      "filler": {
        "type": "class",
        "id": "UniProtKB:P0000099"
      }
    }
  ]
}
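For anyone checking a hand-written snippet before posting, the issue is just the quoting: JSON strings must be double-quoted. A quick Python check using only the standard `json` module (the helper name `is_valid_json` is made up for this example):

```python
import json

# Single-quoted keys and values, as in the hand-written spec.
single_quoted = "{'type': 'class', 'id': 'GO:0032991'}"
# The same object with JSON-conformant double quotes.
double_quoted = '{"type": "class", "id": "GO:0032991"}'


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


if __name__ == "__main__":
    print(is_valid_json(single_quoted))
    print(is_valid_json(double_quoted))
```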
"older version of noctua-golr"
This may be a diversion. In the previous neo, they looked like http://www.informatics.jax.org/accession/MGI:MGI:98297
In the current one they look like: http://purl.obolibrary.org/obo/MGI_MGI%98297
However, the conversion to the older form may be from the roundtrip through minerva. @hdietze can confirm.
"By the way, where are you getting that JSON from?"
I authored it in emacs and didn't validate. I was intending to make a PR with a test for bbop-class-expression but didn't get that far.
"Your second comment is actually answered by #239 (comment)"
Not seeing the connection but maybe doesn't matter
Yes, the http://www.informatics.jax.org/accession/MGI:MGI:98297 is coming from the curie handler, which has the old(?) mapping. If the request contains only the short form (i.e. when done from any Golr/AmiGO source), the curie handler expands the MGI: prefix into the jax long form.
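A minimal sketch of this kind of prefix expansion, in Python. This is not the actual curie handler code: the function name `expand_curie` and the contents of `PREFIX_MAP` are assumptions, with the base IRIs inferred from the long forms that appear in the OWL dump above, and the short form assumed to be `MGI:MGI:98297`.

```python
# Assumed prefix map; base IRIs inferred from the long forms seen in this ticket.
PREFIX_MAP = {
    "MGI": "http://www.informatics.jax.org/accession/MGI:",
    "UniProtKB": "http://identifiers.org/uniprot/",
    "GO": "http://purl.obolibrary.org/obo/GO_",
}


def expand_curie(curie, prefix_map=PREFIX_MAP):
    """Expand 'PREFIX:local' to a long-form IRI; return input unchanged if unknown."""
    # Split on the first colon only, so 'MGI:MGI:98297' keeps 'MGI:98297' as local part.
    prefix, sep, local_id = curie.partition(":")
    if not sep or prefix not in prefix_map:
        return curie
    return prefix_map[prefix] + local_id


if __name__ == "__main__":
    for short in ("MGI:MGI:98297", "UniProtKB:Q15465", "GO:0043234"):
        print(short, "->", expand_curie(short))
```

Swapping the `MGI` entry for a different base IRI is all it would take to produce the newer form, which is why an out-of-date mapping shows up as old-style IRIs after a round trip.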
For the JSON: good--I just wanted to make sure some of our code wasn't producing that.
I believe the svfs are properly in there now (they didn't register the first time I looked at your spec).
Production should be filtering correctly now, and the choices do come up...eventually. However, I'm having trouble with some selectize quirks where hits don't always show up at first. I'll look at that along with the default choice and the spinner.
Nested structure looks fine now, only UI quirks remaining.
intersection[3] is a bit opaque.
Each of these may be minor UI issues that should be in their own tix.
1 is the autocomplete issues mentioned here: https://github.com/geneontology/noctua/issues/169#issuecomment-162699191 . Again, they /do/ show up eventually. Not sure what's up yet.
2 This was done in the past, and looked so bad/took so much space that it was dropped by common agreement.
3 It's weird that's not gotten on the round trip...maybe @hdietze would have some idea?
@cmungall Should BFO:0000051 be RO:0002180?
TODO: need to fix login/token info on workbench (keeps token, but doesn't resolve...).
Deployed now. Probably works. See what happens.
It works!
We want to allow protein complexes to be the value/filler of enabled_by slots, as well as simple gene products.
Protein complexes can be pre-composed (e.g. in PRO or Intact) or post-composed.
Tickets open about being able to use pre-composed complex: #122 #120 -- this is essentially a matter of adding the relevant stuff to the import chain and/or golr (and possibly relaxing the autocomplete constraint on the enabled_by box).
Not all complexes will be composed in advance, so we need a way to construct them.
Typically complexes will be mereological sums of proteins. We write these here for convenience as P1 + P2 + ... + Pn. Formally this is similar to an OWL class expression 'macromolecular complex' and has_member P1 and has_member ... (not equivalent, as the OWL expression is not closed).
In some cases you may want to describe individual members of the complex in different states (e.g. phosphorylated). We call this the advanced case, and focus on the simple case here.
There are various options for the UI. First it's important to note that there are different ways in which an enabled_by field can be filled in
It is actually possible to do complexes now using route 3: simply create an instance of 'protein complex', create N instances of gene products P1, ..., Pn, connect them to the parent complex via part_of, then connect the complex to the activity via enabled_by. But this is a bit low level (and additionally folding is not invoked). Also, we end up with more individuals; we may want to use the equivalent class expression instead (@hdietze to comment).
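To show why the individual-based route is low level, here is a rough Python sketch of the bookkeeping it implies. Everything here is illustrative: the function `build_complex_individuals`, the `ind-N` identifiers, and the plain dict/tuple layout are assumptions, not the minerva request format. The default class ID is the 'protein complex' ID that appears in the OWL dump in this ticket.

```python
from itertools import count


def build_complex_individuals(activity_id, member_classes, complex_class="GO:0043234"):
    """Return the individuals and facts needed to model one complex by hand.

    Hypothetical data layout: individuals are dicts, facts are
    (subject, relation, object) tuples.
    """
    ids = count(1)

    def new_individual(cls):
        return {"id": "ind-{}".format(next(ids)), "type": cls}

    complex_ind = new_individual(complex_class)
    members = [new_individual(cls) for cls in member_classes]

    # One part_of edge per member, pointing at the parent complex.
    facts = [(m["id"], "part_of", complex_ind["id"]) for m in members]
    # Finally connect the activity to the complex.
    facts.append((activity_id, "enabled_by", complex_ind["id"]))

    return {"individuals": [complex_ind] + members, "facts": facts}


if __name__ == "__main__":
    model = build_complex_individuals("activity-1", ["UniProtKB:Q15465", "MGI:MGI:98297"])
    for fact in model["facts"]:
        print(fact)
```

For N members this creates N + 1 individuals and N + 1 edges, compared with a single class expression on one individual in the folded form.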
We should have a way of doing this using 1 and/or 2. It should probably be consistent across both (although 2 is inherently a more generic UI component).
One would be to allow the enabled_by slot in the wizard to take multiple values, where this is implicitly a mereological sum. But this may not work well with autocomplete, which expects a single value (there is a secret OWL tunnel here at the moment: any Manchester expression can be entered).
Another would be some kind of + symbol below the slot that allows for multiple values, and some logic that treats these as the appropriate intersection of someValuesFrom expressions.