kltm closed this issue 8 years ago
One option is to use VIVO/GRID: https://grid.ac/downloads
Not clear if this is best. It's large, it's not clear how easy it is to add entries, and the entities we care about are not just institutions. We also want to give provenance to specific projects (typically funded).
We will most likely maintain our own list of organizations as a yaml that can be easily updated.
-
  id: https://www.ucl.ac.uk/functional-gene-annotation/cardiovascular
  label: Cardiovascular Gene Annotation
-
  ...
with the obvious json-ld/ttl translation. We can add owl:sameAs for GRID IDs if required.
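A rough sketch of what that translation could look like, written against already-parsed YAML records. The choice of rdfs:label for the label, the `same_as` key, and the placeholder GRID URI are all assumptions made for illustration; none of them is an agreed schema.

```python
# Sketch only: predicates (rdfs:label, owl:sameAs) and the "same_as" key are
# assumptions, not an agreed schema for the groups file.
groups = [
    {
        "id": "https://www.ucl.ac.uk/functional-gene-annotation/cardiovascular",
        "label": "Cardiovascular Gene Annotation",
        # Hypothetical placeholder, not a real GRID identifier.
        "same_as": ["https://example.org/grid/PLACEHOLDER"],
    },
]

PREFIXES = (
    "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n"
    "@prefix owl: <http://www.w3.org/2002/07/owl#> .\n"
)


def escape_literal(text):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def group_to_turtle(group):
    """Render one group record as Turtle triples, one per line."""
    lines = [
        '<{}> rdfs:label "{}" .'.format(group["id"], escape_literal(group["label"]))
    ]
    # owl:sameAs links are optional, as suggested for GRID IDs.
    for other in group.get("same_as", []):
        lines.append("<{}> owl:sameAs <{}> .".format(group["id"], other))
    return "\n".join(lines)


def groups_to_turtle(records):
    """Render all group records as a small Turtle document."""
    body = "\n".join(group_to_turtle(g) for g in records)
    return PREFIXES + "\n" + body + "\n"


print(groups_to_turtle(groups))
```

Keeping the group URL as the subject IRI means the same string can serve as the "id" in the YAML and as the node in the graph.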
@mellybelly thoughts?
We will use these via our ugly entity-as-literal hack for now (cc @balhoff), so the ttl would be:
dc:??? "https://www.ucl.ac.uk/functional-gene-annotation/cardiovascular"^^xsd:string
@cmungall I note that db-xrefs.yaml does not currently specify "id", but "name" and "label" serve the same function. Do you think overloading is worthwhile, or should we kick it out a bit? For users.yaml, I'd propose a new field, "organizations", replacing the current "organization"; it would be a list of "id"s found in the file.
This is more analogous to users.yaml (in fact, we previously discussed overloading this, as used to be done for ontology provenance). We could use uri for consistency.
There may be value in keeping one organization as prime/current in users.yaml even if noctua doesn't use it. We could treat membership as its own entity:
nickname: Midori
roles:
  -
    type: member
    organization: <pombase-uri>
    start: ...
    end: ..
  -
    ...
overmodeling vs future proofing? Not clear.
I'm not sure I follow: for roles.type, would there be anything besides member in users.yaml? If not, why have the extra information? Also, I'm not sure start/end would be necessary, and would add a lot of maintenance cruft/overhead we wouldn't want to become long-term maintainers of that data. Ideally, we'd know functionally when people were part of an org by the edit times on their operations...which actually is bringing me around on another point that you made...
(For example, we can currently see date and contributor annotations on an individual, but if there are multiple edits to a single individual by multiple people at different times, there is no way to see what was done by whom and when. The only way around that would be to spin out (at least) contributor as an individual and then have the date annotations there. Narf.)
On 13 Sep 2016, at 15:45, kltm wrote:
> I'm not sure I follow: for roles.type, would there be anything besides member in users.yaml? If not, why have the extra information? Also, I'm not sure start/end would be necessary, and would add a lot of maintenance cruft/overhead we wouldn't want to become long-term maintainers of that data. Ideally, we'd know functionally when people were part of an org by the edit times on their operations...which actually is bringing me around on another point that you made...
It's at least a natural way to model current vs. not current, which is useful, but YMMV.
> (For example, we can currently see date and contributor annotations to an individual, but if there are multiple edits to a single individual by multiple people at different time, there is no way to see what was done by who/when. The only way around that would be to spin out (at least) contributor as an individual and then have the date annotations there. Narf.)
I'm fine with just a bundle of flat triples here. Anything else would be very complex, not well aligned with other attribution models, and not really required.
Imagine the query: get me all models I worked on for the Heart Foundation for the 2015 fiscal year. Without something richer, we'll be unable to get near anything like that. At least, we'd probably want to store both creation and modification, which would get us a lot more resolution, but still fall short of getting good granular reporting.
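To make the shape of that query concrete, here is a minimal sketch over flat edit records. The record layout, the identifiers, and the use of a calendar year as a stand-in for the fiscal year are all hypothetical; the point is only that each edit needs a contributor, a group, and a date before the filter is possible at all.

```python
from datetime import date

# Hypothetical edit records; in practice these would be derived from
# per-operation annotations on the models.
edits = [
    {"model": "model-1", "contributor": "user-a", "group": "group-x", "date": date(2015, 3, 2)},
    {"model": "model-2", "contributor": "user-a", "group": "group-y", "date": date(2015, 5, 10)},
    {"model": "model-3", "contributor": "user-a", "group": "group-x", "date": date(2016, 2, 1)},
    {"model": "model-1", "contributor": "user-a", "group": "group-x", "date": date(2015, 9, 30)},
    {"model": "model-4", "contributor": "user-b", "group": "group-x", "date": date(2015, 6, 1)},
    {"model": "model-5", "contributor": "user-a", "group": "group-x", "date": date(2015, 11, 11)},
]


def models_worked_on(records, contributor, group, start, end):
    """Return sorted model ids that `contributor` edited for `group`
    on any date in the inclusive range [start, end]."""
    return sorted(
        {
            r["model"]
            for r in records
            if r["contributor"] == contributor
            and r["group"] == group
            and start <= r["date"] <= end
        }
    )


# Assumption: the reporting period is passed in explicitly, since fiscal-year
# boundaries differ between funders.
print(models_worked_on(edits, "user-a", "group-x", date(2015, 1, 1), date(2015, 12, 31)))
```

With only a single date and contributor per individual, the `group` and per-edit `date` columns in this table would not exist, which is the gap described above.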
As a light first pass, just to have something to play with, how about a file roles.yaml as you described in https://github.com/geneontology/noctua/issues/350#issuecomment-246841631, and an additional (optional?) field for users.yaml: "roles", that references that file?
Final discussion: groups.yaml, and a "groups" listing in users.yaml, the latter being optional for the time being.
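One consequence of this layout is that every id in a user's "groups" list should resolve to an entry in groups.yaml. A minimal consistency check might look like the sketch below. It works on already-parsed data (plain lists and dicts standing in for the two files), and the user nicknames and the unknown group URL are made up for illustration.

```python
# Stand-ins for the parsed contents of groups.yaml and users.yaml.
# The nicknames and the example.org URL are hypothetical.
groups = [
    {
        "id": "https://www.ucl.ac.uk/functional-gene-annotation/cardiovascular",
        "label": "Cardiovascular Gene Annotation",
    },
]

users = [
    {
        "nickname": "user-a",
        "groups": ["https://www.ucl.ac.uk/functional-gene-annotation/cardiovascular"],
    },
    # "groups" is optional for the time being, so this entry is valid.
    {"nickname": "user-b"},
    {"nickname": "user-c", "groups": ["https://example.org/unknown-group"]},
]


def unknown_group_refs(user_records, group_records):
    """Return (nickname, group id) pairs whose group id is not defined
    in the groups file."""
    known = {g["id"] for g in group_records}
    problems = []
    for user in user_records:
        for group_id in user.get("groups", []):
            if group_id not in known:
                problems.append((user["nickname"], group_id))
    return problems


for nickname, group_id in unknown_group_refs(users, groups):
    print("{}: unknown group {}".format(nickname, group_id))
```

A check like this could run wherever the metadata files are validated, so a typo in a group id is caught before it reaches the API.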
Remember ORCID is only an instance identifier. We really need a plan for organization instances too; they are all over the place. GRID is the best out there after 10+ years. They are curating, are collaborative, and are coordinating with Wikidata. That said, it seems like you have some non-org types of groups. 'Cardiovascular Gene Annotation' is not really an organization or even a group; it's more of a focused effort consisting of a narrower group of GO curators? These probably wouldn't need to go into GRID and could live somewhere else with a landing page/grey lit citation, but if they are really orgs like 'EBI' then they should go into GRID. owl:sameAs for GRID IDs is fine, but please help populate GRID if you can.
I like the yaml model to represent people with different "hats". It's not really so different from the implementation in vivo-ISF, where you can have different roles for different lengths of time in different organizations. Agree here with @kltm though: I don't think you need to declare when they worked in an org; this information is available elsewhere for many people and isn't really all that relevant here.
Regarding other roles: will there be curation roles? QA roles? evaluation roles? New contribution roles can go into the new contribution ontology @marijane, she can also advise on above. Send some more examples.
This is important, as I want to be able to aggregate curation contributions for the ISB website on a per person and/or per org, and/or date range basis.
Currently looking at maybe using "contributor" or "publisher" for groups, but don't have a real good feel for that: "publisher" would be easier for queries down the road, but may be wrong, "contributor" would slot in easier to what we have, and make a certain sense, but disambiguating would be harder. @cmungall will have a think about this.
I would have thought that PROV might help us here. E.g. PROV-O Primer
However, prov:actedOnBehalfOf seems to assume that people are time-sliced, e.g.
This doesn't help us much...
@dosumis and the ontology group also want a primary contact or contacts for each group: https://github.com/geneontology/go-site/issues/231
Just checking in to see if any progress here. Anything I can do to help?
https://github.com/geneontology/go-site/issues/231#issuecomment-253906140 The first step will be merging the data. After that come the changes to make sure that the data is picked up in the API; then I'll coordinate with @balhoff to make sure that it round-trips correctly.
@cmungall You told me to poke you about this at Monday meeting. Any last thoughts before I flip a coin? See https://github.com/geneontology/noctua/issues/350#issuecomment-247724934 .
Summary so far:
We should have an answer shortly
Do we want to switch what we currently use to PAV while we're at it?
no, dc is fairly standard, just doesn't have the granularity we need here
@balhoff I've taken a look at the different ways of injecting the group information into the request stream, and I now think the least awkward would be to add it at the request set packet level, rather than at the level of each operation--just like the uids.
The structure and flow would be almost the same as for a uid. This would be an optional list of group id strings that would be round-tripped and applied to the operations almost exactly like the uid, except that it is optional and may have a cardinality greater than one.
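As an illustration of the packet-level idea, the sketch below stamps a packet's uid and optional group list onto each operation. The field names (`uid`, `groups`, `requests`, `operation`) are hypothetical and are not taken from the actual Minerva or Barista message format.

```python
def apply_provenance(packet):
    """Copy the packet-level uid and optional group ids onto every operation.

    Returns new operation dicts; the incoming packet is left untouched.
    """
    uid = packet.get("uid")
    groups = list(packet.get("groups", []))  # optional, may hold several ids
    stamped = []
    for op in packet.get("requests", []):
        new_op = dict(op)
        new_op["uid"] = uid
        if groups:
            # Each operation gets its own copy of the list.
            new_op["groups"] = list(groups)
        stamped.append(new_op)
    return stamped


# Hypothetical packets: one with two groups, one with none.
with_groups = {
    "uid": "user-a",
    "groups": ["group-x", "group-y"],
    "requests": [{"operation": "add"}, {"operation": "remove"}],
}
without_groups = {"uid": "user-a", "requests": [{"operation": "add"}]}

print(apply_provenance(with_groups))
print(apply_provenance(without_groups))
```

Handling the groups once per packet keeps the client payload small and mirrors how a single uid already covers every operation in the set.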
@kltm did you start any Minerva work on this? I did a preliminary look through the code using the uid, so I think I see where to start. Don't want to duplicate efforts though.
@balhoff No, I did not. I started on it when I initially thought we'd add it from the client, but having it threaded in on the server side (as is uid) makes a lot more sense and would keep things significantly cleaner.
This is that "hats" concept that floats up occasionally (that I thought was already captured somewhere, but I have been unable to find).
While the ORCID-per-annotation model gets us a long way, individuals work for different entities over time, and sometimes at the same time, so we need a method that captures not only who did something but also what "hat" they were wearing when they did it.
The noctua bit of this, besides proper message passing, would be a dropdown of available hats for the logged-in user.
Somewhat related to #347, as the same kind of group information is a prerequisite. The group/funding information would likely need to be partially kept in users.yaml, with maybe an ontology or reference IDs for group/funding entities (TBD, and maybe an item for geneontology/go-site).