obophenotype / cell-ontology

An ontology of cell types
https://obophenotype.github.io/cell-ontology/
Creative Commons Attribution 4.0 International

Define design patterns using standard format #450

Open cmungall opened 8 years ago

cmungall commented 8 years ago

UPDATED Let's punt on this for now:

cmungall commented 5 years ago

This was auto-generated, can be done as a starting point:

https://github.com/cmungall/owl_patternizer/tree/master/examples/cl

This is probably best done after #533

cc @balhoff @matentzn @dosumis

nicolevasilevsky commented 5 years ago

I will work on this after #533 is closed.

cmungall commented 5 years ago

@addiehl needs to add new classes. I am recommending he just goes ahead and adds classes in Protege for now. But post any questions about patterns here.

Remember to work in PRs

cmungall commented 5 years ago

actually @nicolevasilevsky there is nothing to stop your working on a PR with the yaml for now

nicolevasilevsky commented 5 years ago

Ok! I'll go ahead and work on it then. :)

nicolevasilevsky commented 5 years ago

@cmungall I am doing PRs to your repo (see https://github.com/cmungall/owl_patternizer/pulls), should I move these new patterns over to this repo (CL)?

nicolevasilevsky commented 5 years ago

@cmungall I reviewed all the patterns in your repo and made some minor edits and did PRs.

Should I create additional patterns for CL, or will you do so via your auto-generated method (which is very cool!)?

cmungall commented 5 years ago

no, that folder is just for the output of the tool. Use as seed, copy to cell-ontology and edit there

tgbugs commented 5 years ago

Let me know if I can do anything to help with the fourth point. A bit more documentation can be found in https://github.com/SciCrunch/NIF-Ontology/blob/master/docs/Neurons.md and https://github.com/tgbugs/pyontutils/blob/master/docs/NeuronLangExample.ipynb.

cmungall commented 4 years ago

@nicolevasilevsky did we make any progress on this?

nicolevasilevsky commented 4 years ago

I haven't worked on this in a while - is it high priority?

Looks like Nico added the templates for the templates to this repo here: https://github.com/obophenotype/cell-ontology/tree/master/src/patterns/dosdp-pattern-workshop

And it looks like I got started on a couple patterns.

I can work on this further, if it is high priority, let me know.

cmungall commented 4 years ago

it looks like these came from owl_patternizer. I would apply judgment when copying these over.

nicolevasilevsky commented 4 years ago

yes, that is where they came from. Got it - I will review these.

Should these be moved out of the dosdp-pattern-workshop folder and into the patterns folder? I don't think we are planning another dosdp workshop at the moment.

cc @matentzn

matentzn commented 4 years ago

dosdp-pattern-workshop has nothing to do with any workshop :) It just means the patterns are works in progress and should not be used until finalised. So yes, judgement needs to be applied. We can finalise these in the workshop folder and then move them over to dosdp-patterns when they are ready to be reviewed. I can help!

cmungall commented 4 years ago

The majority of compositional classes outside the immune branch fall into a small number of very simple patterns, e.g. $cell-part-of-$uberon and $cell-capable-of-$biological_process. This should not be much work.
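To make the "$cell-part-of-$uberon" idea concrete, here is a minimal Python sketch of how such a pattern could fill a label and a logical definition from two variables. It only mimics the printf-style text/vars convention used in DOSDP YAML files; the pattern name, the dictionary layout, and the `fill_pattern` helper are assumptions made for illustration, not files or functions from this repo or from dosdp-tools.

```python
# Hypothetical, simplified stand-in for a DOSDP pattern. Real patterns are
# YAML files; the name and field layout below are assumptions for illustration.
CELL_PART_OF = {
    "pattern_name": "cellPartOfAnatomicalEntity",
    "vars": ["cell", "anatomical_entity"],
    "name": "%s of %s",
    "equivalentTo": "%s and ('part of' some %s)",
}


def fill_pattern(pattern, fillers):
    """Substitute one filler per declared variable into each text template."""
    values = tuple(fillers[var] for var in pattern["vars"])
    return {
        "label": pattern["name"] % values,
        "equivalentTo": pattern["equivalentTo"] % values,
    }


example = fill_pattern(
    CELL_PART_OF,
    {"cell": "epithelial cell", "anatomical_entity": "lung"},
)
print(example["label"])
print(example["equivalentTo"])
```

A $cell-capable-of-$biological_process pattern would differ only in the variable names and the relation used in the templates.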

nicolevasilevsky commented 4 years ago

sounds great, thanks @matentzn

cmungall commented 4 years ago

@nicolevasilevsky - are we using a standard label or GitHub project to track these?

Naming conventions. Looks like we are using camelCase with a leading lowercase. Let's keep doing this for now for consistency, but I suggest later renaming to use snake_case.
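If the pattern files are renamed later, the mapping between the two conventions is mechanical. The sketch below shows one way to convert names in both directions; the function names are made up for this example, and it assumes names contain no acronyms (a run of capitals such as "CNS" would be split letter by letter).

```python
import re


def camel_to_snake(name: str) -> str:
    """Insert an underscore before every capital (except at the start), then lowercase."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def snake_to_camel(name: str) -> str:
    """Keep the first word lowercase and capitalize each following word."""
    first, *rest = name.split("_")
    return first + "".join(word.capitalize() for word in rest)


print(camel_to_snake("abnormalCell"))
print(snake_to_camel("abnormal_cell"))
```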

nicolevasilevsky commented 4 years ago

No, we don't have a label or GitHub project, but I will create both.

uPheno uses camelCase with a leading lowercase (see examples here: https://github.com/obophenotype/upheno/tree/master/src/patterns/dosdp-dev). Personally, I think it would be nice to be consistent with uPheno.

I didn't know this was called snake_case!

cmungall commented 4 years ago

don't do both - redundancy is bad; it gets confusing and inconsistent as to which to use. You could discuss this on the call tomorrow. Happy with whatever system you all feel is most optimal.

Prefer to be consistent across projects for naming conventions. Personally I think the java style gets unreadable for the number of words we typically need. But whatever you and Nico think best.

On Tue, Sep 15, 2020 at 8:59 AM Nicole Vasilevsky notifications@github.com wrote:

No, we don't have a label or GitHub project, but I will create both.

uPheno uses camelCase with a leading lowercase (see examples here https://github.com/obophenotype/upheno/tree/master/src/patterns/dosdp-dev). Personally, I think it would be nice to be consistent with uPheno.

I didn't know this was called snake_case!

matentzn commented 4 years ago

Looking back now I would have preferred snake_case, but it is too much effort now. I will consider this as part of a big general review. Let's do camel case for now.

Why would you need a separate repo? Like a place that keeps track of all patterns anywhere? We should definitely use a standard tagging system. Is the purpose to identify all tickets and pull requests that relate to the definition/design of patterns? If so, I would suggest to use either pattern or dosdp. What do you think? Do we need anything more fine grained?

nicolevasilevsky commented 4 years ago

@matentzn should we move these patterns into a different folder called dosdp-patterns?

nicolevasilevsky commented 4 years ago

I created a label called pattern. We have a similarly named label in Mondo

matentzn commented 4 years ago

All actual patterns should be in the dosdp-patterns directory. Anything that is not (yet) explicitly intended to be used as a pattern should not. So yes! When you finalise a pattern, always move it to dosdp-patterns!

cmungall commented 4 years ago

No separate repo. I'm saying for ticket organization don't do BOTH labels AND projects

cmungall commented 4 years ago

agh, can we have one directory. Use PRs for non-final

nicolevasilevsky commented 4 years ago

I'm saying for ticket organization don't do BOTH labels AND projects

got it - we just have a label now. No project.

matentzn commented 4 years ago

Ok gotcha. Then I would favour labels over projects. Projects are more useful for larger, complex projects imo.

Not sure why you are so mean to this poor directory of patterns in progress 😄 But if you want to do it right then @nicolevasilevsky, delete all patterns in the in-progress dir; delete the in-progress dir and make draft pull requests for all of them (draft while in draft state, undraft when ready for review).

cmungall commented 4 years ago

don't mean to make busy work... if we already have files in both leave for now, let's just move gradually towards a PR system, or one where we have the metadata in the yaml

matentzn commented 4 years ago

Just kidding :) no problem!

nicolevasilevsky commented 4 years ago

I moved the patterns to PRs, please review: https://github.com/obophenotype/cell-ontology/pull/718

cmungall commented 4 years ago

Nicole, go ahead and merge. I made lots of comments on things that need fixing, but they are easier to do post-merge; it does no harm to have duff patterns in for now.

@matentzn - what should our strategy be for keeping the derived TSVs up to date - GH actions?

matentzn commented 4 years ago

the simplest thing to do would be to run the matching as part of the release, similar to DOSDP generate like this:

1) We create a new component for CL, components/dosdp-annotations.owl.
2) For the make goal generating that component, we run dosdp-query on all patterns over the ontology.
3) From the generated TSVs, we generate the tags, i.e. sets of annotations like:

<http://purl.obolibrary.org/obo/CL_000000> :dosdp-pattern <http://purl.obolibrary.org/obo/cl/patterns/abnormalCell.yaml>

These tags go into components/dosdp-annotations.owl, which is imported into cl-edit.owl. The normal CL release process then continues.

What do you think? Good enough?
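A rough Python sketch of the tag-generation step, turning one pattern's match TSV into N-Triples lines: it assumes the TSV has a `defined_class` column holding CURIEs such as `CL:0000000`, and it uses a placeholder IRI for the tagging property because the thread leaves that property undecided. The helper names and the sample row are illustrative only.

```python
import csv
import io

# Base IRI taken from the example pattern IRI in this thread.
PATTERN_BASE = "http://purl.obolibrary.org/obo/cl/patterns/"
# Placeholder only: the actual tagging property has not been chosen.
TAG_PROPERTY = "http://example.org/dosdp-pattern"


def expand_curie(curie: str) -> str:
    """Expand an OBO-style CURIE (e.g. CL:0000000) to a PURL; pass full IRIs through."""
    if curie.startswith("http"):
        return curie
    prefix, local = curie.split(":", 1)
    return f"http://purl.obolibrary.org/obo/{prefix}_{local}"


def tags_from_tsv(tsv_text: str, pattern_name: str) -> list:
    """Return one N-Triples line per matched class in the TSV."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    pattern_iri = f"{PATTERN_BASE}{pattern_name}.yaml"
    return [
        f"<{expand_curie(row['defined_class'])}> <{TAG_PROPERTY}> <{pattern_iri}> ."
        for row in reader
    ]


# Illustrative input, not real query output.
sample = "defined_class\tdefined_class_label\nCL:0000000\tcell\n"
for line in tags_from_tsv(sample, "abnormalCell"):
    print(line)
```

In a release pipeline, the resulting lines would be collected per pattern and converted into the component file by whatever tool the make goal already uses.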

cmungall commented 4 years ago

Great - let's discuss the modeling in another forum as not specific to CL

are there more standard properties we can use?

https://lov.linkeddata.es/dataset/lov/terms?q=implements

or maybe implement a proper vocabulary for templates?

balhoff commented 4 years ago

One thing I'm unsure about with this workflow is that terms will be annotated automatically with whether they conform to a pattern—but this will not help to know whether a term was specifically intended to implement a pattern, and whether it now does or doesn't.

matentzn commented 4 years ago

Yeah @balhoff i agree! These should be separate tags!

cmungall commented 4 years ago

or provenance on assertion? This is just a standard asserted/inferred distinction as we have on any other axiom.

matentzn commented 4 years ago

There are two use cases:

  1. provenance: these should go on the generated statements themselves as axiom annotations.
    1. pattern conformance (this is what Chris refers to as inferred, but the word does not seem right; I prefer conforms, like dc:conformsTo)
    2. pattern generated. This explicitly states that a statement was generated from a pattern (which implies conformance), for that I am liking: https://www.w3.org/ns/prov#wasGeneratedBy
  2. quality control: here I want to be able to quickly do some sparql checks over terms that are tagged in some way with a meta category. For example, I want to say: terms conforming to pattern A should not be a subclass of terms conforming to pattern B. I do not want to deal with annotation assertions to achieve this. I want to use a very general tag here as well; I may use DOSDP to assert this tag, or a simple SPARQL match/update. I want to use this to warn the editor of a term right away that the term conforms to some "category of things", which they can review. I want to combine tags into complex QC queries. For this, I could use anything; including rdfs:comment; or a bespoke OMO (IAO) property like qcTag or something along these lines.
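
The example rule in the quality control item ("terms conforming to pattern A should not be a subclass of terms conforming to pattern B") can be illustrated without SPARQL. The sketch below works on plain in-memory dictionaries; the data structures, function names, and the `EX:` identifiers are hypothetical and only meant to show the shape of the check.

```python
def ancestors(term, parents):
    """Collect all transitive superclasses of a term from a child -> parents map."""
    seen = set()
    stack = list(parents.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def find_violations(conforms, parents, child_pattern, parent_pattern):
    """Return (term, ancestor) pairs where a term tagged with child_pattern
    sits below an ancestor tagged with parent_pattern."""
    violations = []
    for term in sorted(conforms):
        if child_pattern not in conforms[term]:
            continue
        for ancestor in sorted(ancestors(term, parents)):
            if parent_pattern in conforms.get(ancestor, ()):
                violations.append((term, ancestor))
    return violations


# Hypothetical tags and hierarchy: EX:1 is below EX:4, which is below EX:2.
conforms = {"EX:1": {"patternA"}, "EX:2": {"patternB"}, "EX:3": {"patternA"}}
parents = {"EX:1": ["EX:4"], "EX:4": ["EX:2"], "EX:3": []}
print(find_violations(conforms, parents, "patternA", "patternB"))
```

The same rule over the ontology itself would need the reasoned class hierarchy rather than only asserted parents.
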

cmungall commented 4 years ago

On Fri, Sep 18, 2020 at 3:56 AM Nico Matentzoglu notifications@github.com wrote:

There are two use cases:

  1. provenance: these should go on the generated statements themselves as axiom annotations.
    1. pattern conformance (this is what Chris refers to as inferred, but the word does not seem right; I prefer conforms, like dc:conformsTo https://www.dublincore.org/specifications/dublin-core/dcmi-terms/terms/conformsTo/ )
    2. pattern generated. This explicitly states that a statement was generated from a pattern (which implies conformance), for that I am liking: https://www.w3.org/ns/prov#wasGeneratedBy

OK good, I guess there are 3 possibilities: INTEND, CONFORMS, and GENERATED.

These are not mutually exclusive. I have many cases of INTEND-only, and need to work through these so that they become INTEND+CONFORMS.

Usually GENERATED implies INTEND

All the above can apply at the axiom level. This can be very useful. E.g. if a synonym conforms to DP X yet is intended for DP Y this is a smell something has gotten out of sync.
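
As a sketch of how these statuses could drive a check, the function below compares the set of patterns an axiom is intended to follow with the set it actually conforms to. It flags INTEND-only cases and the "conforms to DP X yet intended for DP Y" smell. The flag strings and the function name are invented for illustration; no such vocabulary has been agreed in this thread.

```python
def review_flags(intended, conforms):
    """Compare intended vs. conforming pattern sets for one axiom.

    Returns a list of human-readable flags; an empty list means nothing to review.
    """
    flags = []
    # INTEND-only: the axiom is meant to follow a pattern but does not match it yet.
    for pattern in sorted(intended - conforms):
        flags.append(f"intend-only:{pattern}")
    # Smell: the axiom matches a pattern other than the one(s) it was intended for.
    if intended:
        for pattern in sorted(conforms - intended):
            flags.append(f"unintended-conformance:{pattern}")
    return flags


# A synonym intended for pattern Y that currently matches pattern X instead.
print(review_flags(intended={"Y"}, conforms={"X"}))
# INTEND+CONFORMS: nothing to review.
print(review_flags(intended={"Y"}, conforms={"Y"}))
```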

  2. quality control: here I want to be able to quickly do some sparql checks over terms that are tagged in some way with a meta category. For example, I want to say: terms conforming to pattern A should not be a subclass of terms conforming to pattern B.

good, yes, we need more of these

  I do not want to deal with annotation assertions to achieve this.

But this could be useful, e.g. for INTENDS; can it not be pluralistic?

  I want to use a very general tag here as well; I may use DOSDP to assert this tag, or a simple SPARQL match. I want to use this to warn the editor of a term right away that the term conforms to some "category of things", which they can review. I want to combine tags into complex QC queries. For this, I could use anything; including rdfs:comment; or a bespoke OMO (IAO) property like qcTag or something along these lines.

matentzn commented 4 years ago

Yes, that sounds like a great way to distinguish stuff. I guess we are all not too worried about bloating our ontologies with this kind of stuff - I am not.

Some further thoughts from a discussion with @dosumis

cmungall commented 4 years ago

ok cool

yes, stuff can always be filtered e.g in basic releases

aside: looks like we'll want axiom on rdfs:label, we need to solve the .obo problem generally

I think it's still useful to know at the term level that this term was created by a dp. Maybe just the usual created_by tags are fine? Can also be thought of as axioms on the declaration, which we haven't done before; that could get odd though.

On Sat, Sep 19, 2020 at 7:04 AM Nico Matentzoglu notifications@github.com wrote:

Yes that sounds a great way to distinguish stuff. I guess we are all not too worried about bloating our ontologies with this kind of stuff - I am not.

Some further thoughts from a discussion with @dosumis https://github.com/dosumis

  • Do we need a date stamp to indicate when the match was run?
  • Should this 3 level tagging (intendedToconform, conforms, generatedBy) be on axiom level only or also term level? For my purposes, I think axiom level is enough, and I can use something more lightweight on term level to simplify my QC checks.

cmungall commented 2 years ago

I regenerated:

https://github.com/cmungall/owl_patternizer/tree/master/examples/cl

This fixes the bug where uberon/go etc terms were included in the generalization

see above for caveats. These are autogenerated and to be used as seeds. Use judgment

shawntanzk commented 2 years ago

Given discussions here: https://docs.google.com/document/d/1XvMbNvr0FEsdqGhg79BYCYEHSqUxRHMcvhbGizEAht8/edit#bookmark=id.699u5qrobewr the tech group will put this on the backburner. Please place it back on the tech board when there has been more discussion on how best to implement this.