OBOFoundry / OBOFoundry.github.io

Metadata and website for the Open Bio Ontologies Foundry Ontology Registry
http://obofoundry.org

Add "date added" field to ontologies #1967

Closed cthoyt closed 1 year ago

cthoyt commented 2 years ago

This would be pretty difficult to get for old ontologies, so maybe we have to add a dummy value, but it would be very valuable to have the date added on all new ontologies so that we could apply new, higher-quality standards to them in a reproducible, programmatic way. This would require a few things:

  1. Add a new field to the JSON schema that can either be a date, or some pre-defined flag
  2. Auto-populate existing ontologies with some kind of flag

matentzn commented 2 years ago

Full support.

matentzn commented 2 years ago

Why not pick an arbitrary start date, like the first commit of the md file? Fine by me!

cthoyt commented 2 years ago

Yes, that's a great idea. If I were more of a git ninja I would probably have suggested that, but you know I'd rather come with solutions than ideas ;)

matentzn commented 2 years ago

🥷

lschriml commented 2 years ago

As we all used SourceForge to begin with, a date from that system would be more precise.

matentzn commented 2 years ago

@lschriml that is impossible to reconstruct now, though. I think the real point, or at least the one I care about, is that we can specify QC checks as mandatory moving forward, without having to wait for old ontologies to implement them. We can have higher standards for new ontologies :)

cthoyt commented 2 years ago

I figured it out, demonstration coming soon

matentzn commented 2 years ago

During yesterday's OFOC call we decided to go with the "obo_library_since:" solution, which reflects when an ontology was admitted into the OBO Library, rather than using the github_added suggestion proposed by @cthoyt in #1969. However, there was no agreement on how to represent unknown values. Due to our use case, we cannot simply omit the tag from the markdown file, so we need to have a vote on the representation of an unknown join date.

Vote: default date for ontologies where the date of joining OBO foundry is unknown

  • 👍 obo_library_since: 1001-01-01
  • 🎉 obo_library_since: (empty value)
  • 👀 obo_library_since: NA

This vote is only about the missing value. Where we have this information, we use xsd:date format: 2022-02-01. Voting closes 20th July.
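
To make the three candidate representations concrete, here is a minimal Python sketch of how a checker might read the field. It is not the OBO Foundry validator: the function name `parse_obo_library_since` and the constants are hypothetical, and because the outcome of the vote is not known here, the sketch accepts all three candidate spellings of "unknown" rather than picking a winner.

```python
import re
from datetime import date
from typing import Optional

# Candidate spellings for "unknown", taken from the vote options above.
# Which one wins is not decided here, so this sketch accepts all three.
UNKNOWN_SENTINEL = date(1001, 1, 1)
UNKNOWN_STRINGS = {"", "NA"}

# Plain YYYY-MM-DD, the form used in the example 2022-02-01.
_DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def parse_obo_library_since(raw: Optional[str]) -> Optional[date]:
    """Return the join date, or None when the value means "unknown".

    Raises ValueError for anything that is neither a YYYY-MM-DD date
    nor one of the candidate "unknown" markers.
    """
    text = (raw or "").strip()
    if text in UNKNOWN_STRINGS:
        return None
    if not _DATE_PATTERN.fullmatch(text):
        raise ValueError(f"not a YYYY-MM-DD date: {text!r}")
    parsed = date.fromisoformat(text)  # also rejects dates like 2022-02-30
    if parsed == UNKNOWN_SENTINEL:
        return None
    return parsed
```

With a sentinel date, every value passes through the same date parser; with an empty value or NA, the checker needs the extra string branch shown above.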

hoganwr commented 2 years ago

[image: image.png]

matentzn commented 2 years ago

@hoganwr you will have to come to GitHub UI to cast your vote, sorry. Email voting does not work!

hoganwr commented 2 years ago

Sorry, still learning. That was not clear at all. Sorry I missed yesterday's meeting.

cthoyt commented 2 years ago

I'm sorry to see that the original purpose of this issue has been lost - the problem we need to solve is to tag each ontology with a date when it was added to OBO so we can enforce stricter automated checks differentially on new ones (since old ones will be slow to adopt new standards, if ever)

The open question is one of semantics: do we use "date added to this git repository", which can easily be accomplished technically (as https://github.com/OBOFoundry/OBOFoundry.github.io/pull/1969 demonstrates), or "when was this accepted into OBO", which is super difficult because this wasn't tracked and there were previous repositories that got migrated into git? Both solutions would help solve the actual problem I want to solve. Again, the first solution is technically easy to accomplish, but it seems to be difficult to communicate. The second solution is effectively intractable, since I don't think anyone wants to do this hard work.

The 100% unacceptable solution would be to make it possible to have a blank field or to make it possible to write NA, since 1) it means we can't validate the contents of this field in a consistent way and 2) new ontology submitters can reverse engineer this and therefore skirt the checks. I guess this means I will vote to put a sentinel value, perhaps the unix epoch, 0000-00-00, or something like that...
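
A minimal Python sketch of the differential check described here, under stated assumptions: the sentinel is the Unix epoch (one of the values floated above), and the cutoff date and the names `LEGACY_SENTINEL`, `STRICT_CHECKS_FROM`, and `needs_strict_checks` are hypothetical. The epoch is used rather than 0000-00-00 because the latter is not a valid calendar date and Python's `date` type cannot represent it.

```python
from datetime import date

# Assumption: legacy ontologies with an unknown join date carry the Unix
# epoch as a sentinel. It is a real date, so it validates like any other.
LEGACY_SENTINEL = date(1970, 1, 1)

# Hypothetical cutoff; the thread does not fix an actual date.
STRICT_CHECKS_FROM = date(2022, 8, 1)


def needs_strict_checks(obo_library_since: date) -> bool:
    """True when an ontology joined on or after the cutoff.

    The sentinel sorts before any realistic cutoff, so legacy entries
    are exempt without needing a special case in the comparison.
    """
    return obo_library_since >= STRICT_CHECKS_FROM


# Illustrative records only; the identifiers are made up.
ontologies = {
    "legacy_example": LEGACY_SENTINEL,
    "new_example": date(2023, 1, 15),
}
strict = sorted(name for name, since in ontologies.items() if needs_strict_checks(since))
```

Because every entry holds a parseable date, the field can be validated uniformly, and a new submission cannot opt out by leaving it blank.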

matentzn commented 2 years ago

You can:

  • add a publication with an actionable ID that relates to something entirely unrelated to your ontology
  • add an "evidence of usage" that looks correct at first glance but makes no sense
  • add a fake ORCID that resolves to someone else just because you are too lazy to register for your own
  • add a website that relates to a different project
  • ...

There are certainly dozens of ways you can trick CI into believing your metadata is right. That is why there is a human review process, and a human reviewer will not allow an empty field or NA to be specified. I think you are overthinking this.

hoganwr commented 2 years ago

This is about date added and the vote, which explicitly allows you to vote for an empty value in the date field as one option, and NA as the value for the date field as another option.

This one's not about actionable identifiers.

matentzn commented 2 years ago

Yeah, I am illustrating the 100 ways you can fool the automated schema checker into thinking your metadata is correct, and suggesting that @cthoyt worries a bit too much here.

hoganwr commented 2 years ago

The vote is about what the automated checker will allow, or I am misunderstanding things greatly.

matentzn commented 2 years ago

No, that's right! It's about how we should represent "missing values" in a way that the automated checker will ignore them.

cthoyt commented 1 year ago

I'm happy with not doing this, now that we have better ways of applying stricter standards to new ontologies. Thanks everyone for the discussion, and sorry to the people on the OFOC call who spent time on this. I assume that nobody else will pick up the very difficult work of getting the technical side right, but I invite anyone who is interested to revive this issue.

matentzn commented 1 year ago

I am kinda sorry to see this issue go, but given our priorities, probably the right call. We can always come back to this, and use "first committed" timestamps to approximate "date joined"! Thanks @cthoyt
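
For anyone who does revive this, the "first committed" approximation could be sketched in Python roughly as follows. This is one possible approach and not necessarily what #1969 does: the helper names are hypothetical, and the example path `ontology/go.md` is an assumption about the repository layout. It shells out to `git log` with `--follow` and `--diff-filter=A` to find the commits that added the file, then takes the oldest author date.

```python
import subprocess
from datetime import date
from typing import Optional


def earliest_date(git_log_output: str) -> Optional[date]:
    """Return the oldest date in `git log --format=%aI` output, if any."""
    stamps = [line.strip() for line in git_log_output.splitlines() if line.strip()]
    if not stamps:
        return None
    # %aI prints strict ISO 8601; the first ten characters are YYYY-MM-DD.
    return min(date.fromisoformat(stamp[:10]) for stamp in stamps)


def first_commit_date(path: str, repo: str = ".") -> Optional[date]:
    """Approximate "date joined" by the date a file was first added to git."""
    result = subprocess.run(
        ["git", "log", "--follow", "--diff-filter=A", "--format=%aI", "--", path],
        cwd=repo,
        capture_output=True,
        text=True,
        check=True,
    )
    return earliest_date(result.stdout)


# Hypothetical usage, run from a clone of the repository:
# first_commit_date("ontology/go.md")
```

As noted earlier in the thread, this only measures when the file entered this git repository, so ontologies migrated from earlier systems would get the migration date rather than their true join date.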