tdwg / vocab

Vocabulary Maintenance Specification Task Group + SDS + VMS

Discussion #16

Closed baskaufs closed 7 years ago

baskaufs commented 9 years ago

This is a fake issue so that there is a way to message the "watchers" of Vocab Github Repo (i.e. the "members" of the Vocabulary Management Task Group).

I have finished going through the VoMaG Report and creating issues that I think are relevant to writing the standards. Here is what I envision happening:

  1. We research and discuss the issues that need to be settled before the standards can be written. If any of you have experience or knowledge about a particular issue, please assign that issue to yourself. In the end, we want some kind of recommendation on how to close each issue - hopefully based on some kind of consensus or clear best practice from elsewhere. If we discover new issues along the way, we add them.
  2. We start with the existing draft Standards Documentation Specification. We modify the things that we don't like and add things that are missing. We end up with a new draft Standards Documentation Standard.
  3. We start with the existing Darwin Core Term Change Policy from the Namespace Policy. We modify the things that aren't currently working and add anything that is missing. We end up with a draft Vocabulary Maintenance Standard.
  4. Discuss and revise drafts. When they are clean, we initiate the review process and go from there.

I'm going to drop off the radar next week for holiday, and am going to try to not do much email. When I'm back online, I'll check on whether anything has moved forward. I will then continue pushing as necessary to try to keep us on track for the timeline we suggested in the charter.

You can comment on this issue if you want to discuss anything general about the process. If you want to discuss a particular issue, add your comment to that particular issue.

Thanks for your interest in the work of the Vocab Task Group and I'm looking forward to your contributions! Steve

baskaufs commented 9 years ago

The core members of the VOCAB Task Group will be holding a Google Hangout at 21:00 UTC on Wednesday, July 15. Other VOCAB Task Group members and interested parties are welcome to join. To obtain the link to join the hangout and to view some notes that have been prepared in advance of the meeting, see the Google Doc: https://docs.google.com/document/d/1GEDlVAHpvFj4RuiwATSy5oIxz5JpgBC84yDyGrZvEB0/edit?usp=sharing . If you have any questions, please contact steve.baskauf@vanderbilt.edu

baskaufs commented 9 years ago

Just a reminder that the VOCAB Task Group Google Hangout is in a little over 24 hours from now. Some notes are at: https://docs.google.com/document/d/1GEDlVAHpvFj4RuiwATSy5oIxz5JpgBC84yDyGrZvEB0/edit?usp=sharing . See that doc for local times and the hangout link.

Since I previously commented here, I've worked on two models: a hierarchy model for vocabularies and a version model for any kind of resource. I've been working on fleshing them out at https://github.com/tdwg/vocab/blob/master/hierarchy-model.md and https://github.com/tdwg/vocab/blob/master/version-model.md . I've hand-hacked an RDF/Turtle example document at https://github.com/tdwg/vocab/blob/master/code-examples/ontology-vocabulary.ttl and have run the competency question SPARQL queries on the RDF triples contained in that document to make sure that they actually work. Steve

baskaufs commented 8 years ago

The core members of the VOCAB Task Group will be holding a Google Hangout at 21:00 UTC on Wednesday, 2016-05-04. Other VOCAB Task Group members and interested parties are welcome to join the call. The purpose of the meeting is to discuss progress and future directions for the draft Standards Documentation Specification.

To join the hangout, use this link https://hangouts.google.com/call/rm6yosa3afgttcnzdxx373tuoqe . I have not yet posted the agenda, but when I do, I will create another comment here with the link. If you have any questions, please contact steve.baskauf@vanderbilt.edu, or post a comment on the Discussion issue #16 (https://github.com/tdwg/vocab/issues/16)

baskaufs commented 8 years ago

For reference purposes, the VOCAB meeting time at 21:00 UTC May 4 corresponds to these times:

US Pacific Daylight (UTC -7): 14:00, May 4
US Central Daylight (UTC -5): 16:00, May 4
US Eastern Daylight (UTC -4): 17:00, May 4
Argentina (UTC -3): 18:00, May 4
Central European Summer Time (UTC +2): 23:00, May 4
Australia Eastern Standard Time (UTC +10): 07:00, May 5
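The conversions listed above can be double-checked with a few lines of Python. This is only a minimal sketch: it assumes the fixed UTC offsets given in the comment rather than looking them up in a time zone database, and the `local_times` helper and the zone labels are illustrative names.

```python
from datetime import datetime, timedelta, timezone

# Meeting time from the comment above: 21:00 UTC on 2016-05-04.
MEETING_UTC = datetime(2016, 5, 4, 21, 0, tzinfo=timezone.utc)

# Fixed offsets (hours from UTC) as listed in the comment; assumed, not looked up.
OFFSETS = {
    "US Pacific Daylight": -7,
    "US Central Daylight": -5,
    "US Eastern Daylight": -4,
    "Argentina": -3,
    "Central European Summer Time": 2,
    "Australia Eastern Standard Time": 10,
}


def local_times(meeting, offsets):
    """Return {zone label: 'YYYY-MM-DD HH:MM'} for a timezone-aware meeting time."""
    result = {}
    for label, hours in offsets.items():
        local = meeting.astimezone(timezone(timedelta(hours=hours)))
        result[label] = local.strftime("%Y-%m-%d %H:%M")
    return result


for label, text in local_times(MEETING_UTC, OFFSETS).items():
    print(f"{label}: {text}")
```

Fixed offsets keep the sketch free of any time zone data dependency, at the cost of not handling daylight saving transitions automatically.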

baskaufs commented 8 years ago

The agenda for the 21:00 UTC May 4 VOCAB Task Group Google Hangout is at https://docs.google.com/document/d/1xqNfnAa05My4pqH6ym3sSlasN0aa764LxOo193UWpng/edit?usp=sharing

baskaufs commented 8 years ago

I forgot to note here that I've created documents that serialize an RDF graph for Darwin Core that is consistent with the version and hierarchy model described in the draft documentation specification. Those documents are in the directory https://github.com/tdwg/vocab/tree/master/code-examples/darwin-core . I've created a list of use cases and SPARQL queries that satisfy them at https://github.com/tdwg/vocab/blob/master/documentation-use-cases.md . The triples are loaded into a triplestore, so you can try the queries out yourself or hack them to do other things. The point of this exercise was to show that it was actually possible to implement the recommendations for machine-readable versions of vocabularies that are as large and complex as Darwin Core.

baskaufs commented 8 years ago

OK, here is a follow-up to the May 4 VOCAB Task Group hangout.

The notes from the meeting (as annotations to the agenda) are posted at: https://github.com/tdwg/vocab/blob/master/meeting-notes/meeting-agenda-notes-2016-05-04.pdf

The draft of the Documentation Specification is still at https://github.com/tdwg/vocab/blob/master/documentation-specification.md and it will continue to evolve as edits are made. However, it is somewhat closer to a form suitable for submission because I've edited it as necessary in order to close most of the issues that were blocking completion of a draft for submission. Currently there are four issues blocking completion of the draft (see Issue #27 for the issues that are blocking at any given moment):

#19 regarding use of RFC 2119. Joel Sachs volunteered to look into this and get back to us.

#32 and #38 which are related to each other and regard licensing policies. Stan Blum volunteered to check with several people about this.

#37 regarding acknowledging contributors. We didn't discuss this issue in the call. However, I'm thinking that it may be out of scope to specify in the documentation specification who should be acknowledged as contributors. The purpose of the spec is to say how contributors should be acknowledged, but deciding who should be acknowledged is really a policy decision that doesn't have to be written into a standard. I think it would be appropriate to take this issue to the Executive for consideration, but I'm thinking that from the standpoint of developing this standard I may just close the issue.

There is one more section that I thought about which should be written: explaining how to link "semantic layers" of vocabularies that would extend the basic vocabularies. I know what I want to write and just need to do it.

In summary, assuming we can close the remaining issues in a timely fashion, I don't see any major impediments to wrapping up the draft and setting the next phase of the ratification process into motion. Hopefully that can be accomplished by the end of June at the latest. I'm going to attempt to start writing the Vocabulary Management specification soon, but no promises on how long that will take.

tucotuco commented 8 years ago

This has been in my inbox too long. I concur with closing issue #37, though I too feel that review managers should be cited for their contributions.

(Sent by email in reply to Steve Baskauf's follow-up comment of Thu, May 12, 2016, 12:46 PM.)

baskaufs commented 8 years ago

Greetings to stalwart "watchers" of the VOCAB Task Group Issues Tracker, who have just endured another flurry of notifications associated with me closing a number of issues related to the Vocabulary Maintenance Specification. As you have probably gathered, I've finished writing a first draft of the specification, which can be viewed at https://github.com/tdwg/vocab/blob/master/maintenance-specification.md . I have approximately 1.5 months to focus on VOCAB TG matters before I get consumed by real "work" again as the fall semester starts. So I would like to fast-track revisions to this document and the already-completed Standards Documentation Specification in order to get the expert review phase of the Standards Process started as soon as possible.

I would appreciate any comments about either draft from VOCAB TG general members. Limited comments can be posted on the Issues Tracker as a response to this post, or sent directly to me in the form of an email. If anyone wishes to make more extensive comments, contact me and I'll email you the edit link to a Google Doc version of the draft where you can hack away and insert comments at will.

I'll summarize some general points about the draft Vocabulary Maintenance Specification that might be helpful before you read through the draft:

  1. To the maximum extent possible, I've tried to codify existing practices that we've found to work based on our experience with previous changes to Darwin Core. In places where things haven't worked, I've tried to incorporate suggestions from the VoMaG report or solutions that we worked out in previous TG core member hangouts.
  2. The draft clarifies that the primary responsibility for maintaining a vocabulary standard falls to an Interest Group chartered specifically for that purpose. This is in line with the vision of the TDWG Process document, which says that Task Groups go away when their tasks are done (something that hasn't happened with the Darwin Core Task Group) and that Interest Groups maintain created products. It is also a departure from the Darwin Core Namespace Policy, which states that the Technical Architecture Group (TAG) bears a major responsibility in managing the vocabulary change process. The TAG was initially envisioned as an overarching group that would maintain technical compatibility between TDWG standards, but not one that bore the brunt of vocabulary maintenance.
  3. The draft begins by articulating general goals for TDWG vocabularies (facilitating data sharing, reusability, stability, and persistence) and uses those principles to inform the specific details of decision-making and process.
  4. The draft provides a detailed description of the term change process, then describes modifications to that process that are required for changes to documents, and addition of vocabulary enhancements.
  5. The draft introduces a new requirement modeled on the IETF and W3C implementation experience requirement for a standard to be fully ratified. Section 4 of the draft requires an implementation report that documents experience with proposed vocabulary enhancements (such as application profiles or "semantic layers") built on the basic "bag of terms". Several Task Group members expressed support for this requirement, so I added it to the draft. However, there is no precedent for this requirement in TDWG, so I'm keen to get feedback from the Task Group about that part of the draft, particularly from anyone who has experience with implementation testing. I may remove Section 4 from the draft before submission if there is insufficient support to move forward with that requirement.

I would greatly appreciate any comments that you can provide in the next week. At the end of June, I'll try to evaluate the response to determine if it is necessary to hold another live Google Hangout to hash out unresolved issues, if more time is needed for review, or if the two drafts are sufficiently mature to request the appointment of a review manager to move the drafts on to the next stage of the standards adoption process.

ansell commented 8 years ago

I have some experience with implementation testing for the W3C SPARQL/RDF/JSONLD specifications with the OpenRDF Sesame and JSONLD-Java implementations. That testing was based heavily on published testsuites:

https://github.com/w3c/rdf-tests

https://github.com/json-ld/json-ld.org/tree/master/test-suite/tests

Testsuites are most often used for operation results or transformations between formats, but they can also be useful for testing contract variation for APIs. Each vocabulary is effectively an API and hence could have contract tests executed to verify that changes maintain existing required behaviour.

At another level, simply testing that the machine readable versions of vocabularies are all syntactically correct after changes can be valuable if it is automated, say using TravisCI. I have done some work with that using the sesame-vocab-builder project in the past:

https://github.com/tkurz/sesame-vocab-builder
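The "vocabulary as an API" idea can be illustrated with a small contract check. This is a minimal sketch under stated assumptions: each vocabulary version is represented as a plain mapping from term IRI to a status string, and the assumed contract is that a term may be deprecated but not silently removed. The `check_contract` function, the status values, and the example IRIs are hypothetical and are not part of any TDWG or Sesame tooling.

```python
def check_contract(old_terms, new_terms):
    """Return a list of contract violations between two vocabulary versions.

    Both arguments map term IRI -> status ("recommended" or "deprecated").
    Assumed contract: every term in the old version must still be present
    in the new version (it may be deprecated, but not silently removed).
    """
    violations = []
    for iri in sorted(old_terms):
        if iri not in new_terms:
            violations.append(f"removed without deprecation: {iri}")
    return violations


# Hypothetical example data, for illustration only.
old = {
    "http://example.org/terms/recordedBy": "recommended",
    "http://example.org/terms/collector": "recommended",
}
new_ok = {
    "http://example.org/terms/recordedBy": "recommended",
    "http://example.org/terms/collector": "deprecated",
    "http://example.org/terms/recordedByID": "recommended",
}
new_bad = {"http://example.org/terms/recordedBy": "recommended"}

print(check_contract(old, new_ok))   # compatible change
print(check_contract(old, new_bad))  # a term was dropped
```

A check like this could run in continuous integration alongside a syntax check of the machine-readable files, so that a breaking change is flagged before it is merged.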

baskaufs commented 8 years ago

Peter, thanks for the feedback. Based on your experience, do you think that the wording of Section 4 and its subsections adequately describes what should be included in an implementation report?

An even broader question is whether it is a good idea to require an implementation report, and if so, under what circumstances? Both the IETF and W3C have a requirement for implementation experience built into their standards processes: https://www.w3.org/2015/Process-20150901/#rec-pr and https://tools.ietf.org/html/rfc2026#section-4.1.2 . In multiple previous discussions of proposed changes to Darwin Core, the need for specifying use cases has come up. So some sort of clarification about how a proposed change would solve some problem and some sort of evidence that the proposed change would actually "work" seems desirable. But should the requirement for an implementation report extend even to changes in individual terms, or just to broader changes such as those that I've tried to describe in Section 4?

ansell commented 8 years ago

There may be some confusion in the current structure and terminology used for Section 4 due to the use of the word Implementation Report to denote an officially created document and not a user-submitted test report as it does at the W3C.

It may be better to relabel Section 4 "User Feedback Report", or something clearer than that, and relocate the procedures that users need to follow to submit their test reports (compliance reports may be a clearer term than test reports) into a separate section. That may make it simpler to distinguish the steps for users to give feedback from the procedures that the TDWG vocabulary editors will follow to compile the aggregated report on how the change affects current implementations.

If the specification is intended to be a living document (à la HTML5, for example), then each substantial change, possibly including anything except minor clarifications/changes to match existing practice, should have a user feedback report attached to it. However, to require that, the process has to be as simple as possible for both users and editors to complete or they will not have the time available.

In the case of the W3C with the RDF specs, having the test specifications available in a git repository as machine readable and executable documents makes it simple for implementors to pick up new changes and rerun their testsuites to either verify they still comply, or get an indication of which areas they are failing in. The format for the test manifests is still fairly ad-hoc and could be improved on, but the general concept of a machine readable manifest that contains executable test cases is valuable in my experience:

http://www.w3.org/2001/sw/DataAccess/tests/test-manifest#

The user test reports for the W3C tests are also submitted as machine readable documents, using the EARL/DOAP/FOAF vocabularies so they can be aggregated by automated tools to generate implementation reports, which then could be stored near the manifests, so users could examine what other systems achieved when implementing/testing their own versions, as not every report will indicate 100% compliance.

https://www.w3.org/TR/EARL10-Schema/

https://www.w3.org/ns/earl

http://usefulinc.com/ns/doap#

http://xmlns.com/foaf/0.1/
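The aggregation step described above can be sketched in a few lines. This is a simplified illustration, not the W3C tooling: it assumes each submitted report has already been parsed into a plain mapping from test identifier to an outcome string, instead of being read from EARL documents. The `aggregate_reports` function, the implementation names, and the test identifiers are hypothetical.

```python
from collections import defaultdict


def aggregate_reports(reports):
    """Combine per-implementation results into {test id: {outcome: [implementations]}}.

    `reports` maps an implementation name to {test id: outcome}, where outcome
    is a simplified string such as "passed" or "failed".
    """
    summary = defaultdict(lambda: defaultdict(list))
    for implementation, results in sorted(reports.items()):
        for test_id, outcome in results.items():
            summary[test_id][outcome].append(implementation)
    # Convert to plain dicts so the result is easy to print or serialize.
    return {test: dict(outcomes) for test, outcomes in summary.items()}


# Hypothetical reports from two implementations.
reports = {
    "impl-a": {"test-001": "passed", "test-002": "failed"},
    "impl-b": {"test-001": "passed", "test-002": "passed"},
}
summary = aggregate_reports(reports)
print(summary["test-002"])
```

Grouping by test rather than by implementation makes it easy to see at a glance which features not every implementation has managed to support.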

A change request should also be allowed to deprecate any existing tests, which can be flagged in the manifest without removing the test cases, so the deprecated tests can be made not to fail a testsuite after an update. The details about why the test case is being deprecated would be included along with the change request goals/aims.
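One way to picture the deprecation flag is a manifest in which deprecated tests stay listed but are skipped when the suite result is computed. The sketch below assumes a manifest reduced to a plain dictionary; the `evaluate_suite` function, the `deprecated` key, and the test identifiers are hypothetical and do not reflect the actual W3C manifest format.

```python
def evaluate_suite(manifest, results):
    """Return (suite_passed, failures), ignoring tests flagged as deprecated.

    `manifest` maps test id -> {"deprecated": bool}; `results` maps
    test id -> True (passed) or False (failed). A missing result counts as failed.
    """
    failures = []
    for test_id, entry in sorted(manifest.items()):
        if entry.get("deprecated", False):
            continue  # still listed in the manifest, but cannot fail the suite
        if not results.get(test_id, False):
            failures.append(test_id)
    return (not failures, failures)


manifest = {
    "test-001": {"deprecated": False},
    "test-002": {"deprecated": True},  # superseded by a change request
    "test-003": {"deprecated": False},
}
results = {"test-001": True, "test-002": False, "test-003": True}
print(evaluate_suite(manifest, results))
```

Here `test-002` fails, but because it is flagged as deprecated the suite as a whole still passes.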

In my experience, it isn't valuable for long term specifications to allow performance figures to be included in compliance reports, as they may distract from the core information about whether the system actually includes the feature or not and they are not easy to formalise. However, if you feel it is necessary, there could be a comments section for severe regression cases where users could give feedback about features that couldn't be implemented efficiently.

It isn't strictly necessary for a change request to just refer to a single feature change, but it definitely helps. It is simpler when deciding to approve changes and deciding when to ask for user feedback if the smallest relevant logical unit is the entire subject of a change request. If the suggestions are so broad that they refer to a large number of terms and a large number of existing test cases, then the suggestion is likely too big and may need splitting up. The smallest relevant logical unit doesn't need to be a single vocabulary term, and probably doesn't need to be exactly specified as it is a subjective concept just used to explain to users why their large set of changes needs to be split up to be reviewed effectively.

Overall, the goal of user feedback reports is to reduce the frequency of negative responses that arrive after the fact but could have been identified before a change was approved, not to prevent all possible negative responses in future, and the current text seems to be aligned with that goal.

baskaufs commented 8 years ago

I have set up a Doodle Poll to schedule a final (?) Google Hangout for the Task Group sometime between July 22 and July 27. The goal of the meeting is to either recommend moving the draft specifications to review or to figure out what else needs to be done (presuming that I complete revisions by that time). If you have an interest in participating in the meeting, please indicate your preference. The poll link is http://doodle.com/poll/2yhe7crtiykheitx

baskaufs commented 8 years ago

I have closed all of the remaining issues blocking the completion of the draft specifications and completed proofreading and revisions. We have scheduled a time for a Google Hangout for the purpose of deciding whether to recommend advancing the drafts to the review manager stage. The hangout will be Monday, July 25 at 21:00 UTC, which should correspond to:

US Pacific Daylight 14:00, July 25
US Central Daylight 16:00, July 25
US Eastern Daylight 17:00, July 25
Argentina 18:00, July 25
some of Europe 23:00, July 25
Australia Eastern Standard Time 07:00, July 26

Any interested parties are encouraged to participate. I'll publish a link to the Hangout via this thread closer to the time of the meeting, but save the date. You can view the clean drafts at

https://github.com/tdwg/vocab/blob/master/documentation-specification.md

and

https://github.com/tdwg/vocab/blob/master/maintenance-specification.md

If you want to provide feedback on the documents, you can do it on this thread if the comments are succinct. Otherwise, email me and I'll send you the editing links for Google Doc versions of the drafts where you can insert comments and suggest edits. Contact me at steve.baskauf@vanderbilt.edu

If you want to suggest changes, please do so well in advance of the meeting. I'm hoping to have clean, revised versions of the documents several days before the Hangout so that people can look at the "final" versions the day before meeting.

baskaufs commented 7 years ago

This is another reminder of the VOCAB Task Group Google Hangout on Monday, July 25 at 21:00 UTC. The link to the hangout will be https://plus.google.com/hangouts/_/calendar/M2JkMzB0dmlxYXNnYmRvYm9kMWk3MnRybG9AZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ.h69vsakbgk2viomrl2fuq9pk0g?authuser=0 . If you want, email me and I can add you to the Google Calendar event so that you can access the video call from the event.

So far, I haven't gotten any comments or suggested edits on the draft documents at

https://github.com/tdwg/vocab/blob/master/documentation-specification.md

and

https://github.com/tdwg/vocab/blob/master/maintenance-specification.md

so at this point I'm going to consider them clean copies for consideration at the meeting. I will begin the meeting by highlighting what I believe to be the important features of the specifications, then open the floor for discussion. Following discussion, we can consider whether there is consensus of the Task Group to submit the drafts to the executive for appointment of a review manager who will initiate the expert reviews. If there is not a consensus to advance the drafts, we will need to enter issues in the tracker to specify what needs to be resolved before the Task Group is ready to recommend advancing the drafts. As a reminder, the meeting time corresponds to these local times:

US Pacific Daylight 14:00, July 25
US Central Daylight 16:00, July 25
US Eastern Daylight 17:00, July 25
Argentina 18:00, July 25
some of Europe 23:00, July 25
Australia Eastern Standard Time 07:00, July 26

Steve steve.baskauf@vanderbilt.edu

xjsachs commented 7 years ago

Below are some comments, questions, and minor suggestions on the Standards Documentation Specification.

1.1 Audience "This document is intended primarily for those who are writing TDWG standards and vocabularies."

Question: This is an inclusive "and", right? Every TDWG vocabulary is part of a TDWG standard?

1.3 Examples in this document "Use of Turtle does not imply that it is a preferred serialization, ... Other serializations such as RDF/XML, RDFa, and JSON-LD might also be used to represent the same information."

Suggestion: Amend to "Other serializations such as RDF/XML, RDFa, JSON-LD, or non-RDF serializations might also be used to represent the same information." This would be explicitly in harmony with Section 4, which states that "The relationships described in this section MAY be expressed as Resource Description Framework (RDF), but that is not to the exclusion of other methods that might be available for expressing the same relationships in a manner that also facilitates machine processing."

1.4 Definitions "distribution - a specific available form of a dataset. In the context of this standard, distributions are available forms of vocabulary term lists, such as downloadable files in RDF/XML or RDF/Turtle. [DCAT]"

Suggestion: Add a couple of human readable formats to the list of examples, e.g. "...such as downloadable files in HTML, markdown, RDF/XML, or RDF/Turtle."

"metadata scheme - a vocabulary used to make assertions about individuals (sensu OWL [OWL-OVERVIEW])."

Question: Would it be equally accurate to say "a vocabulary used to make assertions about resources (see definition of resource below)." ? If so, I suggest changing.

"resource - Any kind of thing that can be identified. Resources can include documents, people, physical objects, and abstract concepts [RDF-PRIMER]."

Suggestion: Add "In TDWG contexts, common resources include observations, specimens, samples, organisms, and taxon concepts".

"user - ... A user can also be a semantic client ..."

Suggestion: Replace the adjective "semantic", with "software" or "machine".

"vocabulary - a collection of standardized terms and their definitions. Terms MAY represent classes, properties, or concepts."

Suggestion: Add "TDWG vocabularies include both metadata schemes and controlled vocabularies."

2.1 Abstract resources and representations "We can consider each of these particular resources as an abstract entity that manifests itself in one or more than concrete representations."

Typo: extra "than". Should be "one or more" or "one or more than one".

2.2.2 Descriptive documents "When the IRI of the descriptive document is dereferenced by machine clients, the client SHOULD retrieve a machine-readable description of the document's metadata."

Suggestion: Change "the client SHOULD retrieve" to "the client SHOULD be served".

3.1.2 IRI of the standard "The landing page SHOULD include the HTTP IRI that identifies the standard."

Question: Why not "MUST"?

3.3.1 Landing page for the vocabulary "As such, the vocabulary will have an IRI that is distinct from the standard's IRI."

Question: Is this only when a standard defines multiple vocabularies? Or is it also the case that if a standard comprises a single vocabulary, the vocabulary requires its own landing page? Is it okay for all the vocabularies of a standard to share a landing page, as is the case with Audubon Core - http://terms.tdwg.org/wiki/Audubon_Core_Structure ? If so, maybe we should explicitly say so.

3.3.3 Term list documents "Each vocabulary will have at least one term list that contains terms that are defined by the standard that contains it."

Question: Is it okay for all vocabularies of a standard to share the same term list? (e.g. http://terms.tdwg.org/wiki/Audubon_Core_Term_List)

baskaufs commented 7 years ago

Thanks Joel for the proofread and suggestions. Response to Joel's comments/questions:

1.1 There might be TDWG vocabularies that aren't part of a standard, but this specification wouldn't apply to them. I changed "and" to "including" in order to clarify.

1.3 Although JSON-LD can be used to serialize RDF, JSON-LD extends RDF, so there can be JSON-LD that isn't valid RDF. I couldn't think of any other current standard that's not strictly RDF that could be used to represent the relationships expressed in the specification other than JSON-LD, so that is the only one I mentioned in the example. But I think it's fine to clarify as Joel suggested, so I made the change.

1.4 (distributions) - made the recommended change.

(metadata scheme) - I wrote this definition after studying the [NISO] and [ISO-25964-2] references. Unfortunately, I had to return the ISO 25964 documents that I'd gotten through Interlibrary Loan, so I can't look at them again now. The gist that I got from reading these documents is that metadata schemes were used to define properties of class instances, or to define the classes of which the resources were instances. In RDF terms, the subjects of triples containing metadata scheme terms would be instances of classes and NOT classes themselves. "Individuals (sensu OWL)" seemed to me to be a clearer and more succinct way to say this than "class instances". Because "resources" could include classes as well as instances, "resources" seemed to me to be a broader term than "Individuals", which is why I didn't use "resources". Perhaps I'm making an artificial distinction here, but in the typical use of Darwin and Dublin Cores (the primary examples of metadata schemes that I had in mind) the subjects of triples, or the subjects of rows in a table, are usually instances of books, specimens, people, etc. and NOT typically the class of books, the class of specimens, the class of people, etc. If I'm off base on my thinking here, this might merit further discussion or a more authoritative definition for metadata scheme (if we can find one).

(user) - changed "semantic client" to "machine client" throughout document.

(vocabularies) - added "TDWG vocabularies can include both metadata schemes and controlled vocabularies." (currently there are no controlled vocabularies that are standards).

2.1 corrected typo

2.2.2 made suggested change.

3.1.2 When I decided that it was appropriate to apply RFC 2119 to this specification, there were many places where I wasn't sure whether to use SHOULD or MUST. My take on "SHOULD" was that implementers should only fail to do what is recommended if they had a really good reason to ignore the recommendation. There were a lot of places where I couldn't think of reasons why a person would ignore the recommendation, but where nothing would really be "broken" if they did ignore the recommendation. In many of those instances I used SHOULD instead of MUST. It's possible that perhaps we should be taking a harder line and use MUST a lot more. I'd be happy to change the example Joel mentioned to "MUST", but if we are taking a harder line, then there probably are a lot of other places that should also be changed. We could talk about this during the hangout. It's possible that we need a second opinion on many of the SHOULD/MUST options.

3.3.1 I think the thing that matters relative to Joel's question about this section is that dereferencing of term IRIs works correctly, and that all of the pieces of a vocabulary and a standard get linked together in a machine-readable way. Typically, a vocabulary term has the property rdfs:isDefinedBy and in this model, the value is the IRI of the term list rather than of the vocabulary itself. That's because in a vocabulary like Darwin Core, there are several term namespaces: dwc:, dwciri:, and dwcattributes:. None of those namespace IRIs by itself represents the "Darwin Core vocabulary" because the DwC vocabulary contains terms from all of those namespaces, plus terms from the dcterms: namespace, which is defined outside of TDWG. So the DwC vocabulary would need to have an IRI that is distinct from any of the namespace IRIs. That DwC vocabulary IRI would also have to be different from the IRI of the Darwin Core Standard, because the Darwin Core standard includes explanatory documents (like the text, XML, and RDF guides) that are distinct from the DwC vocabulary. All of these resources would be linked by dcterms:isPartOf relationships so that a machine could discover all of them and figure out how they are linked. The out-of-date diagram at https://raw.githubusercontent.com/tdwg/vocab/master/tdwg-standards-hierarchy-2016-02-15.png shows the overall structure.
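To illustrate how a machine could follow these links, here is a minimal sketch that walks from a term to its term list via rdfs:isDefinedBy and then upward via dcterms:isPartOf. It represents triples as plain Python tuples rather than using an RDF library, assumes each resource has at most one parent, and uses made-up `example.org` IRIs; the `containment_chain` helper is hypothetical.

```python
RDFS_IS_DEFINED_BY = "http://www.w3.org/2000/01/rdf-schema#isDefinedBy"
DCTERMS_IS_PART_OF = "http://purl.org/dc/terms/isPartOf"

# Hypothetical IRIs for illustration; real TDWG IRIs may differ.
TRIPLES = [
    ("http://example.org/terms/recordedBy", RDFS_IS_DEFINED_BY,
     "http://example.org/termlist/basic"),
    ("http://example.org/termlist/basic", DCTERMS_IS_PART_OF,
     "http://example.org/vocabulary"),
    ("http://example.org/vocabulary", DCTERMS_IS_PART_OF,
     "http://example.org/standard"),
]


def containment_chain(term_iri, triples):
    """Follow rdfs:isDefinedBy once, then dcterms:isPartOf links, from a term upward."""
    links = {(s, p): o for s, p, o in triples}
    chain = [term_iri]
    current = links.get((term_iri, RDFS_IS_DEFINED_BY))
    seen = set()
    while current is not None and current not in seen:  # guard against cycles
        chain.append(current)
        seen.add(current)
        current = links.get((current, DCTERMS_IS_PART_OF))
    return chain


for iri in containment_chain("http://example.org/terms/recordedBy", TRIPLES):
    print(iri)
```

The chain runs term, term list, vocabulary, standard, which is the reason each level needs its own distinct IRI.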

Whether the various resources (standard, vocabulary, term lists) dereference to separate documents would be an implementation choice. One could in theory use a bunch of different fragment identifiers to make the IRIs different and have them all dereference to the same document. The documentation specification doesn't prescribe any particular recipe; it just says that the IRIs have to be different.

3.3.3 Again, whether a single document is served or several depends on the choice of the vocabulary creators. In the case of Audubon Core, dereferencing the AC terms would require that a page corresponding to the IRI http://rs.tdwg.org/ac/terms/ (i.e. the namespace IRI associated with the abbreviation ac:) be served. That would be the IRI for the term list that defines the AC-minted terms. However, one could use an IRI for the term list of the borrowed terms like: http://rs.tdwg.org/ac/terms#borrowed, and then rig the server to serve the same document. In that case, all of the AC terms could be on a single page as they are now. The documentation specification doesn't say how you have to set up the IRIs, just that there have to be distinct IRIs for each term list that is defined.
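To make the fragment identifier idea concrete, here is a minimal Python sketch using the standard library's `urllib.parse.urldefrag`. It relies on the general fact that a client drops the fragment before making an HTTP request; the helper name `request_target` is made up for the example.

```python
from urllib.parse import urldefrag

AC_TERM_LIST = "http://rs.tdwg.org/ac/terms/"
BORROWED_TERM_LIST = "http://rs.tdwg.org/ac/terms#borrowed"


def request_target(iri):
    """Return the part of the IRI a client actually requests (everything before '#')."""
    return urldefrag(iri).url


if __name__ == "__main__":
    # The two term list IRIs are distinct identifiers...
    print(AC_TERM_LIST != BORROWED_TERM_LIST)
    # ...but the fragment never reaches the server.
    print(request_target(BORROWED_TERM_LIST))
    print(request_target(AC_TERM_LIST))
```

Note that the two request targets still differ by a trailing slash, which is why the server would need to be set up to return the same page for both.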

baskaufs commented 7 years ago

Hmm. I thought I had pushed the version with the changes Joel suggested, but apparently not. The changes are now in the latest version. I also updated the hierarchy model diagram at https://github.com/tdwg/vocab/blob/master/hierarchy-model.md so that it would be consistent with the examples in the text of the Standards Documentation Specification.

jar398 commented 7 years ago

Sorry for being last minute, just got back from vacation. I think almost all of what I flag below is either a copy edit or a request for clarification. I.e. nothing substantive. With one or two exceptions... Jonathan

Reviewing https://github.com/tdwg/vocab/blob/master/maintenance-specification.md

1.1 'members of Interest Groups that' - should be 'who'

1.3 'that specify that which is necessary to comply with the standard' awk. Maybe 'that specify what is necessary in order to comply with the standard' (or 'what must be done in order to comply', etc.)

would be good to have in-line hyperlink to IRI spec

2.1 'a vocabulary maintenance Interest Group cannot be disbanded unless the vocabulary it maintains is deprecated' seems draconian. IETF has lots of orphan specs and people use them (e.g. RFC 2119). They are what they are and maybe they don't need active maintenance. If they need a revision, a new group of authors can be convened to make a new spec.

2.3 'occur in a timely fashion' not defined but is implicit in what follows? So either redundant, in which case it need not be said, or not, in which case it should be explained how 'in a timely fashion' is different from 'reviewed annually'.

IG can convene a TG, but EC must disband it? That's odd. Or maybe an IG cannot convene a TG?

3.2.2 capitalize 'in the case of equivocal substantive errors,'

3.3.2 'can be refined' - who can refine it? submitted? as explained in next sentence? A bit confusing. 'An official change' - what makes a change 'official'?

maybe change 'providing closure' to 'obtaining closure'

3.3.3 'the decision should be reported' seems odd in conjunction with 'no decision will be recorded in the decision'. Maybe that's what you mean, in which case leave it, but it seems weird to have a decision history that does not record a decision.

3.3.4.1 So individual terms are versioned, not vocabularies, right? Does that mean each term is its own Standards Document (with respect to the process)?

I think maybe you mean that term documentation is part of some vocabulary specification, and that the vocab spec has to be modified according to DOC-SPEC? But which level is invoked at each point in the process is pretty muddy to me (on a first reading).

3.4.3 'changes to normative content is' number agreement

Worried about burden this process puts on EC, but not sure what to do about it. Maybe EC will rubber-stamp most submissions.

4.2 first use of the phrase 'change proposal'. - explain?

4.1, 4.3 'place restrictions' is a one-sided and negative-sounding view; every restriction on the part of the sender/generator corresponds to an opportunity, or freedom, on the part of the receiver/interpreter (they know something they wouldn't have known in the absence of the 'restriction'). Not sure what to suggest instead; maybe 'constraint' is a bit friendlier than 'restriction'.

On Sun, Jul 24, 2016 at 5:39 PM, Steve Baskauf notifications@github.com wrote:

Hmm. I thought I had pushed the version with the changes Joel suggested, but apparently not. The changes are now in the latest version. I also updated the hierarchy model diagram at https://github.com/tdwg/vocab/blob/master/hierarchy-model.md so that it would be consistent with the examples in the text of the Standards Documentation Specification.


baskaufs commented 7 years ago

This is a reminder of the VOCAB Task Group Google Hangout that will occur in approximately 1.5 hours. Any interested parties are invited to attend - the meeting is not restricted to core members. The hangout link is https://hangouts.google.com/call/zulz5dj3wjf4pjzcwczddczpr4e and the link given on the Google Calendar event leads to the same hangout.

I will be briefly going over what I think are the important points of the two specifications. The points are listed in this Google Doc: https://docs.google.com/document/d/1gRnsnQGB4kvEyeb6TAfyneuhWd_jtG95XGuQNKAgwF0/edit?usp=sharing As I said previously, the point of the Hangout is to determine whether there is a consensus to move the drafts forward to review. So if you have an opinion, please join us and express it.

I have enabled editing for those with the link so that anyone attending the Hangout can annotate it. After the Hangout is over, I'll turn editing off so that it can't get spammed.

baskaufs commented 7 years ago

Thanks, Jonathan!

Given the nearness of the TG Hangout, I'm not going to mess with the drafts right now in response to Jonathan's comments. I'll make the typo edits after the Hangout. I'll try to respond to the more substantive comments/questions here and if more discussion is necessary, Jonathan can bring up those points during the Hangout.

2.1 'a vocabulary maintenance Interest Group cannot be disbanded unless the vocabulary it maintains is deprecated' seems draconian. IETF has lots of orphan specs and people use them (e.g. RFC 2119). They are what they are and maybe they don't need active maintenance. If they need a revision, a new group of authors can be convened to make a new spec.

Maybe we should talk about this one. The problem that TDWG has had in the past was half-finished or unmaintained work, e.g. the TDWG Ontology, the Standards Documentation Specification, NCD draft standard. My thinking was that if a vocabulary was a "living" thing, then somebody should tend it and if a vocabulary (or standard in general) was dead, then people should be told that by a deprecation notification. I wasn't really thinking about the case described here. Perhaps there needs to be some status between "actively maintained" and "deprecated" (i.e. not recommended for use).

2.3 'occur in a timely fashion' not defined but is implicit in what follows? So either redundant, in which case it need not be said, or not, in which case it should be explained how 'in a timely fashion' is different from 'reviewed annually'.

I think drop the "in a timely fashion" part.

IG can convene a TG, but EC must disband it? That's odd. Or maybe an IG cannot convene a TG?

I tried to write this based on the existing TDWG Process document: http://www.tdwg.org/about-tdwg/process/ . Conveners of parent Interest Groups submit TG charters, but the Executive Committee approves the charter and announces the formation of the TG. The only avenue in the Process document for getting rid of a TG is upon the rejection of a TG report by the Executive. This may not make sense, but I tried to write the specification to be consistent with the Process document, since changing the process would require changing the TDWG Process and I think that would require an Executive decision and maybe even a vote of the membership. That was a can of worms I didn't want to open, since it would pretty much guarantee that the specification would never get finished.

3.3.2 'can be refined' - who can refine it? submitted? as explained in next sentence? A bit confusing. 'An official change' - what makes a change 'official'?

This is mostly a cut-and-paste from the existing DwC Namespace policy. Suggested changes in wording would be welcome. The "official change" probably should be changed to "changes in the official proposal" or something like that.

3.3.3 'the decision should be reported' seems odd in conjunction with 'no decision will be recorded in the decision'. Maybe that's what you mean, in which case leave it, but it seems weird to have a decision history that does not record a decision.

In the third case, the proposal is not resolved and is sent back to the public review stage. Ultimately, there would be either acceptance or final rejection that would be recorded. I guess it wouldn't hurt to record a decision in this third case. I think that decision http://rs.tdwg.org/dwc/terms/history/decisions/#Decision-2011-10-16_3 is something like what Jonathan describes. We could talk about this one.

3.3.4.1 So individual terms are versioned, not vocabularies, right? Does that mean each term is its own Standards Document (with respect to the process)?

I think maybe you mean that term documentation is part of some vocabulary specification, and that the vocab spec has to be modified according to DOC-SPEC? But which level is invoked at each point in the process is pretty muddy to me (on a first reading).

Based on the precedent of DwC, individual terms are versioned. They aren't individual documents, but part of term lists according to the hierarchy model. Vocabularies can also be versioned in accordance with the general version model in the specification. However, the conditions under which changing to a new version of a term would trigger a change in the version of the vocabulary containing the term are not clear to me. We talked about this in Issue 40, but my take on that was that it was too restrictive on implementers for the specification to dictate how the versioning of vocabularies was managed. It's possible that for efficiency's sake, new releases/versions of the vocabulary might not be released if it was felt that additional changes were likely to be approved soon. If this is too unclear to move forward, then perhaps a re-write of the section would be in order before submission. However, I'm not 100% sure I'd know how to write it at this point.

4.2 first use of the phrase 'change proposal'. - explain?

"change proposal" wasn't intended as a technical term here, just a proposal for a change that was submitted as a single issue in the tracker.

4.1, 4.3 'place restrictions' is a one-sided and negative-sounding view; every restriction on the part of the sender/generator corresponds to an opportunity, or freedom, on the part of the receiver/interpreter (they know something they wouldn't have known in the absence of the 'restriction'). Not sure what to suggest instead; maybe 'constraint' is a bit friendlier than 'restriction'.

This could easily be changed to "constraints" if that sounds better. I was thinking of OWL cardinality restrictions, which apparently are also called cardinality constraints. I wasn't really thinking about there being much of a difference between the two. The other thing I had in mind was specifying datatypes in an application profile, e.g. specifying that dcterms:modified should be datatyped as xsd:dateTime or something like that. It's more restrictive on the data provider/generator - I wasn't thinking about it in terms of the receiver/consumer.
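For what it's worth, the datatype example can be sketched from the consumer's side in Python. This is only a rough approximation of the xsd:dateTime lexical form (it ignores negative years, the 24:00:00 end-of-day form, and time zone range checks), and the function name is made up for illustration. The idea is that a consumer who knows dcterms:modified is constrained to xsd:dateTime can run a check like this rather than guessing at the format.

```python
import re
from datetime import datetime

# Rough approximation of the xsd:dateTime lexical form:
# YYYY-MM-DDThh:mm:ss with optional fractional seconds and optional time zone.
_XSD_DATETIME = re.compile(
    r"^(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})(\.\d+)?(Z|[+-]\d{2}:\d{2})?$"
)


def looks_like_xsd_datetime(value):
    """Return True if value roughly matches xsd:dateTime and names a real date and time."""
    match = _XSD_DATETIME.match(value)
    if match is None:
        return False
    year, month, day, hour, minute, second = (int(g) for g in match.groups()[:6])
    try:
        # Let datetime reject impossible values such as month 13 or February 30.
        datetime(year, month, day, hour, minute, second)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    for candidate in ["2016-07-24T17:39:00Z", "2016-07-24", "2016-13-01T00:00:00"]:
        print(candidate, looks_like_xsd_datetime(candidate))
```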

baskaufs commented 7 years ago

OK, I've made changes to the Vocabulary Maintenance Specification based on Jonathan's comments. I won't say more about the cosmetic ones or the ones where I changed wording to improve clarity - you can just look at the draft.

The main issue that I had to think harder about was the one about section 2.1. It does seem like a vocabulary could achieve a level of stability where it no longer needed to be maintained. I re-wrote parts of that section to say that the Executive could disband the IG if the vocabulary seemed to be stable and didn't need maintenance. I also looked at the Status designations for TDWG Standards. There is a category called "Retired Standard" and I referenced that as a reason why an IG might be disbanded. You can see what I wrote and reply here if you have comments.

I ended up not making any changes as a result of the comment about 3.3.4.1. I went back to the DwC Namespace Policy document to see what it said. It talked about version changes for terms and I just carried that language over into the spec - it didn't seem to confuse people in the DwC Namespace Policy, so it probably won't confuse people here. I could have talked more about the hierarchy model, but decided that referencing the Documentation spec was enough. People who want to understand versioning and the place of terms within the vocabulary hierarchy can find out by referring to the Doc spec. There is also the possibility that the Documentation spec will be changed and if a lot of its details are referenced in the Maintenance spec, the Maintenance spec would have to be changed as well.

If anyone wants to suggest any additional edits, please do so by Friday, July 29. You can put your comments in this thread or raise a new issue in the tracker. I am also going to generate a Google Doc version that can be edited/commented on because some of the Hangout attendees said they wanted that. To avoid spamming, I'm not going to post the edit link here, but if anyone who wasn't in the Hangout wants to edit, email me and I'll send you the link.

baskaufs commented 7 years ago

This is a reminder that if you want to suggest minor edits to the Vocabulary Maintenance and Standards Documentation specifications, please do so by the end of the day today (Friday, July 29).

xjsachs commented 7 years ago

On the question of the definition of "metadata scheme" (section 1.4), Steve wrote: "Perhaps I'm making an artificial distinction here, but in the typical use of Darwin and Dublin Cores (the primary examples of metadata scheme that I had in mind) the subject of triples, or the subjects of rows in a table are usually instances of books, specimens, people, etc. and NOT typically the class of books, the class of specimens, the class of people, etc"

I was thinking of taxa, which are often modelled as classes, and which might be the subject of standardized descriptions.

baskaufs commented 7 years ago

It's true that there are kinds of resources where it could make sense to model a resource as either a class or an instance of a class. In the existing document, the guidelines for defining terms are based on existing examples that we have within TDWG (primarily in Darwin Core) where the data are modeled as a large number of instances. It isn't clear to me how one would write guidelines for documenting vocabularies that include many classes rather than many instances. Obviously people do it (e.g. writing OWL EL-type ontologies), but we don't really have examples in the form of current TDWG standards to go on. Somebody may need to work on that in a future revision or separate standard.

baskaufs commented 7 years ago

I am going to draft a letter to the Executive requesting the appointment of a review editor for the two specifications. I won't send it for a day or two in case someone finds some really serious problem with the draft specifications. Otherwise I'm considering them finalized before submission. Please contact me if there are any critical issues that I've missed.

tucotuco commented 7 years ago

You have my full support to submit them, and celebrate.

On Mon, Aug 1, 2016 at 1:15 AM, Steve Baskauf notifications@github.com wrote:

I am going to draft a letter to the Executive requesting the appointment of a review editor for the two specifications. I won't send it for a day or two in case someone finds some really serious problem with the draft specifications. Otherwise I'm considering them finalized before submission. Please contact me if there are any critical issues that I've missed.


baskaufs commented 7 years ago

Drafts have been submitted for consideration of appointment of a review manager. If a review manager is appointed, he or she will be running the show from here on out, so I'm closing this issue. Thanks for all of the input and help!