COMCIFS / cif_core

The IUCr CIF core dictionary

Ideas for contributing guidelines #142

Open vaitkus opened 5 years ago

vaitkus commented 5 years ago

This issue is somewhat related to issue #141; however, it tackles the same problems from the contributor's point of view. Here are several suggestions that could be included in the GitHub contributing guidelines for the repo.

Question 1: how should changes made to the dictionaries be described?

There are several mechanisms in DDLm that are used to track and register the dictionary change history:

  1. The DICTIONARY_AUDIT loop registers the new dictionary version number and the date of the version release, and provides a short human-readable description of the changes that have been applied.
  2. The _dictionary.date data item contains the date of the last change applied to the dictionary. In stable dictionary releases this date should match the date of the latest version release in the DICTIONARY_AUDIT loop.
  3. The _definition.update data item is provided in each definition and contains the date of the last change applied to this individual entry.
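As a rough illustration of the three mechanisms, here is a minimal Python sketch that reads them from a dictionary file and checks the "dates should match" rule for stable releases. The sample dictionary (EXAMPLE_DIC, _example.item) is hypothetical, and so are the function names. The sketch uses plain regular expressions rather than a real CIF parser, and it assumes that each DICTIONARY_AUDIT row starts with the version and the date on one line and that rows are listed in chronological order.

```python
import re

DATE = r"\d{4}-\d{2}-\d{2}"

# Hypothetical, heavily simplified DDLm dictionary used only for illustration.
SAMPLE = """\
data_EXAMPLE_DIC
    _dictionary.title      EXAMPLE_DIC
    _dictionary.version    1.0.1
    _dictionary.date       2020-03-16

save_example.item
    _definition.id         '_example.item'
    _definition.update     2020-03-10
save_

    loop_
      _dictionary_audit.version
      _dictionary_audit.date
      _dictionary_audit.revision
         1.0.0  2019-11-02
;
    Initial release.
;
         1.0.1  2020-03-16
;
    Clarified the description of _example.item.
;
"""


def dictionary_date(text):
    """Return the value of _dictionary.date, or None if it is absent."""
    m = re.search(rf"^[ \t]*_dictionary\.date[ \t]+({DATE})[ \t]*$", text, re.MULTILINE)
    return m.group(1) if m else None


def audit_entries(text):
    """Return (version, date) pairs from the DICTIONARY_AUDIT loop.

    Assumption: every loop row begins with 'version date' on a single line.
    """
    return re.findall(rf"^[ \t]*(\d+\.\d+\.\d+)[ \t]+({DATE})[ \t]*$", text, re.MULTILINE)


def definition_updates(text):
    """Return all _definition.update values in file order."""
    return re.findall(rf"^[ \t]*_definition\.update[ \t]+({DATE})[ \t]*$", text, re.MULTILINE)


def release_dates_match(text):
    """True if _dictionary.date equals the date of the last audit row."""
    entries = audit_entries(text)
    return bool(entries) and dictionary_date(text) == entries[-1][1]


if __name__ == "__main__":
    print(dictionary_date(SAMPLE), audit_entries(SAMPLE), release_dates_match(SAMPLE))
```

A real check would need a proper CIF2/DDLm parser, since revision text or quoted values could contain lines that these patterns misread.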

When presenting a PR, the contributor should apply the following changes:

  1. Any non-trivial change to a definition must be registered using the _definition.update data item by changing its value to the current date.
  2. Any non-trivial change to a dictionary must be registered using the _dictionary.date data item by changing its value to the current date.
  3. Any non-trivial change to a dictionary must be registered using the DICTIONARY_AUDIT data loop. The registration is carried out by modifying the latest loop entry rather than adding a new loop entry. The latest _dictionary_audit.date data item value should be changed to the current date and a short description of the changes should be appended to the latest value of the _dictionary_audit.revision data item. The version number must remain unchanged.
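The three steps could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not an existing tool: the function name touch_dates and the sample dictionary are hypothetical, the latest DICTIONARY_AUDIT row is assumed to be the last one with 'version date' on one line, and its revision is assumed to be a semicolon-delimited text field.

```python
import re
from datetime import date

DATE = r"\d{4}-\d{2}-\d{2}"

# Hypothetical, heavily simplified DDLm dictionary used only for illustration.
SAMPLE = """\
data_EXAMPLE_DIC
    _dictionary.version    1.0.1
    _dictionary.date       2020-03-16

save_example.item
    _definition.id         '_example.item'
    _definition.update     2020-03-10
save_

save_example.other
    _definition.id         '_example.other'
    _definition.update     2019-11-02
save_

    loop_
      _dictionary_audit.version
      _dictionary_audit.date
      _dictionary_audit.revision
         1.0.0  2019-11-02
;
    Initial release.
;
         1.0.1  2020-03-16
;
    Clarified the description of _example.item.
;
"""


def touch_dates(text, definition_id, note, today=None):
    """Register a non-trivial change to one definition (steps 1 to 3)."""
    today = today or date.today().isoformat()

    # Step 1: update _definition.update of the changed definition only.
    pattern = (rf"(_definition\.id[ \t]+'?{re.escape(definition_id)}'?[ \t]*\n"
               rf"(?:.*\n)*?[ \t]*_definition\.update[ \t]+)\S+")
    text = re.sub(pattern, lambda m: m.group(1) + today, text, count=1)

    # Step 2: update _dictionary.date.
    text = re.sub(r"^([ \t]*_dictionary\.date[ \t]+)\S+",
                  lambda m: m.group(1) + today, text, count=1, flags=re.MULTILINE)

    # Step 3: modify the latest audit row in place; the version is untouched.
    rows = list(re.finditer(rf"^([ \t]*\d+\.\d+\.\d+[ \t]+){DATE}", text, re.MULTILINE))
    if rows:
        last = rows[-1]
        text = text[:last.start()] + last.group(1) + today + text[last.end():]
        row_end = last.start() + len(last.group(1)) + len(today)
        # Locate the opening and closing ';' lines of the revision text field
        # and append the note just before the closing one.
        open_pos = text.index("\n;", row_end)
        close_pos = text.index("\n;", open_pos + 2)
        text = text[:close_pos] + "\n    " + note + text[close_pos:]
    return text


if __name__ == "__main__":
    print(touch_dates(SAMPLE, "_example.item", "Fixed a typo in _example.item."))
```

Note that no new loop entry is added and the version string is never rewritten, which mirrors the requirement in step 3.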

NOTE: if the proposals from issue #141 are accepted, the latest _dictionary_audit.revision value would simply contain the CHANGELOG of the developmental dictionary version.

jamesrhester commented 5 years ago

Regarding point (3), I think it is unnecessary for each contribution to change _dictionary_audit.revision. Assuming #141 is accepted, I would prefer that changes to _dictionary_audit.revision occur when a new version is released. The GitHub history will contain all changes, and a nice summary can be created from it when the new version is released. Until that time, we can put placeholder text in the final entry to say "refer to the git history".

vaitkus commented 5 years ago

I had two main arguments in mind when suggesting point (3), which were mainly inspired by the recommendations for keeping a CHANGELOG of one's software:

  1. It is easier to describe the changes immediately than after an undefined period of time. I have personally experienced that with software projects it is much easier to maintain a running CHANGELOG as the changes happen than to comb through the logs just prior to a release. Factors like people forgetting things, people not being available for additional comments and the sheer volume of log messages to go through make sure of that. Of course, a dictionary is not a piece of software, but I'd imagine that the same principles would apply.
  2. An explicit change list makes it clear what changes are to be expected. This is somewhat similar to argument (1), but from the perspective of a potential user/tester of the unreleased dictionary version. Imagine a person trying out an unreleased dictionary and discovering some changes that seem extremely strange. The person might have enough knowledge to identify the potential problem, but might not have enough skill/will/time to go through the logs and check if the change was intentional. Having all of the important changes in a single revision message seems like a more useful alternative. And if at any point the revision message becomes too cumbersome, it can always be trimmed down since the dictionary is formally still in development.

However, I do understand that there might be some specifics of managing an ontology (low commit count, low change rate, etc.) that turn constant log keeping into more of a chore than a benefit.

jamesrhester commented 3 years ago

Current practice seems to be developing towards documenting all non-trivial changes in the dictionary_audit log, as @vaitkus suggested.

I have some plans to create a proper procedure in COMCIFS for document creation, with a technical committee for technical procedures such as this one. Meanwhile, if @vaitkus would like to draft a document covering the above in comcifs.github.io/draft, that would be very helpful. I note that GitHub Actions such as the one proposed in #228 could help contributors find overlooked problems, such as dates that were not updated relative to the master branch or a dictionary_audit loop that was not updated. Of course, even if such checks failed, they could be ignored for trivial updates.
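A check of this kind could look roughly like the Python sketch below. It is not the action proposed in #228; the function names and sample text are hypothetical. It compares the dictionary text from the master branch (base) with the text from the PR (head), however those are obtained, and it assumes flat (non-nested) save frames and a DICTIONARY_AUDIT loop that runs to the end of the file.

```python
import re

# Hypothetical, heavily simplified DDLm dictionary used only for illustration.
BASE = """\
data_EXAMPLE_DIC
    _dictionary.version    1.0.1
    _dictionary.date       2020-03-16

save_example.item
    _definition.id         '_example.item'
    _definition.update     2020-03-10
    _description.text      'An example item.'
save_

    loop_
      _dictionary_audit.version
      _dictionary_audit.date
      _dictionary_audit.revision
         1.0.1  2020-03-16
;
    Initial release.
;
"""


def dictionary_date(text):
    m = re.search(r"^[ \t]*_dictionary\.date[ \t]+(\S+)", text, re.MULTILINE)
    return m.group(1) if m else None


def update_date(frame_body):
    m = re.search(r"_definition\.update[ \t]+(\S+)", frame_body)
    return m.group(1) if m else None


def save_frames(text):
    """Map save frame names to their bodies (assumes frames are not nested)."""
    frames = re.findall(r"^[ \t]*save_(\S+)[ \t]*\n(.*?)^[ \t]*save_[ \t]*$",
                        text, re.MULTILINE | re.DOTALL)
    return dict(frames)


def audit_block(text):
    """Everything from the first _dictionary_audit tag to the end of the file."""
    idx = text.find("_dictionary_audit.")
    return text[idx:] if idx >= 0 else ""


def check_pr(base, head):
    """Return a list of human-readable warnings; empty means nothing was missed."""
    problems = []
    if base == head:
        return problems
    if dictionary_date(base) == dictionary_date(head):
        problems.append("_dictionary.date not updated")
    base_frames = save_frames(base)
    for name, body in save_frames(head).items():
        old = base_frames.get(name)
        if old is not None and old != body and update_date(old) == update_date(body):
            problems.append(f"{name}: _definition.update not updated")
    if audit_block(base) == audit_block(head):
        problems.append("DICTIONARY_AUDIT loop not updated")
    return problems


if __name__ == "__main__":
    head = BASE.replace("An example item.", "An example data item.")
    for problem in check_pr(BASE, head):
        print("warning:", problem)
```

Reporting warnings rather than hard failures fits the point that such checks may be ignored for trivial updates.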