The Conventions document could be made clearer by removing ambiguities around certain words. BCP 14 handles this in a way that is simple and clear. It would be straightforward to adopt BCP 14, or to take inspiration from it, in a way that benefits users, much as we have been inspired by Semantic Versioning without adopting it wholesale.
We believe that this can be implemented by mid-2025. Once it is implemented, all future pull requests will benefit. We expect it will first be merged in CF-1.13.
If you want to work on this, please self-assign or ping me - that'll help us keep track. These people have participated in the discussions up till now (I may be forgetting someone, sorry!):
@mraspaud @davidhassell @JonathanGregory @larsbarring @cofinoa @feggleton @DocOtak
I will keep this issue up to date as multiple PRs will likely be required in order to implement this.
Steps to complete
[ ] @larsbarring will post a version of the Conventions with annotations on the BCP 14 controlled vocab as well as "extended vocab" that we should consider rewording to match BCP 14. In the hackathon we also discussed augmenting the extended vocab with "Suggest, allow, permit, forbid, prohibit".
[ ] We decide whether we want to adopt BCP 14 or simply get inspired by it. The main question at the moment is whether we want to use all caps on controlled vocab, as is REQUIRED by BCP 14. Some people like that and others aren't so sure, so we should try it and see how we like it.
[ ] We pen a text stating how we are using BCP 14. Are we using it wholesale? Do we extend it to additional words? Do we use it without uppercasing? @feggleton has expressed interest in contributing to this.
Then potentially in parallel:
[ ] We divide up the Conventions document and check the occurrences of the controlled vocab, rewording if necessary. Probably it makes sense to gather a coalition of the willing and work in parallel, merging into a single branch. Currently there are ~1k occurrences so this is a tractable problem as long as we don't allow CF to be rewritten several times by AIs.
[ ] We develop a pre-merge action to check for use of controlled vocab and highlight that, asking the user to confirm that we're using any introduced controlled terms consistently.
The pre-merge action could be something like (very draft):
#!/bin/bash
# Are we on a pull request?
if [ -z "$GITHUB_HEAD_REF" ]; then
    echo "This script is meant to run on a pull request."
    exit 1
fi
TARGET_BRANCH=${GITHUB_BASE_REF:-main}
# BCP 14 key words, matched case-insensitively as whole words.
# Placeholder list: extend it once we agree on the "extended vocab".
vocab='must|shall|required|should|recommended|may|optional'
vocab_found=
# Names of the files changed relative to the target branch
changed_files=$(git diff --name-only origin/"$TARGET_BRANCH"...)
while IFS= read -r file; do
    [ -n "$file" ] || continue
    # Get added lines in each file (skip the "+++ b/<file>" header)
    added_lines=$(git diff --unified=0 origin/"$TARGET_BRANCH"... -- "$file" | grep -E '^\+' | grep -vE '^\+\+\+')
    # Search for controlled vocab
    if echo "$added_lines" | grep -qiwE "$vocab"; then
        vocab_found=1
        echo "Controlled vocabulary found in $file:"
        echo "$added_lines" | grep -iwE "$vocab"
        echo
    fi
done <<< "$changed_files"
if [ -n "$vocab_found" ]; then
    echo "Controlled vocabulary was found in your changes."
    echo "Please verify that these words are used in line with the guidelines set forth in:"
    # Would need to make this link point to the right section which doesn't exist yet!
    echo "https://cfconventions.org/Data/cf-conventions/cf-conventions-1.11/cf-conventions.html#_overview"
    exit 1
else
    echo "No controlled vocabulary found in added lines."
fi
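For the step where we divide up the Conventions document, a per-file count of the controlled vocab could help split the work evenly. The following is only a rough sketch: the function name `count_vocab` is made up for illustration, the word list covers just the BCP 14 key words (not the extended vocab), and it assumes the sources are the AsciiDoc files in the repository.

```shell
#!/bin/bash
# Sketch only: count BCP 14 key word occurrences per file.
# VOCAB is an assumed starting list; the "extended vocab" is not included.
VOCAB='must|shall|required|should|recommended|may|optional'

# count_vocab FILE... prints "<count> <file>" per file, largest count first.
count_vocab() {
    local file count
    for file in "$@"; do
        # -o prints one match per line, so wc -l counts occurrences, not lines;
        # "|| true" keeps a file with no matches from looking like a failure.
        count=$( { grep -oiwE "$VOCAB" -- "$file" || true; } | wc -l)
        printf '%d %s\n' "$count" "$file"
    done | sort -rn
}

# Example invocation (assumes the .adoc sources are in the current directory):
# count_vocab ./*.adoc
```

Because the match is case-insensitive and whole-word, it will also count ordinary uses such as "may" in running prose, so the totals are an upper bound on what needs review.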