mavaylon1 closed this 1 year ago
@rly @oruebel I took a different route from what was on nwb-schema, similar to how we test pynwb against hdmf. The way nwb-schema validates is by comparing the YAML to a JSON file, which we do not have. I thought it would be good enough to install hdmf and just test the "common" folder.
I do have a question:
```yaml
name: Cancel non-latest runs
uses: styfle/cancel-workflow-action@0.11.0
with:
  all_but_latest: true
  access_token: ${{ github.token }}
```
I am a little confused about `github.token`. After further digging, I found that GITHUB_TOKEN is generated during the workflow run to provide read and write access (https://devopsjournal.io/blog/2022/01/03/GitHub-Tokens).
I don't fully understand the point of this.
Note I did not go through the check list for the PR and this is still a draft. @oruebel @rly
The token is used to authenticate the workflow job. Does this mean we need to give the workflow permission to read the repository, or in this case, to cancel the workflow job if it is already running?
If I, as a third-party person (not a dev), want to cancel a workflow, I cannot because I don't have permission. Is this just giving the workflow permission to cancel the workflow if it exists, as if the workflow itself were a third-party person?
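For context, here is a minimal sketch of how the token's scope could be made explicit with a `permissions` block. The `permissions` section and job layout here are illustrative assumptions, not taken from this PR; the action and its inputs are the ones quoted above:

```yaml
# Hypothetical workflow fragment. GITHUB_TOKEN is auto-generated per run;
# the permissions block scopes what that token is allowed to do, so the
# workflow (not any human user) is what gets cancel rights.
name: Cancel non-latest runs
on: [push, pull_request]

permissions:
  actions: write  # assumed scope needed so the token can cancel workflow runs

jobs:
  cancel:
    runs-on: ubuntu-latest
    steps:
      - uses: styfle/cancel-workflow-action@0.11.0
        with:
          all_but_latest: true
          access_token: ${{ github.token }}
```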
The release notes are not updated since this is a workflow update and no release will be made.
This workflow will check out the dev branch of hdmf and the specific commit of hdmf-common-schema that is stored in hdmf, and then run the tests in tests/unit/common/ of hdmf. That does not test the current hdmf-common-schema version.
What I was looking for is a little more complicated than I initially imagined. Looking at how we do it in nwb-schema (https://github.com/NeurodataWithoutBorders/nwb-schema/blob/dev/.github/workflows/validate_schema.yml), we want to run the script `validate_hdmf_spec core -m nwb.schema.json`, but replacing `nwb.schema.json` with a JSON schema for HDMF common. It looks like we never made that schema, but that file should be pretty much the same as https://github.com/NeurodataWithoutBorders/nwb-schema/blob/dev/nwb.schema.json except that 1) the top-level metadata needs to be adjusted for HDMF common, 2) all instances of "neurodata_type" -> "data_type", and 3) rename it to hdmf-common.schema.json. I think that's it. Then, because the schema lives in a directory "common" instead of "core", I think the command should be `validate_hdmf_spec common -m hdmf-common.schema.json`.
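A sketch of what such a validation workflow could look like, modeled loosely on nwb-schema's validate_schema.yml. The job name, Python version, and step layout are assumptions for illustration; the validation command is the one discussed above:

```yaml
# Hypothetical validation workflow sketch for hdmf-common-schema
name: Validate schema
on:
  push:
  pull_request:

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: "3.11"
      - name: Install HDMF (provides the validate_hdmf_spec script)
        run: pip install hdmf
      - name: Validate the YAML spec against the spec-language JSON schema
        run: validate_hdmf_spec common -m hdmf-common.schema.json
```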
@rly what if I cd into hdmf_common_schema, git checkout the PR branch for the schema, and then run the tests?
Then when the branch is merged, you have to update the workflow again so that you check out the current main branch of the schema instead of a PR branch.
That is fine to do, but I think independently of whether the dev branch of HDMF is compatible with the current PR or main branch of the schema, it would be good to run `validate_hdmf_spec common -m hdmf-common.schema.json`, just like what we have in NWB schema, to catch validation issues with the schema that may not be caught by the tests in HDMF.
The reason why I went with this approach was that it would return an error if the schema wasn't compatible. Going the JSON route, does that record every instance where there's an issue? Because on second thought, my way would only catch issues one at a time. @rly
The compatibility test is good.
With JSON validation, I think it will error out at the first error. But critically, just because the schema is compatible with the current API does not mean it is valid according to the schema language. For example, if we add a new data type to the schema, there will not be corresponding code that uses that data type in the dev branch of the API (so that test will pass), but the new data type definition might be invalid according to the rules of the language. We might then write API code for this new data type only to determine later that we wrote the schema in a way that is technically not valid.
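To illustrate the kind of problem language-level validation catches, here is a made-up spec fragment (not from the actual schema): a new type definition that no API test exercises, containing a key the spec language does not allow. API compatibility tests would pass, but JSON-schema validation of the spec would reject it:

```yaml
# Hypothetical invalid spec fragment, for illustration only.
# "NewContainer" is not used anywhere in the HDMF dev branch, so the
# compatibility tests would never touch it and would still pass.
groups:
- data_type_def: NewContainer
  doc: A new container type with an invalid key in its dataset spec
  datasets:
  - name: data
    dtyp: int  # typo for "dtype"; the spec-language JSON schema flags this
```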
Can you please also implement a workflow that does `validate_hdmf_spec common -m hdmf-common.schema.json`?
I'll do both in this PR.
@rly could you do a review?
Schema validation workflow looks good. Thanks.
The .github/workflows/hdmf_compatibility_schema.yml workflow confuses me. Currently it just checks out HDMF and runs a subset of tests from HDMF. It doesn't use the updated schema in the master branch or a pull request branch. I think it should.
The validate schema workflow was skipped. That's odd. Do you know why that happened?
I removed "push" as a trigger because "pull_request" should trigger on each update to the PR, i.e., each commit. I understood the triggers push, pull_request, and pull_request on forked repos. However, I see now that it actually ignores the pull_request trigger on the current repo, hence the need for push.
As for the compatibility workflow, I forgot to add the part that checks out the current branch. I let that slip; that's my bad. The idea is to make sure that schema changes on any PR don't break HDMF, and if they do, it is because an update needs to be made on HDMF.
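The intended compatibility check could be sketched roughly as below. The submodule path inside the hdmf checkout and the use of `github.head_ref` for the PR branch are assumptions for illustration, not the final workflow:

```yaml
# Hypothetical compatibility workflow sketch: clone hdmf dev, swap in the
# schema branch under review, then run hdmf's "common" test suite against it.
- name: Clone the dev branch of HDMF with its submodules
  run: git clone --recurse-submodules https://github.com/hdmf-dev/hdmf.git

- name: Check out the schema branch under review inside the submodule
  run: |
    cd hdmf/src/hdmf/common/hdmf-common-schema  # assumed submodule path
    git fetch origin
    git checkout ${{ github.head_ref }}  # assumed PR-branch reference

- name: Install HDMF and run the common tests
  run: |
    cd hdmf
    pip install -e .
    pytest tests/unit/common/
```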
@rly I believe I made the needed changes! I need to slow down so that review can be a quick process.
Summary of changes
- Created a new workflow to check for HDMF dev compatibility.
- The workflow will clone the dev branch of HDMF and run the tests for "common".
PR checklist for schema changes
- Update the version in docs/source/conf.py and common/namespace.yaml to the next version with the suffix "-alpha"
- Update docs/source/hdmf_common_release_notes.rst and/or docs/source/hdmf_experimental_release_notes.rst