Closed psprings closed 3 years ago
@dubsnipe I believe you generated that YAML. Can you fix it? Agree that we should be more resilient to these.
I believe we are now more resilient to invalid manifest files, as I've added a number of safeguards and fallbacks. Feel free to open another issue if you still see blocking failures or any other problems.
Hi, sorry for not looking at this earlier. I understand it is resolved, but do you know what was wrong with this manifest? In some cases, I believe that it has to do more with the filename (this one contained a parenthesis) than the actual content. Right now we're re-doing all of our manifests and automating the process so we'll be dealing with different issues probably.
My honest recommendation: we move the standard to JSON as quickly as possible. YAML is too error-prone for this use case. We can try out JSON in the meantime. I already made JSON manifests for Field Ready as a quick test: https://field-ready-projects.openknowhow.org/manifests/list.json. If you want to switch Appropedia to JSON as well, I'm willing to read them as JSON for this search.
Ultimately, I think this is less of an issue of JSON vs YAML and more of an issue of validation. If you have a list of externally stored manifests expressed as URLs then there is a risk of mutability. At any given time there could be a change in any one of those files which breaks the build for everyone. I think you could iterate over that list and choose to exclude any "invalid" manifests from the build process.
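A minimal sketch of that "iterate and exclude" idea, assuming the manifests have already been fetched into a mapping of URL to text. The function name `partition_manifests`, the example URLs, and the choice of `json.loads` as the parser are illustrative assumptions, not part of okh-search; a YAML parser could be passed in the same way.

```python
import json
from typing import Callable, Dict, List, Tuple


def partition_manifests(
    manifests: Dict[str, str],
    parse: Callable[[str], object],
) -> Tuple[Dict[str, object], List[Tuple[str, str]]]:
    """Split fetched manifests into parsed documents and skipped (url, reason) pairs."""
    valid: Dict[str, object] = {}
    skipped: List[Tuple[str, str]] = []
    for url, text in manifests.items():
        try:
            document = parse(text)
        except Exception as error:  # parser-specific error types vary (JSON, YAML, ...)
            skipped.append((url, f"{type(error).__name__}: {error}"))
            continue
        if not isinstance(document, dict):
            # A manifest is expected to be a mapping of fields, not a list or scalar.
            skipped.append((url, "top level is not a mapping"))
            continue
        valid[url] = document
    return valid, skipped


# Hypothetical input: URL -> manifest text, as if already downloaded.
fetched = {
    "https://example.org/good.json": '{"title": "Water filter"}',
    "https://example.org/bad.json": '{"title": "Broken",',
}

valid, skipped = partition_manifests(fetched, json.loads)
for url, reason in skipped:
    # Warn and carry on instead of failing the whole build.
    print(f"warning: skipping {url}: {reason}")
```

Only the entries in `valid` would go on to the build, so a single broken file turns into a warning for its submitter and no longer fails the build for everyone.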
We are keeping a few OKH files in a repository and created a validator to make sure that invalid content does not get introduced there. There is a corresponding GitHub Action that we use to make sure the validator gets run, and it can be enforced differently on a pull request vs. a merge.
# more GitHub Action metadata here
steps:
  # insert additional steps here
  - name: open knowledge framework validation (pull request)
    uses: helpfulengineering/ok-validate@v0.1.0
    if: ${{ github.event_name == 'pull_request' && steps.has-yaml-files.outputs.num-yaml-files != '0' }}
    with:
      file-restrictions: ${{ steps.changed-files.outputs.all }}
      path: ${{ matrix.release }}/
  - name: open knowledge framework validation
    uses: helpfulengineering/ok-validate@v0.1.0
    if: ${{ github.event_name != 'pull_request' && steps.has-yaml-files.outputs.num-yaml-files != '0' }}
    with:
      path: ${{ matrix.release }}/
@psprings That's a great solution for people on GitHub. To be clear: I did make the build process for this search website more resilient to invalid manifests, so I think this issue should remain closed (hence I'm closing this again, @dubsnipe).
Nevertheless, I think we will save ourselves a lot of trouble if we move the standard over to a less error-prone format. Even then, errors will still occur, of course, and how we report manifest errors back as we flesh out proper indexing and search is an important question that's worth exploring.
Problem
https://github.com/OpenKnowHow/okh-search/pull/42 is failing because there is an invalid YAML file present in the projects_okhs.csv file which cannot be parsed. This would presumably impact any additional PRs, as they would also fail the status check.
Details
Failing build
Invalid file
https://github.com/OpenKnowHow/okh-search/blob/1e6d73f710f483b28bb7d3bdfd5f093df6ebf5fc/projects_okhs.csv#L198
Should be indented on Line 16
Considerations
Without content-based addressing, a YAML file at a given URL could be mutated without being tested. Those mutations could introduce an invalid YAML file or a file that does not otherwise conform to the OKH standard.
Perhaps in the interim it could make sense to skip invalid YAML files and send a warning to the submitter of that file in the event that the file has changed since it was introduced to the search CSV.
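One way to tell whether a file has changed since it was introduced is to record a digest of its content at submission time and compare it later. This is only a sketch under assumptions: projects_okhs.csv does not currently store a digest, and the helper names and sample manifest bytes below are hypothetical.

```python
import hashlib


def content_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a manifest's raw bytes."""
    return hashlib.sha256(data).hexdigest()


def has_changed(recorded_digest: str, current_data: bytes) -> bool:
    """True if the bytes now served at the URL differ from what was recorded."""
    return content_digest(current_data) != recorded_digest


# Hypothetical: the digest is stored next to the URL when the manifest is submitted.
original = b"title: Example project\n"
recorded = content_digest(original)

# Later, the file at the same URL is edited (here, with a broken indent).
mutated = b"title: Example project\nlicense:\nhardware: CERN-OHL\n"

print(has_changed(recorded, original))  # unchanged content
print(has_changed(recorded, mutated))   # content was edited after submission
```

If a manifest both fails to parse and no longer matches its recorded digest, that would suggest it broke after submission, which is the case where a warning to the submitter seems most useful.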