instructlab / schema

JSON schema for Taxonomy YAML
Apache License 2.0

Use schema to validate that document.commit in knowledge is a 40 char hexadecimal SHA value #30

Open bjhargrave opened 2 weeks ago

bjhargrave commented 2 weeks ago

In the taxonomy repo, there is discussion about further validation of the document information in a knowledge contribution. One thing the schema can do is validate that the document.commit value is a 40 char hexadecimal SHA value.

See https://json-schema.org/understanding-json-schema/reference/string#regexp

bjhargrave commented 2 weeks ago

We could require the commit value to always be 40 chars. This would be an issue for some existing knowledge documents.

{  
  "type": "string",
  "pattern": "^[a-fA-F0-9]{40}$"
}

Or we could allow it to be between 7 and 40 chars. This should validate any existing knowledge documents.

{  
  "type": "string",
  "pattern": "^[a-fA-F0-9]{7,40}$"
}
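
To see how the two options differ in practice, here is a rough Python sketch that applies the same character class and length limits to a few sample values. It uses the standard `re` module rather than a JSON Schema validator, so it only approximates what the schema would do. The function name `is_valid_commit`, the `allow_abbreviated` flag, and the sample SHA are made up for illustration.

```python
import re

# Same character class and lengths as the two schema fragments above.
# fullmatch() anchors both ends, so ^ and $ are not repeated here.
STRICT_SHA = re.compile(r"[a-fA-F0-9]{40}")
LENIENT_SHA = re.compile(r"[a-fA-F0-9]{7,40}")


def is_valid_commit(value, allow_abbreviated=False):
    """Return True if value looks like a Git commit SHA (hypothetical helper)."""
    if not isinstance(value, str):
        return False
    pattern = LENIENT_SHA if allow_abbreviated else STRICT_SHA
    return pattern.fullmatch(value) is not None


# Made-up 40 char value, not a real commit.
full_sha = "0123456789abcdef0123456789abcdef01234567"
short_sha = full_sha[:7]

print(is_valid_commit(full_sha))                           # strict, full SHA
print(is_valid_commit(short_sha))                          # strict, abbreviated SHA
print(is_valid_commit(short_sha, allow_abbreviated=True))  # lenient, abbreviated SHA
print(is_valid_commit("main", allow_abbreviated=True))     # branch name, not hex
```

The strict pattern rejects abbreviated SHAs, which is the compatibility concern with existing knowledge documents. The 7 to 40 pattern accepts them, and still rejects branch names and other non-hexadecimal values.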