Open mdemierre opened 3 years ago
Adding the ability to patch bundle owned documents is an interesting idea. It's attractive because (minimally) we could just relax the path check on the data API and suddenly it would work! On the other hand, I'm a bit concerned about the side effects (no pun intended). E.g., the revision ID on decision logs would be meaningless (or worse, harmful).
I have to wonder, if you're prepared to write code that sends PUT/PATCH/DELETE calls to OPA, have you considered extending OPA with a custom plugin? With a custom plugin you could extend OPA to read updates from a data source like Kafka and then apply them to the in-memory store. You could still use bundles for distribution of policy and static data but dynamic data could be sourced from elsewhere. I've heard of a few folks that have tried this out with success.
Alternatively, #1055 would provide an OPA-native solution to this. Let's say we implemented the changes in OPA to support deltas and push updates; do you still feel like implementation of the server-side would be more work than what you've proposed?
Hi @tsandall thanks for the quick reply.
I understand the Revision ID issue: there would be no way to know what state OPA really was in since PUTs might have been done in the meantime.
In our case the revision is actually not included in the manifest. Maybe this could be a requirement (that the bundle has no revision)? But it's not elegant, I agree. Also, if we use PUT we already lose the revision feature. Another option would be to add an optional revision number to the PUT.
Custom plugin
We didn't consider going the custom plugin route yet. The implementation you suggested is very elegant. In fact we used this exact pattern of reading changes for another project (not OPA related, but also for policy enforcement based on dynamic data). The Kafka topic was compacted and contained the latest version of each key. It worked quite well.
The client (OPA plugin in this case) would:
In terms of implementation and maintenance effort we are quite constrained, and I would assume switching to PUSH only would take less time (with the drawbacks I mentioned). The plugin implementation has the following challenges:
On the other hand, the custom plugin would be something I'd personally love to implement, and we have a Kafka cluster at our disposal. We'll think about it.
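For readers unfamiliar with the compacted-topic pattern mentioned above, here is a minimal sketch of the client-side logic, not an actual OPA plugin. It assumes each record is a hypothetical `(key, value)` pair and follows the Kafka convention that a null value is a tombstone that deletes the key. A plain Python list stands in for the topic and a dict stands in for OPA's in-memory store; no Kafka or OPA APIs are used.

```python
from typing import Any, Dict, Iterable, Optional, Tuple

# Hypothetical record shape: (key, value). A value of None is a tombstone.
Record = Tuple[str, Optional[Any]]


def apply_records(store: Dict[str, Any], records: Iterable[Record]) -> Dict[str, Any]:
    """Apply keyed updates in order; the last record for each key wins."""
    for key, value in records:
        if value is None:
            store.pop(key, None)  # tombstone: drop the key if it is present
        else:
            store[key] = value  # upsert: replace whatever was stored before
    return store


# Replaying the log from the beginning rebuilds the full state, which is
# what a freshly started client would do before following new records.
log = [
    ("users/alice", {"roles": ["admin"]}),
    ("users/bob", {"roles": ["viewer"]}),
    ("users/alice", {"roles": ["viewer"]}),  # supersedes the first record
    ("users/bob", None),                     # deletes bob
]
state = apply_records({}, log)
print(state)  # {'users/alice': {'roles': ['viewer']}}
```

Because the topic keeps the latest version of each key, a restart needs no separate full-sync endpoint: initialization and incremental updates go through the same code path.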
Feature #1055
It would solve the same problem. I think the implementation on the bundle server side is a bit more complex, with the constant connection and the creation of bundle deltas, but the need would be met. The difficulty is probably more in making this constant connection work with web frameworks than anything else. Some don't support this kind of pattern very well.
It's actually very similar to the Kafka-based solution, except with a different transport mechanism (more coupled). It's even more similar to how many Kubernetes components work (with Watch API).
@tsandall Is the final decision on this to use a custom plugin or to wait for #1055?
@mdemierre sorry for the delayed reply...
Yes, the recommendation for the time being would be to implement a custom plugin. We can keep this open for now in case other folks have a need to patch/modify bundle owned data.
I stumbled across this issue the other day - we were using the Data API to change the contents of the data.json file in our root directory. In v0.29.4 everything worked but the bug (feature) was closed in v0.30.0.
The solution was to use the --watch flag and list the data.json file explicitly; then we were able to edit the data.json file in the root directory and OPA automatically reloaded the changes.
Documenting here in case others have a similar workflow.
Our use case
In our OPA use case, we have the following situation:
We initially implemented the Bundle API with the bundle being generated on-demand. It works well because when OPA starts it contacts the external service and downloads the full data to initialize.
Then, when there is a dynamic change of the data, we wanted to use the Data API to replace/patch/delete the relevant subdocument. However, this is not allowed by the Data API since the data is owned by the downloaded bundle.
The documentation is not very clear about this, as it's not mentioned in the Data API docs, but only hinted at in https://www.openpolicyagent.org/docs/latest/management/#bundles:
The "by default" wording seems to imply there is a way to change this. Also the section it refers to doesn't mention the Data API.
With the current setup, the only way we found to do it is to stop using the Bundle API and use the Data API instead: push the whole data periodically (or when OPA is restarted) and push small diffs when needed.
However, this inverts the relationship between the external service and OPA. Now:
This is not ideal either, as we run in a dynamic environment (CloudFoundry) and services change.
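To make the push-based workaround above concrete, here is a rough sketch of the two kinds of Data API requests involved: a PUT that replaces a whole document and a PATCH that carries JSON Patch operations for a small diff. The base URL, the `myservice` path, and the user data are illustrative assumptions, not taken from our setup. The sketch only builds the requests and does not send them.

```python
import json
import urllib.request

OPA_URL = "http://localhost:8181"  # assumption: OPA listening on its usual port


def put_document(path: str, document) -> urllib.request.Request:
    """Build a PUT that creates or overwrites the document at data.<path>."""
    return urllib.request.Request(
        f"{OPA_URL}/v1/data/{path}",
        data=json.dumps(document).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def patch_document(path: str, operations: list) -> urllib.request.Request:
    """Build a PATCH carrying JSON Patch (RFC 6902) operations."""
    return urllib.request.Request(
        f"{OPA_URL}/v1/data/{path}",
        data=json.dumps(operations).encode("utf-8"),
        headers={"Content-Type": "application/json-patch+json"},
        method="PATCH",
    )


# Full push, e.g. periodically or after OPA restarts (hypothetical data).
full = put_document("myservice", {"users": {"alice": {"roles": ["admin"]}}})

# Small diff when a single entry changes; the patch path is relative to
# the document addressed in the URL.
diff = patch_document(
    "myservice",
    [{"op": "add", "path": "/users/bob", "value": {"roles": ["viewer"]}}],
)

# To actually send them: urllib.request.urlopen(full), then urlopen(diff).
```

As described above, these calls are rejected when the target path is owned by a downloaded bundle, which is why this approach only works once the Bundle API is out of the picture.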
Feature request
We would see the ability to use the Data API to push changes to bundle-owned documents as an elegant solution to this problem:
In fact if I'm not wrong it works when not doing multi-bundle: if I point OPA to a directory with policies and data, I can push changes to this data. It seems the behavior I described is specific to bundles downloaded from bundle servers.
This could be an option set in the "bundles" part of the config if it's desired to prevent such updates by default.
Questions
Related issues
#1055: would allow the same kind of reactive updates but has not seen any development yet and is quite a major change.