Closed samson4649 closed 2 months ago
Hello ! I love this idea and also wanted to work on this. I tried to make a proposal doc to describe this feature but didn't release it. As discussed in #492 it's not easy. We want to keep config as code principle as much as possible, but if the ACLs are in DB it's not the case. Also, the API modifying files on system is not ideal and would not work on k8sā¦
I've implemented a working update mechanism however im waiting merge for pr #581 (issue #492)
~ # ./headscale acl list
An updated version of Headscale has been found (0.16.0-beta4 vs. your current v0.15.0-389-g5d4dcb6-dirty). Check it out https://github.com/juanfont/headscale/releases
2022-06-20T07:11:37Z DBG Setting timeout timeout=5000
2022-06-20T07:11:37Z DBG HEADSCALE_CLI_ADDRESS environment is not set, connecting to unix socket. socket=/var/run/headscale/headscale.sock
{
"groups": {
"group:admin": [
"host-1"
],
"group:boss": [
"subnet-1"
]
},
"hosts": {
"host-1": "10.64.0.2/32",
"subnet-1": "10.64.0.0/10"
},
"acls": [
{
"action": "accept",
"src": [
"boss"
],
"dst": [
"boss:*"
]
},
{
"action": "accept",
"src": [
"admin"
],
"dst": [
"dev1:*"
]
}
]
}
~ # ./headscale acl update --file /etc/headscale/acl_2.hujson
An updated version of Headscale has been found (0.16.0-beta4 vs. your current v0.15.0-389-g5d4dcb6-dirty). Check it out https://github.com/juanfont/headscale/releases
2022-06-20T07:11:43Z DBG Setting timeout timeout=5000
2022-06-20T07:11:43Z DBG HEADSCALE_CLI_ADDRESS environment is not set, connecting to unix socket. socket=/var/run/headscale/headscale.sock
~ # ./headscale acl list
An updated version of Headscale has been found (0.16.0-beta4 vs. your current v0.15.0-389-g5d4dcb6-dirty). Check it out https://github.com/juanfont/headscale/releases
2022-06-20T07:11:47Z DBG Setting timeout timeout=5000
2022-06-20T07:11:47Z DBG HEADSCALE_CLI_ADDRESS environment is not set, connecting to unix socket. socket=/var/run/headscale/headscale.sock
{
"hosts": {
"host-1": "10.64.0.2/32",
"lol-net": "69.69.69.0/24",
"subnet-1": "10.64.0.0/10"
},
"acls": [
{
"action": "accept"
}
]
}
Any word on what needs to happen to progress now that 16.0 is out? I'm waiting on API access to progress work on an ACL management interface.
I think the PR #581 is ready to be merged. Someon needs to take the update ACL subject. If I find some time I'll try to tackle it. I still have some issues on the ACL's that I need to handle. Time is hard to find. If someone else want to try on this please do !
While this is making good progress with a proposal, I will push it to 0.19.0 as a target, we have a lot of stuff for 0.17.0 and want to do a code reorg for 0.18.0.
Is there any progress on this?
There is currently no planned work to implement this.
Might happen in the future, but we have a lot planned for the time being.
This issue is stale because it has been open for 180 days with no activity.
This issue was closed because it has been inactive for 14 days since being marked as stale.
At the end did someone merged this? Or still waiting after 1.5 years?
At the end did someone merged this? Or still waiting after 1.5 years?
Nope.
I believe we can write a much simpler implementation by following what Tailscale is doing. https://github.com/tailscale/tailscale/blob/main/api.md#policy-file
The APIs can directly deal with the policy JSON structure and expect the user to fetch and modify it. All we will need to do is store is as jsonb
in the DB somewhere and that's it.
If no one is working on this, I'd like to give this a try since it'd be a pretty useful feature for many.
@juanfont I think this issue was automatically closed by the bot by mistake. It seems to me like it should be re-opened since there is still interest in this feature and it looks like @pallabpain wants to take a swing at it. Thanks!
I have a draft PR out at the moment which only introduces the bare minimum APIs. I'm still working on it at the moment.
https://github.com/juanfont/headscale/pull/1792
While it is rather simple to set and get the policy via the APIs, we currently can load the ACLs using a file. I don't know how to handle both cases where it may be possible to load the ACL from the file, store it in the DB, and serve it to the API consumers. However, changes made to the policy via the APIs may not be written back to the file (if the policy was initially loaded from a file). This will make things inconsistent.
Considering restarts, it may be possible to refer to the policy stored in the db as the source of truth. But let's say the users update the policy file and expect headscale to use that. Then it will again make things inconsistent.
@kradalby What do you think? Does it make sense to have only one way of updating the policy, i.e. via the API?
... some ideas:
acl2024-02-26_112233.old
I know, docker is not officially supported, still many sysadmins are using it. Editing files inside a docker container is a pain anyway, so it's safer to store in the DB, which would be preserved even if the image gets replaced / updated.
_But in case really considering allowing modifying it via files, HeadScale could: Store the ACL in a simple JSON file, which would be locked during run
There would be an ACL directory, containing 3 files:
Readme_ACL.md
acl.conf.new.sample
acl.conf
(Locked)The Readme would explain everything, including how to rename the .samle
file into:
acl.conf.new
HS would check for this file once every 10 seconds, or watch for "file-change event" on Linux.
acl.conf.new
file present,acl.conf
to acl2024-02-26_112233.old
acl.conf.new
to acl.conf
+ Lock it (so it becomes read-only)error2024-02-26_112233.log
@PizzaProgram Here's how I have thought about the ACL management, so far:
GET /api/v1/acl
PUT /api/v1/acl
The input payload shall be parsed using the existing helper function such that it undergoes the required set of parsing for it to be valid. Only a valid payload will be admitted and it shall overwrite the existing payload. This will ensure that the existing working copy is not corrupted by an invalid payload.
In addition to that, since we are storing it in the DB, we will have access to the updated_at
timestamp.
I think we can also incorporate the suggestion of storing the update counter if it'd be helpful.
I think one way to allow file-based configs may be to have commands to get and set the policy which effectively stores it in the DB, thus making the DB the source of truth at all times.
# headscale acl get
# headscale acl set PATH_TO_FILE
These APIs are very much in-line with the original issue description. https://github.com/juanfont/headscale/issues/582#issue-1233829545
As far as I can tell, this sounds perfect.
Although I still recommend strongly to:
It would allow to:
There are millions of cases, where something can go wrong. So keeping at least "one backup" would be crucial.
# headscale acl get-backup
# headscale acl restore-backup
I'm sure these 2 little words could save someone from a disaster, undoing an error with one easy command. (Imagine a 1000+ client config and 1 tiny typing error, or a wrong API-call miss-click. )
it shall overwrite the existing payload.
So instead of overwrite, I recommend to simply:
It is nearly the same amount of work, but opens up many possibilities to work with ACLs in a much safer way.
For example:
# headscale acl show-id // shows the active config ID
# headscale acl get-by-id 5998 // get the whole config
# headscale acl restore-by-id 5996
I think this discussion is going in a generally positive direction, I have not yet had any time to look at the PR, so apologise if this is addressed already, but I would like to add that the current read-from-disk-on-startup behaviour must be preserved. It is a non-negotiable, that does not mean that this feature cannot be added, it just need to find a way to work together with the current behaviour.
the current read-from-disk-on-startup behaviour must be preserved.
I do not think that a Database open / read would affect this in any way at startup.
We are talking about a "secondary" / optional possibility to save/push an ACL config:
@pallabpain Am I summarizing it correctly ?
I think this discussion is going in a generally positive direction, I have not yet had any time to look at the PR, so apologise if this is addressed already, but I would like to add that the current read-from-disk-on-startup behaviour must be preserved. It is a non-negotiable, that does not mean that this feature cannot be added, it just need to find a way to work together with the current behaviour.
@kradalby The PR is still a work in progress. However, I intend to reach a point where both read-from-disk and APIs exist at the same time and that reading from disk too updates the ACLs in the database. That should give us a working prototype to discuss further.
The only thing that bothers me is the co-existence of two update mechanisms that might lead to inconsistencies due to either human error or incorrect usage.
Here's what I aim to achieve in the PR:
@PizzaProgram Yes, storing every modification as an entry is something that we can easily achieve. My only concern with such an implementation is what makes a particular config the last working config. Since the last working config is subjective to headscale users' intended usage we are already making sure that the input payloads are at least validated for correctness.
I believe that instead of storing history in the DB, users can employ any means for maintaining histories and versions, Gitops, etc.
What do you think?
Maybe add a new configuration like
# acl_policy_mode: db|[file]
acl_policy_mode: file
acl_policy_path: /path/to/acl.json
---
acl_policy_mode: db
# acl_policy_path will be ignore
If acl_policy_mode
is file
, all behaviors will keep same as the old version, and the acl api will be unavailable.
If acl_policy_mode
is db
, more new feature can be introduced, including acl api and other cli options?
so we can keep the the current read-from-disk-on-startup behaviour and also add what we want now
@Yoshino-s This can be one way to solve this. We can still let the GET API open which will return the policy. But in file
mode, the PUT API will not set anything.
Or we can simply add a disclaimer saying that when there's an ACL file specified in the config, it will override the ACL in the DB on every startup. What do you think?
@pallabpain Maybe simply ignoring it is a more logical option for users, thinking about db
means data is from db, so the file is nothing about.
I made a draft implement at https://github.com/yoshino-s/headscale . Feel free to use it to merge into your pr, or just do your own :P
I think that the proposed acl_policy_mode
should be respected with regards to the principle of least surprise. Having that be file
by default means that one can maintain backwards compatibility.
Exposing the current ACL (backed by file or database) in the API and blocking modification when backed by file sounds like a good idea.
Or we can simply add a disclaimer saying that when there's an ACL file specified in the config, it will override the ACL in the DB on every startup. What do you think?
I agree with Even on this one, having it go back to file everytime might be quite confusing, a solid toggle doing on or the other makes sense to me.
Making the file available with GET API also sounds reasonable.
Collecting everyone's inputs,
ACL APIs:
Configuration:
CLI:
Collecting everyone's inputs,
ACL APIs:
- Introduce ACL APIs to allow fetch and update operations, and persist in the database.
Configuration:
Users can configure ACLs to reside in a file or the database using a designated attribute.
In the 'file' mode, API access is restricted for ACL updates.
In the 'db' mode, APIs are imperative for dynamic ACL modifications.
CLI:
- Implement CLI commands to get and set the ACLs per the configuration.
@Yoshino-s @kradalby @evenh I have updated my PR considering the above requirements. Please take a look. Thanks. šš¼
Will take a look tomorrow.
After reviewing the ongoing discussion on #1792, I am left with wondering if the implementation could be (optionally) made wire-compatible with the upstream without too much effort.
gitops-pusher.go
has a client implementation that pushes HuJSON files to upstream and is used in CI for implementing the GitOps pattern.
It appears this even accepts a custom apiServer
implementing the expected endpoints.
apiServer
flaghttps://%s/api/v2/oauth/token
applyNewACL
creating a NewRequestWithContext
to https://%s/api/v2/tailnet/%s/acl
for a given tailnet
, invoked by the apply
function.testNewACLs
talking to a validation endpoint at https://%s/api/v2/tailnet/%s/acl/validate
apply
function also invokes getACLETag
, requested from GET
ting HuJSON
from https://%s/api/v2/tailnet/%s/acl
Would the gitops-pusher
maybe provide enough documentation of a suitable API surface for more generic upstream compatibility?
We would probably only have to agree on a default string for the tailnet
variable or make it configurable, e.g. if dns_config.magic_dns: true
is set through dns_config.base_domain
.
This compatibility work could also provide the foundation to implementing a more complete set of endpoints from the Tailscale HTTP v2 API, eventually providing OAuth compatibility for fine-grained access control, as suggested in the previously rejected #1202.
This issue is stale because it has been open for 90 days with no activity.
This issue was closed because it has been inactive for 14 days since being marked as stale.
Where are we now at this progress? Can someone please REOPEN this?
I also have a feeling, the stale bot is doing more harm than good...
I dont think it needs to be reopened, it has been resolved and the code is going into 0.23.0
Feature request Add management of the headscale ACLs via the API. This will allow for ACLs to be updated and current ACLS to be fetched from the headscale instance during runtime.
Aiming to get this started with JSON input and output to start.z
As well as making this a programmable interface for automating ACLs, this will also make integration tests for PR #581 easier to achieve.