juanfont / headscale

An open source, self-hosted implementation of the Tailscale control server
BSD 3-Clause "New" or "Revised" License
23.39k stars 1.28k forks source link

Add ACL management via API #582

Closed samson4649 closed 2 months ago

samson4649 commented 2 years ago

Feature request Add management of the headscale ACLs via the API. This will allow for ACLs to be updated and current ACLS to be fetched from the headscale instance during runtime.

Method Purpose Comments
GET Get currently active ACL Data returned as JSON object
PUT Set new active ACL JSON blob provided to overwrite existing

Aiming to get this started with JSON input and output to start.z

As well as making this a programmable interface for automating ACLs, this will also make integration tests for PR #581 easier to achieve.

restanrm commented 2 years ago

Hello ! I love this idea and also wanted to work on this. I tried to make a proposal doc to describe this feature but didn't release it. As discussed in #492 it's not easy. We want to keep config as code principle as much as possible, but if the ACLs are in DB it's not the case. Also, the API modifying files on system is not ideal and would not work on k8sā€¦

samson4649 commented 2 years ago

I've implemented a working update mechanism however im waiting merge for pr #581 (issue #492)

~ # ./headscale acl list
An updated version of Headscale has been found (0.16.0-beta4 vs. your current v0.15.0-389-g5d4dcb6-dirty). Check it out https://github.com/juanfont/headscale/releases
2022-06-20T07:11:37Z DBG Setting timeout timeout=5000
2022-06-20T07:11:37Z DBG HEADSCALE_CLI_ADDRESS environment is not set, connecting to unix socket. socket=/var/run/headscale/headscale.sock
{
    "groups": {
        "group:admin": [
            "host-1"
        ],
        "group:boss": [
            "subnet-1"
        ]
    },
    "hosts": {
        "host-1": "10.64.0.2/32",
        "subnet-1": "10.64.0.0/10"
    },
    "acls": [
        {
            "action": "accept",
            "src": [
                "boss"
            ],
            "dst": [
                "boss:*"
            ]
        },
        {
            "action": "accept",
            "src": [
                "admin"
            ],
            "dst": [
                "dev1:*"
            ]
        }
    ]
}
~ # ./headscale acl update --file /etc/headscale/acl_2.hujson 
An updated version of Headscale has been found (0.16.0-beta4 vs. your current v0.15.0-389-g5d4dcb6-dirty). Check it out https://github.com/juanfont/headscale/releases
2022-06-20T07:11:43Z DBG Setting timeout timeout=5000
2022-06-20T07:11:43Z DBG HEADSCALE_CLI_ADDRESS environment is not set, connecting to unix socket. socket=/var/run/headscale/headscale.sock

~ # ./headscale acl list
An updated version of Headscale has been found (0.16.0-beta4 vs. your current v0.15.0-389-g5d4dcb6-dirty). Check it out https://github.com/juanfont/headscale/releases
2022-06-20T07:11:47Z DBG Setting timeout timeout=5000
2022-06-20T07:11:47Z DBG HEADSCALE_CLI_ADDRESS environment is not set, connecting to unix socket. socket=/var/run/headscale/headscale.sock
{
    "hosts": {
        "host-1": "10.64.0.2/32",
        "lol-net": "69.69.69.0/24",
        "subnet-1": "10.64.0.0/10"
    },
    "acls": [
        {
            "action": "accept"
        }
    ]
}
routerino commented 2 years ago

Any word on what needs to happen to progress now that 16.0 is out? I'm waiting on API access to progress work on an ACL management interface.

restanrm commented 2 years ago

I think the PR #581 is ready to be merged. Someon needs to take the update ACL subject. If I find some time I'll try to tackle it. I still have some issues on the ACL's that I need to handle. Time is hard to find. If someone else want to try on this please do !

kradalby commented 2 years ago

While this is making good progress with a proposal, I will push it to 0.19.0 as a target, we have a lot of stuff for 0.17.0 and want to do a code reorg for 0.18.0.

VaalaCat commented 1 year ago

Is there any progress on this?

kradalby commented 1 year ago

There is currently no planned work to implement this.

Might happen in the future, but we have a lot planned for the time being.

github-actions[bot] commented 12 months ago

This issue is stale because it has been open for 180 days with no activity.

github-actions[bot] commented 11 months ago

This issue was closed because it has been inactive for 14 days since being marked as stale.

PizzaProgram commented 10 months ago

At the end did someone merged this? Or still waiting after 1.5 years?

samson4649 commented 9 months ago

At the end did someone merged this? Or still waiting after 1.5 years?

Nope.

pallabpain commented 9 months ago

I believe we can write a much simpler implementation by following what Tailscale is doing. https://github.com/tailscale/tailscale/blob/main/api.md#policy-file

The APIs can directly deal with the policy JSON structure and expect the user to fetch and modify it. All we will need to do is store is as jsonb in the DB somewhere and that's it.

If no one is working on this, I'd like to give this a try since it'd be a pretty useful feature for many.

mathmaniac43 commented 8 months ago

@juanfont I think this issue was automatically closed by the bot by mistake. It seems to me like it should be re-opened since there is still interest in this feature and it looks like @pallabpain wants to take a swing at it. Thanks!

pallabpain commented 8 months ago

I have a draft PR out at the moment which only introduces the bare minimum APIs. I'm still working on it at the moment.

https://github.com/juanfont/headscale/pull/1792

While it is rather simple to set and get the policy via the APIs, we currently can load the ACLs using a file. I don't know how to handle both cases where it may be possible to load the ACL from the file, store it in the DB, and serve it to the API consumers. However, changes made to the policy via the APIs may not be written back to the file (if the policy was initially loaded from a file). This will make things inconsistent.

Considering restarts, it may be possible to refer to the policy stored in the db as the source of truth. But let's say the users update the policy file and expect headscale to use that. Then it will again make things inconsistent.

@kradalby What do you think? Does it make sense to have only one way of updating the policy, i.e. via the API?

PizzaProgram commented 8 months ago

... some ideas:

1. IMHO it would not be a big problem, if the ACL changes would be accepted only by the API.

I know, docker is not officially supported, still many sysadmins are using it. Editing files inside a docker container is a pain anyway, so it's safer to store in the DB, which would be preserved even if the image gets replaced / updated.

2. Lock

_But in case really considering allowing modifying it via files, HeadScale could: Store the ACL in a simple JSON file, which would be locked during run

3. In case it would accept a new file edited manually:

There would be an ACL directory, containing 3 files:

The Readme would explain everything, including how to rename the .samle file into:

HS would check for this file once every 10 seconds, or watch for "file-change event" on Linux.

  1. if it detects a acl.conf.new file present,
  2. it tries to load it,
  3. if succeding, it would:
    • rename the old acl.conf to acl2024-02-26_112233.old
    • rename acl.conf.new to acl.conf + Lock it (so it becomes read-only)
  4. on failed loading:
    • create a error2024-02-26_112233.log

4. Security / important:

pallabpain commented 8 months ago

@PizzaProgram Here's how I have thought about the ACL management, so far:

APIS

  1. Returns the current policy JSON
    GET /api/v1/acl
  2. Updates the policy
    PUT /api/v1/acl

The input payload shall be parsed using the existing helper function such that it undergoes the required set of parsing for it to be valid. Only a valid payload will be admitted and it shall overwrite the existing payload. This will ensure that the existing working copy is not corrupted by an invalid payload.

In addition to that, since we are storing it in the DB, we will have access to the updated_at timestamp.

I think we can also incorporate the suggestion of storing the update counter if it'd be helpful.

I think one way to allow file-based configs may be to have commands to get and set the policy which effectively stores it in the DB, thus making the DB the source of truth at all times.

# headscale acl get

# headscale acl set PATH_TO_FILE

These APIs are very much in-line with the original issue description. https://github.com/juanfont/headscale/issues/582#issue-1233829545

PizzaProgram commented 8 months ago

As far as I can tell, this sounds perfect.

Although I still recommend strongly to:

Keep a previous version of the ACL in DB too !

It would allow to:

There are millions of cases, where something can go wrong. So keeping at least "one backup" would be crucial.

# headscale acl get-backup
# headscale acl restore-backup

I'm sure these 2 little words could save someone from a disaster, undoing an error with one easy command. (Imagine a 1000+ client config and 1 tiny typing error, or a wrong API-call miss-click. )

Summarized:

it shall overwrite the existing payload.

So instead of overwrite, I recommend to simply:

It is nearly the same amount of work, but opens up many possibilities to work with ACLs in a much safer way.

For example:

# headscale acl show-id  // shows the active config ID
# headscale acl get-by-id 5998 // get the whole config
# headscale acl restore-by-id 5996
kradalby commented 8 months ago

I think this discussion is going in a generally positive direction, I have not yet had any time to look at the PR, so apologise if this is addressed already, but I would like to add that the current read-from-disk-on-startup behaviour must be preserved. It is a non-negotiable, that does not mean that this feature cannot be added, it just need to find a way to work together with the current behaviour.

PizzaProgram commented 8 months ago

the current read-from-disk-on-startup behaviour must be preserved.

I do not think that a Database open / read would affect this in any way at startup.

We are talking about a "secondary" / optional possibility to save/push an ACL config:

The config will be loaded from the database during startup. No actual "files" involved.

@pallabpain Am I summarizing it correctly ?

pallabpain commented 8 months ago

I think this discussion is going in a generally positive direction, I have not yet had any time to look at the PR, so apologise if this is addressed already, but I would like to add that the current read-from-disk-on-startup behaviour must be preserved. It is a non-negotiable, that does not mean that this feature cannot be added, it just need to find a way to work together with the current behaviour.

@kradalby The PR is still a work in progress. However, I intend to reach a point where both read-from-disk and APIs exist at the same time and that reading from disk too updates the ACLs in the database. That should give us a working prototype to discuss further.

The only thing that bothers me is the co-existence of two update mechanisms that might lead to inconsistencies due to either human error or incorrect usage.

Here's what I aim to achieve in the PR:

  1. APIs to get and set the policy
  2. CLI commands to get and set the policy
  3. Load policy from a file on startup and store in DB (also during reloads)

@PizzaProgram Yes, storing every modification as an entry is something that we can easily achieve. My only concern with such an implementation is what makes a particular config the last working config. Since the last working config is subjective to headscale users' intended usage we are already making sure that the input payloads are at least validated for correctness.

I believe that instead of storing history in the DB, users can employ any means for maintaining histories and versions, Gitops, etc.

What do you think?

yoshino-s commented 8 months ago

Maybe add a new configuration like

# acl_policy_mode: db|[file]

acl_policy_mode: file
acl_policy_path: /path/to/acl.json

---

acl_policy_mode: db
# acl_policy_path will be ignore

If acl_policy_mode is file, all behaviors will keep same as the old version, and the acl api will be unavailable. If acl_policy_mode is db, more new feature can be introduced, including acl api and other cli options?

so we can keep the the current read-from-disk-on-startup behaviour and also add what we want now

pallabpain commented 8 months ago

@Yoshino-s This can be one way to solve this. We can still let the GET API open which will return the policy. But in file mode, the PUT API will not set anything.

Or we can simply add a disclaimer saying that when there's an ACL file specified in the config, it will override the ACL in the DB on every startup. What do you think?

yoshino-s commented 8 months ago

@pallabpain Maybe simply ignoring it is a more logical option for users, thinking about db means data is from db, so the file is nothing about.

I made a draft implement at https://github.com/yoshino-s/headscale . Feel free to use it to merge into your pr, or just do your own :P

evenh commented 8 months ago

I think that the proposed acl_policy_mode should be respected with regards to the principle of least surprise. Having that be file by default means that one can maintain backwards compatibility.

Exposing the current ACL (backed by file or database) in the API and blocking modification when backed by file sounds like a good idea.

kradalby commented 8 months ago

Or we can simply add a disclaimer saying that when there's an ACL file specified in the config, it will override the ACL in the DB on every startup. What do you think?

I agree with Even on this one, having it go back to file everytime might be quite confusing, a solid toggle doing on or the other makes sense to me.

Making the file available with GET API also sounds reasonable.

pallabpain commented 8 months ago

Collecting everyone's inputs,

pallabpain commented 8 months ago

Collecting everyone's inputs,

  • ACL APIs:

    • Introduce ACL APIs to allow fetch and update operations, and persist in the database.
  • Configuration:

    • Users can configure ACLs to reside in a file or the database using a designated attribute.

    • In the 'file' mode, API access is restricted for ACL updates.

    • In the 'db' mode, APIs are imperative for dynamic ACL modifications.

  • CLI:

    • Implement CLI commands to get and set the ACLs per the configuration.

@Yoshino-s @kradalby @evenh I have updated my PR considering the above requirements. Please take a look. Thanks. šŸ™‡šŸ¼

kradalby commented 8 months ago

Will take a look tomorrow.

almereyda commented 7 months ago

After reviewing the ongoing discussion on #1792, I am left with wondering if the implementation could be (optionally) made wire-compatible with the upstream without too much effort.

gitops-pusher.go has a client implementation that pushes HuJSON files to upstream and is used in CI for implementing the GitOps pattern.

It appears this even accepts a custom apiServer implementing the expected endpoints.

Would the gitops-pusher maybe provide enough documentation of a suitable API surface for more generic upstream compatibility?

We would probably only have to agree on a default string for the tailnet variable or make it configurable, e.g. if dns_config.magic_dns: true is set through dns_config.base_domain.

This compatibility work could also provide the foundation to implementing a more complete set of endpoints from the Tailscale HTTP v2 API, eventually providing OAuth compatibility for fine-grained access control, as suggested in the previously rejected #1202.

github-actions[bot] commented 3 months ago

This issue is stale because it has been open for 90 days with no activity.

github-actions[bot] commented 2 months ago

This issue was closed because it has been inactive for 14 days since being marked as stale.

PizzaProgram commented 2 months ago

Where are we now at this progress? Can someone please REOPEN this?

vbrandl commented 2 months ago

I also have a feeling, the stale bot is doing more harm than good...

kradalby commented 2 months ago

I dont think it needs to be reopened, it has been resolved and the code is going into 0.23.0