Velocidex / velociraptor

Digging Deeper....
https://docs.velociraptor.app/

error: frontend: starting frontend: x509: certificate has expired or is not yet valid: #3583

Closed CyberKaizen closed 1 week ago

CyberKaizen commented 2 weeks ago

@scudette We restarted our Docker stack after it had been up for a solid year, and it will not come back up. The only error we get is: velociraptor: error: frontend: starting frontend: x509: certificate has expired or is not yet valid: current time 2024-06-28T21:53:33Z is after 2024-06-28T00:11:58Z
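
The error text carries both timestamps, so the size of the gap can be read straight from it. A minimal Python sketch, assuming the message keeps the "current time X is after Y" shape quoted above; the function name `expiry_gap` is hypothetical and not part of Velociraptor:

```python
import re
from datetime import datetime, timedelta, timezone

# Timestamp layout seen in the error message (UTC with a "Z" suffix).
_TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"
_PATTERN = re.compile(r"current time (\S+) is after (\S+)")


def expiry_gap(error_message: str) -> timedelta:
    """Return how long ago the certificate expired, per the error text."""
    match = _PATTERN.search(error_message)
    if match is None:
        raise ValueError("no 'current time ... is after ...' clause found")
    now, not_after = (
        datetime.strptime(ts, _TS_FORMAT).replace(tzinfo=timezone.utc)
        for ts in match.groups()
    )
    return now - not_after


message = (
    "velociraptor: error: frontend: starting frontend: x509: certificate "
    "has expired or is not yet valid: current time 2024-06-28T21:53:33Z "
    "is after 2024-06-28T00:11:58Z"
)
print(expiry_gap(message))  # 21:41:35
```

Here the certificate had been expired for under a day when the stack was restarted, which fits a certificate issued with a one-year lifetime.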

Server Version:
version:
  name: velociraptor
  version: 0.7.0-dev
  commit: ebf996b
  build_time: "2023-06-14T17:40:19Z"
  ci_build_url: https://github.com/Velocidex/velociraptor/actions/runs/5270364236
  compiler: go1.20.5

We are not sure what is going on here. Any assistance would be appreciated.

Thanks,

scudette commented 2 weeks ago

https://docs.velociraptor.app/knowledge_base/tips/rolling_certificates/

CyberKaizen commented 2 weeks ago

@scudette Is there really no way to get Velociraptor back up without having to reissue the server cert and re-install 5,000 Velociraptor agents?

scudette commented 2 weeks ago

You don't have to reinstall the agents. You just rotate the server cert and restart.

CyberKaizen commented 2 weeks ago

@scudette Thanks for your quick response!

I currently can't run the command without the container being up and I'm not sure if I can simply run this command in a staging environment and then copy the server config to production.

Do you have any recommendation on how I could do it without having to make a new docker image?

Any wisdom you could impart to save my bacon would be helpful!

Thanks,

scudette commented 2 weeks ago

You don't have to run it inside the docker container; you just need to recreate the config file.

You can use any velociraptor binary with the old config file, as described in the KB article. Then simply copy the new config file into the container.
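
As a rough illustration of that workflow, the Python sketch below runs a local velociraptor binary against the old config and saves its standard output as the new config. It only assumes the command shape that appears elsewhere in this thread (`--config <file> config reissue_certs`, optionally with `--validity`); the subcommand name differs on older versions, so pass whatever the KB article for your version specifies. The helper names `build_argv` and `reissue_config` and the file paths are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def build_argv(binary: str, old_config: str,
               subcommand: str = "reissue_certs",
               validity_days: Optional[int] = None) -> List[str]:
    """Assemble the command line; the subcommand name is version dependent."""
    argv = [binary, "--config", old_config, "config", subcommand]
    if validity_days is not None:
        # --validity is only available in recent releases, per this thread.
        argv += ["--validity", str(validity_days)]
    return argv


def reissue_config(binary: str, old_config: str, new_config: str,
                   **kwargs) -> None:
    """Run the binary and write its stdout to a new config file."""
    result = subprocess.run(build_argv(binary, old_config, **kwargs),
                            check=True, capture_output=True, text=True)
    # Write to a separate path so the old config is left untouched.
    Path(new_config).write_text(result.stdout)


# Example call (not executed here; paths are placeholders):
# reissue_config("./velociraptor", "server.config.yaml", "new_config.yaml")
```

Copying the resulting file into the container is left out, since that depends on how the image mounts or bakes in its config.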

CyberKaizen commented 2 weeks ago

Thanks @scudette! You're a lifesaver!

CyberKaizen commented 2 weeks ago

@scudette Can confirm I got it to work! Containers are back up and running!

The documentation here needs updating: https://docs.velociraptor.app/knowledge_base/tips/rolling_certificates/

If anyone calls the old commands, it might be useful to tell the user something based on the old arg they tried to use.

predictiple commented 2 weeks ago

The documentation is correct for the versions we expect people to be running. Your version is a year old, which is ancient in terms of how fast Velociraptor development progresses.

You need to upgrade for scores of bug fixes and at least to avoid CVEs: https://docs.velociraptor.app/announcements/2023-cves/

scudette commented 2 weeks ago

The latest release also has a --validity option which can be used to extend the validity past the default 1 year, so for example for 10 years:

velociraptor --config server.config.yaml config reissue_certs --validity 3650 > /tmp/new_config.yaml

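
To see what a given --validity value means in calendar terms, here is a quick Python sketch. It assumes, as the comment above implies, that the value is a number of days counted from the moment the certificate is reissued; the function name `expiry_for` and the issue time are illustrative only.

```python
from datetime import datetime, timedelta, timezone


def expiry_for(issued_at: datetime, validity_days: int) -> datetime:
    """Expiry timestamp for a certificate valid for `validity_days` days."""
    return issued_at + timedelta(days=validity_days)


# Illustrative issue time: the moment the server in this thread failed to start.
issued = datetime(2024, 6, 28, 21, 53, 33, tzinfo=timezone.utc)
print(expiry_for(issued, 3650).date())  # 2034-06-26
```

With this issue date, 3650 days lands two days short of ten calendar years because of the leap days in 2028 and 2032.
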
CyberKaizen commented 2 weeks ago

> The documentation is correct for the versions we expect people to be running. Your version is a year old, which is ancient in terms of how fast Velociraptor development progresses.
>
> You need to upgrade for scores of bug fixes and at least to avoid CVEs: https://docs.velociraptor.app/announcements/2023-cves/

@predictiple For some reason when I used the latest release those commands weren't working for the Darwin version of the executable. Was I doing something wrong causing the argument not to work? Link to the executable I pulled from the releases page: https://github.com/Velocidex/velociraptor/releases/download/v0.72/velociraptor-v0.72.0-darwin-amd64

As for the CVE, we do not expose the GUI to the Internet at all. That, plus the custom fork we had to make for more SSO capabilities, is why we are still on this version. We also had to make custom container images, because custom server and client configuration settings were not persisting through docker-compose up and down cycles.

predictiple commented 2 weeks ago

I double-checked and the change went in just over a month ago. https://github.com/Velocidex/velociraptor/pull/3500

0.72.3 is the latest version (not 0.72.0). I believe 0.72.4 is imminent.

The custom fork makes sense but keep in mind that Velociraptor development is unusually rapid. In the last few months there have been significant bug fixes, mostly on the server side. You might want to consider automating your build process more so that you don't end up getting as far as 1 year behind.

I suppose we could put a note in that KB article to cater for older versions. But at the same time it seems like a good way to alert users to the fact that the version is too old. You're welcome to do a pull request on the docs repo.

scudette commented 2 weeks ago

Thanks for checking, @predictiple. We probably should not have changed the command name midway through a release cycle.

I do like the option of at least telling users that the command name has changed if they try the older command, but in this case the hint would only show when running --reissue_keys on 0.72.4 (since we can't go back in time :-) ). The next best thing is to update the docs, though.

Another weird thing I noticed is that the commands are now reissue_certs and rotate_keys, so they are not really consistent. They do pretty much the same thing, so we should probably bring them into line.

predictiple commented 2 weeks ago

> Another weird thing I noticed is that the commands are now reissue_certs and rotate_keys, so they are not really consistent. They do pretty much the same thing, so we should probably bring them into line.

Yes I did consider this at the time. The terms "reissue" and "rotate" are used interchangeably in the crypto world for both keys and certs. The choice of using different terms is intended to make the commands more distinct. Most of the time reissue_certs is what users should be doing, like for example in this situation. The rotate_keys command is more of an "expert mode" option IMHO because it means you should also go through the procedure of backing up/securing keys.

Long-term deployments are an interesting situation. I think it's still relatively rare to have a deployment older than a year since the original use case was as a rapidly deployable IR application. But now it's increasingly common to run it like a permanent infra / EDR, which is all good but the downside is that most EDR solutions have really slow development cycles that have conditioned users into thinking that it's normal to upgrade every 2 years.
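
For deployments that do run longer than the certificate lifetime, a periodic check on the remaining validity would flag the problem before a restart fails. The Python sketch below is one possible shape for such a check, not something Velociraptor ships: it assumes the certificate's expiry time has already been obtained by some other means, and the names `days_remaining` and `should_reissue` and the 30-day threshold are hypothetical.

```python
from datetime import datetime, timezone


def days_remaining(not_after: datetime, now: datetime) -> int:
    """Whole days left before `not_after`; negative once it has passed."""
    return (not_after - now).days


def should_reissue(not_after: datetime, now: datetime,
                   warn_days: int = 30) -> bool:
    """True when fewer than `warn_days` whole days of validity remain."""
    return days_remaining(not_after, now) < warn_days


# Expiry time taken from the error message at the top of this thread.
not_after = datetime(2024, 6, 28, 0, 11, 58, tzinfo=timezone.utc)
check_time = datetime(2024, 6, 1, tzinfo=timezone.utc)  # illustrative
print(days_remaining(not_after, check_time))   # 27
print(should_reissue(not_after, check_time))   # True
```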