favonia / cloudflare-ddns

🌟 A small, feature-rich, and robust Cloudflare DDNS updater
Apache License 2.0

Ability to hide or obfuscate IP addresses in logs #649

Open joaquinrovira opened 9 months ago

joaquinrovira commented 9 months ago

Issue:

Hey there! 👋 I'm currently running favonia/cloudflare-ddns:1.11.0 container on my publicly accessible ArgoCD instance, and the logs display IP addresses. To amp up privacy (especially since it's behind a Cloudflare proxy), I'm suggesting a feature to hide/obfuscate those IPs.

Details:

I'm thinking a simple config option would do the trick, for example an environment variable like HIDE_IP. It could offer methods such as replacing the last octet with "XXX" or hashing the IP. Personally, I think IP hashing is preferable because it keeps IP changes observable.

🌐 Detected the IPvX address: <SHA_256_IP_HASH>

Note:

Currently loving the tool! 🙌 Thanks for maintaining it. Your work is much appreciated! 🚀

favonia commented 9 months ago

@joaquinrovira Thanks for the idea and I am glad that the updater is working for you! I'm sorry I was offline during Thanksgiving. The updater still reveals too much information for my liking (#603), so thanks for pointing out another possibility.

I do want to understand more about the use cases, though, because IP addresses are not very "private," so to speak. There are many ways to uncover your IP: for example, if you have ever sent an email to a public mailing list or joined a public IRC channel, your IP might have already been recorded permanently. If your MX record ever contained your IP, or if you have ever turned off the Cloudflare proxying, someone on the internet might have already permanently recorded that. (Many websites keep historical DNS records.) The other part is, even if your IP is hidden, there are numerous bots scanning for vulnerable servers, proxied or not. Overall, the internet was never designed to keep your IP a secret. Only a few protocols and services (e.g., Tor) try to hide your IP.

Therefore, I would like to understand more about the attacks you are trying to prevent. Currently, sensitive tokens and URLs are hidden so that someone looking over your shoulder cannot (easily) gain access to your account. However, many people can directly know your IP if they are physically that close. The only other case I could think of is copying and pasting your log into a GitHub issue. However, I am not yet convinced that the risks of revealing IPs outweigh the cost of making debugging more difficult.

Hashing is an interesting idea. Nonetheless, because the only case I could think of is to copy your IP into a GitHub issue, I don't think hashing will help---it might be easy to invert the hashing by enumerating plausible IPs, that is, a dictionary attack. In the case of IPv4, you can enumerate all possible addresses in no time.

The last thing I want to point out is that I wonder if you want HIDE_IP to affect messages sent to Healthchecks, Uptime Kuma, and/or the shoutrrr support I am currently adding (still a work in progress). In my opinion, the main difficulty of designing a good interface based on environment variables is to ensure all variables are as independent of each other as possible and all reasonable combinations have an intuitive meaning. This is why I am interested in learning more about your concerns and motivations to find a good design.

In any case, I am happy to implement something to address your concerns (at least after this busy semester), but I might need more information. Thank you!

favonia commented 6 months ago

@joaquinrovira Hi, I'm sorry if my intimidating (?) long comment accidentally shut down the conversation. Please feel free to add anything that you might find useful. Thanks! I am eager to figure out what should be changed in the tool (and to implement the changes, probably this summer :star_struck:).

joaquinrovira commented 6 months ago

Sorry for the late response.

> I do want to understand more about the use cases, though, because IP addresses are not very "private," so to speak.

I deploy stuff in my homelab using ArgoCD. This instance is publicly accessible just for fun. My servers are behind Cloudflare, so my IP is AFAIK not trivially exposed. The issue is that the ArgoCD dashboard also gives access to the cloudflare-ddns pod logs. Not terrible, but not ideal. So I was wondering if we could obfuscate those IPs somehow.

> The other part is, even if your IP is hidden, there are numerous bots scanning for vulnerable servers, proxied or not. Overall, the internet was never designed to keep your IP a secret. Only a few protocols and services (e.g., Tor) try to hide your IP.

Feel free to close the issue if you deem this to be unnecessary. The idea is to make it slightly harder to get the IP, not impossible.

> [...] easy to invert the hashing by enumerating plausible IPs, that is, a dictionary attack. In the case of IPv4, you can enumerate all possible addresses in no time.

This could be solved by salting before hashing. Maybe another env var or just random bytes.

> The last thing I want to point out is that I wonder if you want HIDE_IP to affect messages sent to Healthchecks, Uptime Kuma, and/or the shoutrrr support I am currently adding (still a work in progress).

I have not looked into the effects outside my use case. No clue regarding this.

I do not have the spare time right now, but if I find some I could try to push a proposal PR.

joaquinrovira commented 6 months ago

Just wanted to add that the program has been running flawlessly for several months now. Thanks again for the effort of building and maintaining this project.

favonia commented 6 months ago

> I deploy stuff in my homelab using ArgoCD. This instance is publicly accessible just for fun. My servers are behind Cloudflare, so my IP is AFAIK not trivially exposed. The issue is that the ArgoCD dashboard also gives access to the cloudflare-ddns pod logs. Not terrible, but not ideal. So I was wondering if we could obfuscate those IPs somehow.

I see... should we just remove the IPs from the logging, then?

joaquinrovira commented 6 months ago

Either one would work. Hashing makes it easy to see when the underlying value changes. However, simply hiding would be enough. I would not want to add more complexity than needed.

favonia commented 5 months ago

@joaquinrovira Should we hide Cloudflare record/zone IDs as well?

favonia commented 5 months ago

Proposed Design

Did I miss anything that should be hidden as well? I know this is hiding more than what you requested, but I was trying to brainstorm something that could be useful in more use cases.

favonia commented 5 months ago

The more refined version would be

LOG_OBFUSCATION=ip,timezone

joaquinrovira commented 4 months ago

This proposal would be much more extensive than the initial scope. It certainly covers my needs and probably those of anyone else with public logs. (If there is anyone... 😅).

favonia commented 2 months ago

Update: I think my proposal should probably be called LOG_REDACTION because we are planning to hide, not obfuscate, private information.

favonia commented 2 months ago

@joaquinrovira I am thinking about these five kinds of "private" information:

  1. Tokens (token)
  2. IPs (ip)
  3. Domains (domain)
  4. IDs of records and zones (id?)
  5. Timezone (timezone)

By default only tokens are hidden. I think you want token,ip in your case... but I'm a bit reluctant to add all the complexity at once. The special value min shows everything and max hides all. As a starting point only three modes token, min, and max are supported and token is the default. Let me know if max does not work for you.
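For discussion, the three starting modes could be parsed roughly like this. This is only a sketch of the design described above, not the actual implementation; the `Redaction` type, its field names, and `parseRedaction` are hypothetical, and treating an unset variable as `token` is my reading of "token is the default."

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// Redaction records which kinds of "private" information are hidden.
// (Hypothetical type; the fields mirror the five kinds listed above.)
type Redaction struct {
	Tokens, IPs, Domains, IDs, Timezone bool
}

// parseRedaction maps a LOG_REDACTION value to a Redaction.
// Only the three starting modes are accepted.
func parseRedaction(val string) (Redaction, error) {
	switch strings.TrimSpace(val) {
	case "", "token": // default: hide only tokens
		return Redaction{Tokens: true}, nil
	case "min": // show everything
		return Redaction{}, nil
	case "max": // hide everything
		return Redaction{Tokens: true, IPs: true, Domains: true, IDs: true, Timezone: true}, nil
	default:
		return Redaction{}, fmt.Errorf("unsupported LOG_REDACTION value %q", val)
	}
}

func main() {
	r, err := parseRedaction(os.Getenv("LOG_REDACTION"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("%+v\n", r)
}
```

Rejecting combinations such as `token,ip` for now keeps the door open to supporting comma-separated lists later without changing the meaning of existing values.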

joaquinrovira commented 2 months ago

Will do! Thank you very much. 😃

It will take me a couple of days at least as I'm AFK this week.

favonia commented 2 months ago

@joaquinrovira I haven't implemented the feature yet! Subscribe to #785 to monitor my (slow) progress.

PS: I felt the timezone information might be too difficult to hide due to various side channels, so I am giving up on hiding it :upside_down_face:

favonia commented 2 months ago

@joaquinrovira I have a design problem now: none of the libraries I use (including the Go standard library) were designed with obfuscation in mind, and they often generate very detailed error messages containing "private" information. To 100% block the leakage of IP addresses or other information, error messages outside my control cannot be shown at all, which could make debugging very difficult if something goes wrong.

Are you fine with your IPs usually hidden, but then potentially revealed when something goes wrong?

I also wonder if there's another way to solve your problem. For example, what if you redirect the logging into a file?

joaquinrovira commented 2 months ago

I am okay with errors showing this kind of information. I have not seen any errors while running the application yet.

Furthermore, if errors are logged to stderr, one can always redirect the output with 2>/var/log/ddns.log.

favonia commented 1 month ago

@joaquinrovira I'm currently going back to the drawing board because implementing this is more challenging than I thought, partly because of the elaborate system that generates the various messages. I have a counterproposal: would it be easier to redirect everything into a file with a new configuration option? The current code makes it trivial to redirect logging (originally for testing the message printer). For example:

LOG_OUTPUT=/path/to/log

where the special value - means the standard output. I understand that the downside is you could not view the log directly in the ArgoCD dashboard.

By the way, it might not make much sense to send only errors to stderr because there is no inherent difference between errors and non-errors. The separation is perhaps more meaningful when a program is part of a pipeline to handle data; its standard output would be the input to the next program.