bitcoin / bitcoin

Bitcoin Core integration/staging tree
https://bitcoincore.org/en/download
MIT License

RFC: Simplifying or removing the user agent string (strSubVersion) #21492

Closed practicalswift closed 3 years ago

practicalswift commented 3 years ago

As noted by laanwj in PR 19242, simplifying the user agent would bring some potential privacy benefits:

For what it's worth in web browsers there's currently a trend toward simplifying user agents, creating less variants instead of more. Mostly for privacy reasons. Is signalling this extra information in the user agent useful for the user of the node, or is it simply for statistics for the developers?

MarcoFalke notes some potential security benefits in a comment in the same PR:

Sometimes remote attacks can only be exploited in a specific environment, putting that specific environment into the ua and sending it to the attacker seems counter productive.

And luke-jr in the same PR:

You could argue for removing the UA entirely. But so long as it's there, there's no reason to stop other people from using it as intended. (Nobody is forcing anyone to use this if they don't want to.)

Should we simplify or remove the user agent string (strSubVersion)? Why? Why not?

ghost commented 3 years ago

For what it's worth in web browsers there's currently a trend toward simplifying user agents, creating less variants instead of more. Mostly for privacy reasons. Is signalling this extra information in the user agent useful for the user of the node, or is it simply for statistics for the developers?

Makes sense. But this is optional. Privacy is about things that a user does not want to share with others. Also you could write anything in that comment.

Sometimes remote attacks can only be exploited in a specific environment, putting that specific environment into the ua and sending it to the attacker seems counter productive.

Sharing fake environment details can also be helpful. Example: the TTL in a ping reply can be used to identify the OS: https://www.blackhat.com/presentations/win-usa-01/Arkin/Briefings/win-01-arkin.ppt https://superuser.com/a/620521/

However, a lot of servers spoof it so that ordinary attacks are avoided, because the attacker will then target a different OS.

I added uacomment=Windows 95 to bitcoin.conf on Ubuntu.


You could argue for removing the UA entirely. But so long as it's there, there's no reason to stop other people from using it as intended. (Nobody is forcing anyone to use this if they don't want to.)

Agree

Also, there is no option in the ABCore app to use a remote full node; the app runs its own bitcoind. Can we use -uacomment there?


michaelfolkson commented 3 years ago

It was discussed at the February Sydney Socratic Seminar how the user agent string could be used to display to the network and to peers what soft fork activation mechanism you were running.

Obviously in the best case scenario there would only be one activation mechanism being run on the network but that is unlikely to always be the case. Hopefully with Taproot's proposed Speedy Trial activation mechanism this won't be needed for Taproot. But there will be future soft forks proposed and who knows what tools will be needed for these soft fork activations.

practicalswift commented 3 years ago

It was discussed at the February Sydney Socratic Seminar how the user agent string could be used to display to the network and to peers what soft fork activation mechanism you were running.

Setting aside whether such user-agent-string-based communication is a good idea: wouldn't that type of communication also be possible if the default user agent string were, say, the empty string (strSubVersion = "")?

In other words: the subset of users who have decided to communicate something via strSubVersion can do so on an opt-in basis, no? :)

michaelfolkson commented 3 years ago

@practicalswift: It wasn't my idea (to be clear, transcript is anonymized though) but I thought it would be helpful to highlight a speculative use case for it longer term. The empty string wouldn't be as clear or as flexible. Obviously if the arguments for removing the user agent string were strong, this speculative use case probably wouldn't provide much resistance. But I'm not seeing strong arguments for removing it either as of yet.

practicalswift commented 3 years ago

These bits of information are currently given to attackers via the user agent string:

If we were to reduce the amount of information provided: which of the above would be good candidates to drop from the user agent string?

Relevant code snippets:

const std::string CLIENT_NAME("Satoshi");
static const int CLIENT_VERSION =
                             10000 * CLIENT_VERSION_MAJOR
                         +     100 * CLIENT_VERSION_MINOR
                         +       1 * CLIENT_VERSION_BUILD;
…
strSubVersion = FormatSubVersion(CLIENT_NAME, CLIENT_VERSION, uacomments);
…
m_connman.PushMessage(…, CNetMsgMaker(INIT_PROTO_VERSION).Make(NetMsgType::VERSION, PROTOCOL_VERSION, …, …, …, …,
            …, strSubVersion, …, …));
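To make the snippets above concrete, here is a minimal standalone sketch of how a user agent string of this kind could be assembled. It assumes the BIP 14 style layout `/Name:Version(comment; comment)/` and the version arithmetic shown above. The function names `FormatVersionSketch` and `FormatSubVersionSketch` are hypothetical and are not the actual Bitcoin Core implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: turn the packed integer (10000*major + 100*minor + build)
// back into a dotted "major.minor.build" string.
std::string FormatVersionSketch(int version)
{
    std::ostringstream ss;
    ss << version / 10000 << "." << (version / 100) % 100 << "." << version % 100;
    return ss.str();
}

// Hypothetical helper: build "/Name:Version(comment; comment)/".
// The parenthesised part is emitted only when comments are present.
std::string FormatSubVersionSketch(const std::string& name, int version,
                                   const std::vector<std::string>& comments)
{
    std::ostringstream ss;
    ss << "/" << name << ":" << FormatVersionSketch(version);
    if (!comments.empty()) {
        ss << "(";
        for (std::size_t i = 0; i < comments.size(); ++i) {
            if (i > 0) ss << "; ";
            ss << comments[i];
        }
        ss << ")";
    }
    ss << "/";
    return ss.str();
}
```

Under these assumptions, calling `FormatSubVersionSketch("Satoshi", 220000, {"Windows 95"})` yields `/Satoshi:22.0.0(Windows 95)/`, which shows that the client name, the full version triple, and any uacomment values all end up in the string sent to peers.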

rustyrussell commented 3 years ago

This is currently used as a loose indicator of the rate at which clients upgrade, which is a useful thing to know (see, for example, https://luke.dashjr.org/programs/bitcoin/files/charts/software.html). I imagine it would simply be replaced by specific feature tests, creating more work for those doing the measurement and not really stopping bad actors who want to exploit older nodes.

maflcko commented 3 years ago

I believe the version would be relatively easy to determine in the absence of a ua string. Looking at the tx relay policy, p2p version, the message types supported and the behaviour of them (like delays) should give a good idea what client version the remote is on. However, the operating system shouldn't influence any of those heuristics, which is why I NACKed #19242. Though, I think it is ok to keep the current ua string as is.
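As a rough illustration of the kind of heuristic described above, the following sketch guesses a lower bound on a peer's release from the P2P message types it has been seen to send. The function name `GuessMinimumReleaseSketch` is hypothetical, and the mapping from message type to release is an assumption based on when the corresponding BIPs are generally understood to have shipped in Bitcoin Core. Treat it as a sketch and not as a tested fingerprinting tool.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical heuristic: return the earliest release series that is assumed
// to support the newest message type observed from the peer.
// Assumed mapping (not taken from the thread):
//   wtxidrelay, sendaddrv2 -> 0.21
//   sendcmpct, feefilter   -> 0.13
//   sendheaders            -> 0.12
std::string GuessMinimumReleaseSketch(const std::set<std::string>& seen_messages)
{
    if (seen_messages.count("wtxidrelay") || seen_messages.count("sendaddrv2")) {
        return "0.21";
    }
    if (seen_messages.count("sendcmpct") || seen_messages.count("feefilter")) {
        return "0.13";
    }
    if (seen_messages.count("sendheaders")) {
        return "0.12";
    }
    return "unknown";
}
```

A heuristic like this only narrows the release series. It says nothing about the operating system, and it gives only a lower bound, since a newer node also sends the older message types.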

practicalswift commented 3 years ago

@rustyrussell @MarcoFalke I agree that CLIENT_VERSION_MAJOR most likely can be cheaply inferred from other node behaviour in general, but I'm not so sure about CLIENT_VERSION_MINOR and CLIENT_VERSION_BUILD?

Perhaps we're thinking about different attack scenarios. The attack scenario I have in mind is the worst-case scenario of a full RCE.

In the case of an RCE vulnerability I think that the likelihood of successful reliable exploitation is much higher given exact knowledge of CLIENT_VERSION_MAJOR, CLIENT_VERSION_MINOR and CLIENT_VERSION_BUILD compared to knowing only say CLIENT_VERSION_MAJOR. Don't you think? :)
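To illustrate the difference between exposing the full version triple and exposing only CLIENT_VERSION_MAJOR, here is a small sketch that unpacks the integer from the snippets above and builds a coarser user agent. The names `VersionPartsSketch`, `SplitClientVersionSketch`, and `MajorOnlyUserAgentSketch` are hypothetical. Nothing like this exists in the code quoted in this thread, and the output format is only an assumption about what a reduced string might look like.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical container for the three components packed into CLIENT_VERSION.
struct VersionPartsSketch {
    int major_part;
    int minor_part;
    int build_part;
};

// Undo the packing 10000*major + 100*minor + build.
VersionPartsSketch SplitClientVersionSketch(int version)
{
    return {version / 10000, (version / 100) % 100, version % 100};
}

// Hypothetical reduced user agent that reveals the major version only.
std::string MajorOnlyUserAgentSketch(const std::string& name, int version)
{
    std::ostringstream ss;
    ss << "/" << name << ":" << SplitClientVersionSketch(version).major_part << "/";
    return ss.str();
}
```

With a packed version of 220102, the full string would identify 22.1.2, while `MajorOnlyUserAgentSketch("Satoshi", 220102)` would reveal only `/Satoshi:22/`, leaving the minor and build components to be inferred by other means.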