Chia-Network / chips

Apache License 2.0
45 stars 22 forks source link

CHIP-0033: Add additional partial headers #114

Closed felixbrucker closed 4 months ago

felixbrucker commented 7 months ago

This is the corresponding CHIP for https://github.com/Chia-Network/chia-blockchain/pull/17788

danieljperry commented 7 months ago

Thanks, @felixbrucker! This CHIP has been assigned CHIP-33. It is now a Draft. Feel free to leave a review here, or to discuss it in the #chips channel of our Discord.

xearl4 commented 7 months ago

Let me add a few use cases where this would greatly help the pool operator day-to-day:

  1. chia-harvester-version: Very recently, a third-party harvester software was discovered to be not forming blocks properly when also pooling. Users of this harvester thus performed an inadvertent dead-weight attack on the pool. As pool operators, we can currently neither support nor block these users. Exposing the harvester version would fix that. This becomes a more pressing concern once CHIP-22 adoption increases.

  2. chia-farmer-peer-id: The pool reward payout address is set by the farmer process. We regularly have support cases where users are running multiple farmer processes which have different payout addresses configured and then race to update the payout address each time they contact the pool. Users are then wondering why some of their payouts go to a seemingly wrong payout address. chia-farmer-peer-id would allow us to assist users by telling them which farmer process has the wrong payout address configured.

  3. chia-farmer-peer-id: More generally than 2., many of our users run a (maybe somewhat unusual) setup of running farmer and harvester processes on each machine with directly attached disks. For example, a farmer with 10 storage nodes uses a single head node to run a full node and runs one farmer process and one harvester process on each storage node. We currently have no way to discern between the farmer nodes, and thus mostly have to rely on the information sent by the farmer process which contacted us most recently. With the peer ID, we'd have an identifier to associate the information sent by the farmer with, and improve the information shown to pool users. We can properly show version and payout address per farmer peer, for example. We could also aggregate pool stats upwards from harvester to farmer to launcher id.

(If it makes sense to add any of that directly into the CHIP, anyone please feel free to re-use the above as you see fit.)

emlowe commented 7 months ago

Don't you get the IP address of the farmer when it contacts the pool - either in the originating source address or maybe X-Forwarded-For?

One problem is that the harvester may be harvesting plots that are not associated with your pool at all - so the various chia-harvester settings may be sending your information about the farmer that is unrelated entirely to your pool.

xearl4 commented 7 months ago

Don't you get the IP address of the farmer when it contacts the pool - either in the originating source address or maybe X-Forwarded-For?

We do, but that's usually not helpful, as they typically all share the same outgoing public IP.

One problem is that the harvester may be harvesting plots that are not associated with your pool at all - so the various chia-harvester settings may be sending your information about the farmer that is unrelated entirely to your pool.

Yes, that's true for the capacity/plot count headers and is also addressed in the CHIP: "It will make sense to also add capacities and plot counts scoped to the plotnft of the proof of the partial". It'd probably make sense only sending these stats scoped to the plotnft/launcherid of the partial.

emlowe commented 7 months ago

Don't you get the IP address of the farmer when it contacts the pool - either in the originating source address or maybe X-Forwarded-For?

We do, but that's usually not helpful, as they typically all share the same outgoing public IP.

X-Forwarded-For should have the original internal IP address in most cases - it does for me (although in my case it is an IPv6 addy). At least this is my understanding...

emlowe commented 7 months ago

Yes, that's true for the capacity/plot count headers and is also addressed in the CHIP: "It will make sense to also add capacities and plot counts scoped to the plotnft of the proof of the partial". It'd probably make sense only sending these stats scoped to the plotnft/launcherid of the partial.

I might suggest using some RFC language - perhaps saying something like implementations SHOULD (MUST?) only send chia-harvester information for plots that are specific to this pool and not for other plots in use by the same harvester

xearl4 commented 7 months ago

X-Forwarded-For should have the original internal IP address in most cases - it does for me (although in my case it is an IPv6 addy). At least this is my understanding...

X-Forwarded-For is added by HTTP proxies in the chain, and they use the IP they see. If you have a private/NATed IP (as is typical for IPv4 end user sites), all a proxy (or the final HTTP server) will see is the public IP you were NATed to. (The NAT itself operates on a lower layer and doesn't add HTTP headers.)

For IPv6 the situation can be slightly different, if you have a full public subnet (like a /64 or /56) delegated to your end user site and your internal devices actually use that. (I assume that your IPv6 address being visible externally falls under this case.)

felixbrucker commented 7 months ago

Yes, as early already said, the farmer ip is neither suitable nor available for identification of the farmer (i'm assuming that's the point of bringing it up?)

felixbrucker commented 7 months ago

I might suggest using some RFC language - perhaps saying something like implementations SHOULD (MUST?) only send chia-harvester information for plots that are specific to this pool and not for other plots in use by the same harvester

In the chip i just documented the current PRs state, earl has code for scoped capacity/plot counts based on the PR iirc, which can and should be integrated. Whether they are additive (both scoped and unscoped are included) or replacing (unscoped headers are not present) i'll leave up for debate, i'm fine with either. They should have distinctive names tho, indicating that it is the scoped capacity/plot count (which is the case for earls code)

Maybe it makes sense to link it in here as well, and merge into the PR based on what the community prefers (see above).

fizpawiz commented 7 months ago

My concern here is that providing the farmer peer id, this proposal makes it harder for me as a farmer to protect my privacy. Not all farmers want to "show their hand", and make public the total amount of space they are farming. As it is, with the harvester id in the header, I have to run different harvesters to mask my total space: otherwise that value can be used to correlate me across different plotnfts and even across different pools. Running multiple harvesters is well-documented, so that is not an unsurmountable burden, and large farmers generally do that anyway. But if I now also have to run multiple farmers to avoid being correlated, that is a whole additional level of effort. If this proposal goes through, then people running a normal single-farmer setup will suddenly be completely exposed for the amount of space they have.

I think privacy should be the default. We shouldn't be causing people to automatically give out information that can can be used to identify them. Maybe there are other approaches that could accomplish the same thing on an opt-in basis? Something as simple as a user-settable farmer "name" that is only returned if set? Then people can choose to allow their farmers to be identified if they want to. Ideally we'd also replace harvester id with a similar setup, but I suspect that's no longer possible at this point.

felixbrucker commented 7 months ago

@fizpawiz

Not all farmers want to "show their hand", and make public the total amount of space they are farming.

Is this referring to the estimated space on pool A and B with different plot nfts but same farmer being able to be matched, assuming the farmer id is publicly accessible through both pools?

If this proposal goes through, then people running a normal single-farmer setup will suddenly be completely exposed for the amount of space they have.

To be fair users of a normal setup generally do not split their harvesters to mask their capacity on different pools, and as such are already "completely exposed".

We shouldn't be causing people to automatically give out information that can can be used to identify them.

We can already identify the pseudonym by launcher id and harvester id, farmer id is just another part of the already established chain of software components in a farming setup which is not identified yet.

Maybe there are other approaches that could accomplish the same thing on an opt-in basis? Something as simple as a user-settable farmer "name" that is only returned if set?

Having a name is confusing, because a name is not an identifier. If the concern is that the farmer id is matchable across pools why not combine it with a part of the plotnft, then it is scoped per plotnft but unique per farmer. We just need to make sure some part of it is still easily identifiable/matching the farmer peer id for humans, so users can match a pool side farmer to a physical farmer and give it an appropriate name.

Ideally we'd also replace harvester id with a similar setup, but I suspect that's no longer possible at this point.

That would be a breaking change in the pooling protocol

xearl4 commented 7 months ago

My concern here is that providing the farmer peer id, this proposal makes it harder for me as a farmer to protect my privacy. Not all farmers want to "show their hand", and make public the total amount of space they are farming.

Let me recap, just to make sure I understand the scenario correctly: a person farms a certain amount of space and wants to mask the total amount. So this person splits this space over several different plotnfts (to circumvent the obvious matching by launcher id) and also uses multiple different harvesters (to circumvent matching by harvester id). That person may also distribute the different plotnfts over different pools.

For this case, the proposed change would clearly be a change in identifiability. The person would have to adapt their setup to also use multiple different farmers, in analogue to the harvesters, to circumvent matching by farmer id.

Did I get that right?

fizpawiz commented 6 months ago

Did I get that right?

Yes, exactly. Thank you for being so much more succinct.

fizpawiz commented 6 months ago

Is this referring to the estimated space on pool A and B with different plot nfts but same farmer being able to be matched, assuming the farmer id is publicly accessible through both pools?

Yes.

To be fair users of a normal setup generally do not split their harvesters to mask their capacity on different pools, and as such are already "completely exposed".

True, small farmers probably don't care. But multiple harvesters is very common on >100TiBe farms.

We can already identify the pseudonym by launcher id and harvester id, farmer id is just another part of the already established chain of software components in a farming setup which is not identified yet.

Right! The last part of that chain of software components is the node id, at which point the farmer would be completely identified. And to avoid identification, the farmer would then have to run completely separate farms, which would be a real pain.

Having a name is confusing, because a name is not an identifier. If the concern is that the farmer id is matchable across pools why not combine it with a part of the plotnft, then it is scoped per plotnft but unique per farmer. We just need to make sure some part of it is still easily identifiable/matching the farmer peer id for humans, so users can match a pool side farmer to a physical farmer and give it an appropriate name.

I think the idea you're proposing is to concatenate the plotnft id and the farmer id, then hash that? I agree that would work.

That would be a breaking change in the pooling protocol

I know. And that makes it impossible. I don't know what requirement prompted it to be included originally, but I would have proposed not including it, if I had been aware at the time.

felixbrucker commented 6 months ago

True, small farmers probably don't care. But multiple harvesters is very common on >100TiBe farms.

Yes, but the reasoning for multiple harvesters is generally not to mask their capacity on different pools but to split resources/hdd access

fizpawiz commented 6 months ago

Yes, but the reasoning for multiple harvesters is generally not to mask their capacity on different pools but to split resources/hdd access

Agreed! I think we're saying the same thing on this point. :-)

xearl4 commented 6 months ago

Thanks for clarifying, @fizpawiz.

So a suggested way forward:

fizpawiz commented 6 months ago
  • Send a "masked" farmer ID calculated as hash(launcher id | farmer peer id) instead of the plain farmer peer ID. It's important to make this masked ID easily retrievable via GUI and CLI to allow users to look up how local farmer processes are identifying to a pool.

I like this idea, personally. Would love to do have the option to do the same with the harvester id.

  • Add a pool_send_farm_stats config option, which defaults to true. If enabled (the default), basic summary statistics are sent to the pool with each partial. If disabled, no statistics are sent with partials.

I'm still confused why anyone wants the stats. Self-reported values can be manipulated, especially when the code reporting them is open-source. What happens when someone tweaks their rig to report more space than they have, then falsely claim the pool is ripping them off, pointing to the statistics as "proof"? Or tweaks to under-report space, and then claims to have found some "weird trick" to improve returns? "I'll tell you the trick for just $99!" The partials are the proof of space, which can be trusted. If the reported space is used for ranking farmers on the pool leaderboard, you can be sure people will "adjust" their reported space to move up. Now the pools will need to add code to catch people faking their reported space, and decide what to do when they detect it. And what if it is close, and so it is hard to be sure? Most pools take 1%, so a small tweak by an unscrupulous large farmer can really make the pool look bad, or good, at the whim of the farmer. And accusations would be difficult, or impossible, to prove one way or the other, making such accusations in public potentially very sticky from a PR perspective.

  • If statistics are sent with a partial, these statistics are scoped / limited to the harvester and plotnft of the partial.

Yes, that would be appropriate and important if sending statistics.

felixbrucker commented 6 months ago

I'm still confused why anyone wants the stats

There are many reasons, for example users want to see how their farm performs compared to its actual capacity, users want to see/get notified if their plot count changes/drops, users want to monitor (re)plotting progress .. etc

If you don't need it you can disable it, not a problem.

Self-reported values can be manipulated, especially when the code reporting them is open-source. What happens when someone tweaks their rig to report more space than they have

Nothing, nothing happens as you said its a self reported metric, purely informational to the user

If the reported space is used for ranking farmers on the pool leaderboard

Nobody said anything of the like, as you said its a self reported metric that can not be trusted, why would anyone use it for leaderboards

so a small tweak by an unscrupulous large farmer can really make the pool look bad, or good, at the whim of the farmer

pools use effort to track their performance, not unverifiable user reported metrics, so no he can not, he can only make his own farm look good or bad

And accusations would be difficult, or impossible, to prove one way or the other, making such accusations in public potentially very sticky from a PR perspective.

Nothing changes to how it is right now with the exception that the farmer in question would screenshot his pool harvester page instead of his chia gui to show his "real" capacity

xearl4 commented 6 months ago

I'm still confused why anyone wants the stats.

The stats can be extremely useful in assisting farmers to analyze their farms. As a matter of fact, besides the obvious thing pools do (smoothen payout, spread risk), providing farmers insights into their farm is a major feature of many Chia pools. This serves a very similar purpose to a farm monitoring setup based on Prometheus and Grafana. As such, the farmer-sent stats have purely informational value.

Basically it's the difference between showing a user solely their estimated space: image

Or also being able to show them their actual effective space for reference: image

We have heard from many users that they'd like to have that feature. We also heard from many former Flexpool users that that's one thing they liked in particular about Flexpool's proprietary flexfarmer.

felixbrucker commented 5 months ago

What are the next steps for this? It has been a while since the last update. Can we review/merge the PR in chia-blockchain now?

fizpawiz commented 5 months ago

I'm still confused why anyone wants the stats.

The stats can be extremely useful in assisting farmers to analyze their farms. As a matter of fact, besides the obvious thing pools do (smoothen payout, spread risk), providing farmers insights into their farm is a major feature of many Chia pools. This serves a very similar purpose to a farm monitoring setup based on Prometheus and Grafana. As such, the farmer-sent stats have purely informational value.

...

We have heard from many users that they'd like to have that feature. We also heard from many former Flexpool users that that's one thing they liked in particular about Flexpool's proprietary flexfarmer.

Makes sense to me. I worry about pools using this self-reported info, but that's up to them to protect from users gaming the pool by tweaking the stats. I think we landed on a plan forward, @xearl4 so clearly described above. Would love to also have the option to mask the harvester id too. I don't actually know the CHIP process very well. What is the next step, and who takes it?

danieljperry commented 5 months ago

What is the next step, and who takes it?

At this point it sounds like we have general consensus from the community (other than potentially masking the harvester ID), and we have an implementation in place. I'll move the CHIP to Review.

I'll leave the CHIP in Review for at least a week, and ask for any additional feedback from the community. If everything is looking good, I can move the CHIP into Last Call and we can work on getting the PR#17788 merged.

Assuming no additional changes are needed after two weeks in Last Call, I can move the CHIP to Final. The PR will then be included when we cut the next release branch, so it will be slated for the next release.

Best case scenario, this CHIP is finalized and the implementation is in main three weeks from now. This is subject to change pending community feedback.

danieljperry commented 5 months ago

This CHIP is now in Review. Please leave your reviews here.

fizpawiz commented 5 months ago

Thanks for clarifying, @fizpawiz.

So a suggested way forward:

  • Send a "masked" farmer ID calculated as hash(launcher id | farmer peer id) instead of the plain farmer peer ID. It's important to make this masked ID easily retrievable via GUI and CLI to allow users to look up how local farmer processes are identifying to a pool.
  • Add a pool_send_farm_stats config option, which defaults to true. If enabled (the default), basic summary statistics are sent to the pool with each partial. If disabled, no statistics are sent with partials.
  • If statistics are sent with a partial, these statistics are scoped / limited to the harvester and plotnft of the partial.

I think the above is great. @felixbrucker - is there any way we can incorporate these changes into the chip and PR? I can't write the code. The CHIP changes are small, AFAICT.

felixbrucker commented 5 months ago

@fizpawiz I have added the config options and their spec to the chip

danieljperry commented 5 months ago

Thanks @felixbrucker ! @felixbrucker and anyone else interested: Please leave any additional feedback here. If we have agreement that this CHIP is ready, we will begin the process of finalizing it and adding it to the code base soon.

danieljperry commented 5 months ago

This CHIP is now in Last Call. If no further changes are required in the next two weeks, it will be moved to Final.

Rigidity commented 5 months ago

@felixbrucker it looks like it was suggested to mask the farmer and harvester ids as well, for privacy. Is this something that you plan to add to the implementation PR, or do you disagree with it?

felixbrucker commented 5 months ago

@felixbrucker it looks like it was suggested to mask the farmer and harvester ids as well, for privacy. Is this something that you plan to add to the implementation PR, or do you disagree with it?

I have added an option to obfuscate the farmer id, obfuscating the harvester id is not scope of this chip and pr. I'd suggest creating another chip for that if desired.

danieljperry commented 5 months ago

I have added an option to obfuscate the farmer id

How does the farmer go about obfuscating their ID? For reviewing purposes, can you point the line of code where this is done?

felixbrucker commented 5 months ago

I have added an option to obfuscate the farmer id

How does the farmer go about obfuscating their ID? For reviewing purposes, can you point the line of code where this is done?

They would set this config option to true: https://github.com/Chia-Network/chips/pull/114/files#diff-12eb9b5cbba4cea515e49dc108424ef6a5d61af2b55dc4398356d118fed76371R48

The original PR has not been touched yet, i am waiting for this chip to be finalized before making any more changes to the original PR

danieljperry commented 5 months ago

As you point out, the CHIP and the PR can be completed separately -- think of the CHIP as the spec, and the PR as the implementation. As it stands, the CHIP is in Last Call and will remain there for around 12 more days.

We are going to cut a new branch for the next release soon, so if you would like us to review the PR and add it to the next build, you will need to update it prior to us cutting the branch. If, instead, you want to wait for the CHIP to be finalized before making further updates to the PR, then that's also fine. Your PR will then get added to the following release.

felixbrucker commented 5 months ago

Yes it has already been more than a year since i wanted to add this originally, so a few more days is fine 👍

justinengland commented 5 months ago

I have added an option to obfuscate the farmer id

How does the farmer go about obfuscating their ID? For reviewing purposes, can you point the line of code where this is done?

They would set this config option to true: https://github.com/Chia-Network/chips/pull/114/files#diff-12eb9b5cbba4cea515e49dc108424ef6a5d61af2b55dc4398356d118fed76371R48

The original PR has not been touched yet, i am waiting for this chip to be finalized before making any more changes to the original PR

IMO this setting should be true by default. The user should have to opt into doxxing themselves.

danieljperry commented 5 months ago

@felixbrucker I must apologize -- I think there was some confusion about what the request was. The CHIP currently has a boolean called farmer.obfuscate_farmer_peer_id_in_partials, which defaults to false. Meaning that the farmer must actively change this boolean to true in order to obfuscate their identity. The request is to make the default setting true.

felixbrucker commented 5 months ago

I disagree with this, it is confusing for users because they already know how to identify peers through the peer ids and obfuscating the farmer peer id here while keeping everything else the same is unexpected

danieljperry commented 5 months ago

I disagree with this, it is confusing for users because they already know how to identify peers through the peer ids and obfuscating the farmer peer id here while keeping everything else the same is unexpected

I'm a bit confused by this (I'm not well versed in the pool protocol). If peers can already identify each other, then why do we need a parameter to obfuscate the farmer ID? I thought the intention of the recommendation to obfuscate by default was to make sure this CHIP doesn't decrease privacy. Are you suggesting that with obfuscation enabled, this CHIP actually increases privacy? So if obfuscation is disabled, this CHIP does not raise or lower the privacy that already existed without this CHIP?

felixbrucker commented 5 months ago

No i am referring to the ability of users to identify their harvesters on the pool dashboards by the peer id

Then you can assign names for example for easier identification of harvesters. The same will be true for farmers in addition to harvesters once the farmer peer id is included in requests. However if it is obfuscated it will differ from all other places which report the farmer peer id:

As such the pool would show a farmer peer id that is not found on the users machines

This drawback is acceptable for users who wish to obfuscate their farmer peer id, they need to figure out which farmer on the pool is which themselves

justinengland commented 5 months ago

The need for pool users to no be confused and have to enable a setting to not be confused, which can be handled via documentation, and asking people who care about such things to make a config change, does not outweigh our need to default to protecting the privacy of all users, regardless of their pooling status.

IMO for this change to be accepted the privacy setting must default to true. It just goes too far against our ethos to accept such a change.

What concerns do you have aside from your end users convenience that make stripping away privacy for every single user an acceptable tradeoff?

I understand privacy and security may not be concerns of yours, which is fine, but not every user of chia software is you, or your userbase. Many of them live in places where being doxxed can be more than just inconvenient, and might be dangerous.

In this state, I do no think we should or could accept your PR.

felixbrucker commented 5 months ago

What concerns do you have aside from your end users convenience that make stripping away privacy for every single user an acceptable tradeoff?

I don't think this is as big an issue as it is made out to be. There is a very tiny subset of farmers with multiple plotnfts running, and an even tinier subset is paranoid about their farmer peer id. This whole "issue" also does not affect anyone solo farming.

If you (CNI) do not want this we can close the PR

xearl4 commented 5 months ago

our need to default to protecting the privacy of all users, regardless of their pooling status.

Just to put this in perspective: That need was not taken into account by the official pooling protocol, which -- to use your lingo -- doxxes all users by default by sending unmasked harvester IDs. Users concerned about this particular aspect of privacy need to take active measures today, even before this change. This aspect of doxxing will not change with this CHIP.

As mentioned in the discussion above PR, it's also not true that privacy for every single user is stripped away. There is absolutely no change in privacy for a significant portion of users. The only change in privacy with the proposed default setting is for users who (a) farm multiple plotnfts, and (b) farm them with a single farmer for, and (c) farm them with separate harvesters per plotnft (either intentionally or by accident). (And they'll also need to take quite a few more network-level measures to avoid being correlated, but that's irrespective of Chia.) Only farm a single plotnft? No change in privacy. Use one farmer per plotnft? No change in privacy. Use the same harvesters for multiple plotnfts? No change in privacy.

As users concerned with this particular privacy issue currently and in the future need to take completely outsized measures to preserve this shred of privacy, I am somewhat surprised why this tiny aspect is now made up to be a complete blocker.

I think having that setting default to not obfuscating the farmer ID strikes a nice balance between providing easy access to a much-requested feature for pool users and still making it trivially easy for users who know what they are doing to obfuscate their farmer ID with a config setting (remember: they still will need to separate harvesters and use network-level obfuscation in the future).

justinengland commented 5 months ago

I think having that setting default to not obfuscating the farmer ID strikes a nice balance between providing easy access to a much-requested feature for pool users and still making it trivially easy for users who know what they are doing to obfuscate their farmer ID with a config setting (remember: they still will need to separate harvesters and use network-level obfuscation in the future).

How does requiring the users who requested this change to change the config option to true and leaving the "small subset of people" you allude to alone, not also accomplish the same thing?

Overall I think this is a good change, and if you guys say its an often requested change for pool farmers then we should get it into the code base, but the sticking point for me is defaulting to on with new options 1 and 2, defaulting on new options that change privacy that users might be relying on.

danieljperry commented 5 months ago

A summary of this CHIP's status:

Fee free to continue the discussion here. If necessary, we also could try to set up a Zoom call for anyone interested to discuss these settings.

BrandtH22 commented 4 months ago

No i am referring to the ability of users to identify their harvesters on the pool dashboards by the peer id

Then you can assign names for example for easier identification of harvesters. The same will be true for farmers in addition to harvesters once the farmer peer id is included in requests. However if it is obfuscated it will differ from all other places which report the farmer peer id:

* anywhere in the gui where peer ids are shown

* any and all cli commands, especially if they do not even have anything to do with a specific plotnft, as mapping is impossible then

* peer id based on the cert will remain unchanged always

As such the pool would show a farmer peer id that is not found on the users machines

This drawback is acceptable for users who wish to obfuscate their farmer peer id, they need to figure out which farmer on the pool is which themselves

If a user wants to opt in for the ability to name their harvesters (which based on this message seems that will only be a possibility if this integration is in place) then wouldn't the pool be able to provide a GIF on the naming page showing how to unmask their harvester IDs for easier identification?

The options here seem to be:

Obfuscate the ids, negative being that users might need to remove obfuscation to more easily work with pool support and name their harvesters (both of these activities are initiated by the user actively seeking information).

OR

Do not obfuscate the ids, negative being that users might need to add obfuscation if they are concerned about security. These users would need to be aware of the potential security concerns and need to take action based on this.

Seeing how the first options negative is focused on what users are actively doing (seeking pool support or trying to name their harvesters) documentation explaining the solution can be provided where the user is actively engaging already while the second options negatives are entirely passive; I personally, would like to see option 1 exercised.

Furthermore, to just obfuscate the farmer ID would not be sufficient to resolve the negatives in option 2 so I also highly recommend that the CHIP and associated reference client PR be updated to also obfuscate the harvester ID.

felixbrucker commented 4 months ago

If a user wants to opt in for the ability to name their harvesters

Substitute harvester for farmer here and everywhere below, users can already name their harvesters as partials include the harvester id

Furthermore, to just obfuscate the farmer ID would not be sufficient to resolve the negatives in option 2 so I also highly recommend that the CHIP and associated reference client PR be updated to also obfuscate the harvester ID.

As stated before, this is out of scope for this CHIP and PR (additional partial headers). Instead i would recommend the concerned parties to create a new CHIP for (harvester) peer id obfuscation.

BrandtH22 commented 4 months ago

Substitute harvester for farmer here and everywhere below, users can already name their harvesters as partials include the harvester id

I am not sure if this is meant to negate any of the other logic in my previous statement but it seems that if users are already naming their systems then they are already looking at the dashboard and a notice could be added that they need to remove obfuscation when X version releases to maintain the functionality?

Whereas for security concerned individuals there is no active location where they are checking for this information.

As stated before, this is out of scope for this CHIP and PR (additional partial headers). Instead i would recommend the concerned parties to create a new CHIP for (harvester) peer id obfuscation.

I respectfully disagree, as you and xearl have stated harvester IDs are already not obfuscated BUT this chip and associated PR make it so that lack of obfuscation becomes a security concern (as it can now be tied to a farmer ID). If this chip and associated PR are not adopted then there is no change to security but if adopted there is a decrease in security. To resolve this decrease in security it seems imperative to obfuscate both the farmer and harvester IDs.

At the very least obfuscating the farmer ID should most certainly be the default

xearl4 commented 4 months ago

harvester IDs are already not obfuscated BUT this chip and associated PR make it so that lack of obfuscation becomes a security concern

I think that's a misunderstanding. The harvester IDs are already a security concern, as fizpawiz' comments also clearly confirm. A farmer reusing the same harvester(s) to farm plots belonging to different plotnfts already allows pools to match multiple plotnfts to the same farmer. The security concern is already there, it's part of the Chia Pool Protocol 1.0 specification.

felixbrucker commented 4 months ago

I respectfully disagree

Agree to disagree, for me obfuscating the harvester id which is part of the pooling protocol and the json payload is out of scope for a pr adding additional headers.

xearl4 commented 4 months ago

How does requiring the users who requested this change to change the config option to true and leaving the "small subset of people" you allude to alone, not also accomplish the same thing?

It accomplishes a similar thing, but -- in my opinion -- with a lot of wasted effort.

The original PR which lead to this CHIP implemented a few separate things:

  1. Send the version of the harvester process along with each partial (currently, only the farmer process' version is sent). I still believe this is absolutely necessary. Do we need a CHIP to add just this single header? I have the impression that this part is not contentious, and I'd be happy and willing to separate and submit the addition of just this header in a standalone PR. (Edit: I went ahead and submitted just this change as separate PR.)
  2. Add a farmer peer ID header, to allow pools to discern between multiple farmer processes submitting partials and use this to better inform and support to pool users.
  3. Send harvester stats via headers, to enable pools to provide target/actual stats to their users.

Regarding 2 and 3, I get the feeling that users at this level of concern about privacy would generally prefer not to send any new information to the pool, not even the obfuscated IDs which still leak information that was previously not visible (that a farmer is using multiple different farmer processes; even though no one has complained about this information leakage yet.) And this is the unnecessary effort I mentioned initially: why implement obfuscation and whatnot for something that no one will ever use or want to use? If the UX is going to be that we have to tell pool users who like to have this useful information to first enable config options anyway, I'd personally rather have two simple "include farmer peer id" (default false, because newly introduced privacy concern) and "include harvester stats" (default false, because newly introduced privacy concern) config options instead.