stratum-mining / stratum

stratum
https://stratumprotocol.org
Other
223 stars 127 forks source link

Refactor: `codec-sv2` crate #1129

Open rrybarczyk opened 2 months ago

rrybarczyk commented 2 months ago

Background

This task is an outcome of the protocols Rust docs issues tracked in #845.

While documenting protocols::v2::codec-sv2 in #1012, areas of potential code debt were identified. This issue servers as a place to list out these items to then be addressed in an organized manner. The initial Rust documentation effort was an immediate action, while a refactoring (which implies breaking API changes) is not so urgent priority, so for now we should leave this in the backlog for an appropriate moment in the future.

Note: This issue will replace the #873 discussion.

Identified Potential Code Debt

Error Variant Naming

Suggestions for codec_sv2::error::Error and ::CError refactor:

Uniform Spelling

Imports

Exports

Use of Result types

The following methods use core::result::Result but should use codec_sv2::Result (unless there is a reason I am not seeing that requires using core::result::Result):

Uniform Struct Fields

Misc

plebhash commented 2 months ago

we should seriously consider merging codec_sv2 + framing_sv2 crates

https://github.com/stratum-mining/stratum/discussions/873#discussioncomment-10660503


perhaps also binary_sv2, although I still need to spend more time on that one to form a better opinion

Fi3 commented 2 months ago

we should seriously consider merging codec_sv2 + framing_sv2 crates

#873 (comment)

perhaps also binary_sv2, although I still need to spend more time on that one to form a better opinion

why?

rrybarczyk commented 2 months ago

we should seriously consider merging codec_sv2 + framing_sv2 crates #873 (comment) perhaps also binary_sv2, although I still need to spend more time on that one to form a better opinion

why?

These crates are closely linked. To my understanding, you never use one with out the other. For this reason, we should combine them. Currently with them separate, you constantly have to cross reference between these two crates.

Unless there is a reason to not combine these two crates, we should plan on this refactor. @fi3, is there any reason not to combine that crates that you know of?

Fi3 commented 2 months ago

we should seriously consider merging codec_sv2 + framing_sv2 crates #873 (comment) perhaps also binary_sv2, although I still need to spend more time on that one to form a better opinion

why?

These crates are closely linked. To my understanding, you never use one with out the other. For this reason, we should combine them. Currently with them separate, you constantly have to cross reference between these two crates.

Unless there is a reason to not combine these two crates, we should plan on this refactor. @Fi3, is there any reason not to combine that crates that you know of?

I just don't see an advantage that justify the work of putting them together. What is this advantage? Why we should spend time doing that?

rrybarczyk commented 2 months ago

we should seriously consider merging codec_sv2 + framing_sv2 crates #873 (comment) perhaps also binary_sv2, although I still need to spend more time on that one to form a better opinion

why?

These crates are closely linked. To my understanding, you never use one with out the other. For this reason, we should combine them. Currently with them separate, you constantly have to cross reference between these two crates. Unless there is a reason to not combine these two crates, we should plan on this refactor. @Fi3, is there any reason not to combine that crates that you know of?

I just don't see an advantage that justify the work of putting them together. What is this advantage? Why we should spend time doing that?

Now is the time where we are really scrutinizing these crates to solidify them moving forward. Can you help me understand by giving me an example of when one of these crates would be used without the other?

plebhash commented 2 months ago

We need to make sure the low-level APIs are lean and easy to use. Right now, they are extremely convoluted. There's no way to understand how to use one single crate without looking for references in multiple other crates.

Maybe this is a blindspot for you @Fi3 , because the APIs are fresh in your head and you can navigate them easily (since you wrote most of them).

But for us, even while just documenting these crates, it has been extremely challenging to understand how everything is meant to be used together.

Our mission here is to make sure the codebase can be easily used by the entire ecosystem, not just one single pool.

Fi3 commented 2 months ago

IMHO is lot easier to make a crate that export everything and is well documented rather then refactor all the crates

Fi3 commented 2 months ago

https://github.com/demand-open-source/share-accounting-ext/blob/master/Cargo.toml#L15C22-L15C23

here for example I need framing but not codec

Fi3 commented 2 months ago

also low level crates are not supposed to be used but we should use an higher level library designed to be easy to use ecc ecc. The low level crates should be used only in special occasion where you really need them, for example for ffi

rrybarczyk commented 2 months ago

Ok, I will take a look at that extension and dig deeper into the code. Thank you.

Fi3 commented 2 months ago

ext is just an example main point is

IMHO is lot easier to make a crate that export everything and is well documented rather then refactor all the crates

jbesraa commented 2 months ago

I dont have a good input about merging/not merging. I will only share that as a new dev in the team, whenever I saw a crate under protocols folder, I immediately thought that there is a protocol specification for the crate, living in sv2-specs. Now I know this is not the case, but I do think it is miss-leading.

Fi3 commented 2 months ago

having several low level crates have the main advantage of keeping the lib as small as possible and easier to test and to review. framing-sv2 is very small and I agree that could be merged, I would put it in binary-sv2 not in codec. I also think that is more than fine like that, and that we shouldn't waste our time and energy in merging it.

plebhash commented 2 months ago

I can definitely entertain @Fi3's argument about effort. Whenever we start planning the scope how we will mitigate technical debt, I fully agree that it is very important for us to evaluate the trade-offs of all proposed changes.

And for sure, there will absolutely be cases where the benefits will not justify the efforts.

I'm not saying this will be the case here. I don't have a clear perspective around that yet, especially since we still have not fully studied the entire scope of protocols.

To be honest, this discussion is still somewhat premature (which is my fault, since I brought up this topic). After we cover 100% protocols crates, we will be in a much better position to have these debates.

What I said about potentially merging framing_sv2 + codec_sv2 was just a random thought that occurred to me while I reviewed https://github.com/stratum-mining/stratum/pull/1040.

It's actually quite interesting that @Fi3 said that it would probably be better to merge framing_sv2 into binary_sv2 instead of codec_sv2. I will tag @Shourya742 here so he is aware, but with a careful warning that his mindset while reading this should be to simply to entertain this possibility while he documents and reviews binary_sv2/no-serde.

For now, at least in my mind, all options are on the table. The main point is that it is very important that we do our best to learn protocols crates and really become experts at them, just like @Fi3 is. And maybe at the end of that journey, we could even come to the conclusion that everything is fine the way it is.

Or maybe not. Time will tell. 🧘

plebhash commented 1 month ago

on the top issue description @rrybarczyk suggested:

Uniform Spelling

  • ...
  • We have decoder::WithNoise and decoder::WithoutNoise. Then we have encoder::NoiseEncoder and encoder::Encoder. Should we make it decoder::WithNoise, decoder::WithoutNoise, encoder::WithNoise, and encoder::WithoutNoise? Or decoder::NoiseDecoder, decoder::Decoder, encoder::NoiseEncoder and encoder::Encoder?

I would suggest the following naming:

[cfg(not(feature = "noise_sv2"))] // redundant, but written to make it explicitly clear

pub type Encoder = PlainEncoder;

pub struct EncryptedEncoder { noise_buffer: Buffer, sv2_buffer: Buffer, frame: PhantomData, }

pub struct PlainEncoder { buffer: Vec, frame: PhantomData, }


- `decoder.rs`:
```rust
#[cfg(feature = "noise_sv2")]
pub type Decoder<B, T> = EncryptedDecoder<B, T>;

#[cfg(not(feature = "noise_sv2"))] // redundant, but written to make it explicitly clear
pub type Decoder<B, T> = PlainDecoder<B, T>;

pub struct EncryptedDecoder<B, T> {
    frame: PhantomData<T>,
    missing_noise_b: usize,
    noise_buffer: B,
    sv2_buffer: B,
}

struct PlainDecoder<B, T> {
    frame: PhantomData<T>,
    missing_b: usize,
    buffer: B,
}

It's not necessarily the final code organization, just some pseudo-rust to convey the idea around the keywords:

rrybarczyk commented 1 month ago

on the top issue description @rrybarczyk suggested:

Uniform Spelling

  • ...
  • We have decoder::WithNoise and decoder::WithoutNoise. Then we have encoder::NoiseEncoder and encoder::Encoder. Should we make it decoder::WithNoise, decoder::WithoutNoise, encoder::WithNoise, and encoder::WithoutNoise? Or decoder::NoiseDecoder, decoder::Decoder, encoder::NoiseEncoder and encoder::Encoder?

I would suggest the following naming:

  • encoder.rs:
#[cfg(feature = "noise_sv2")]
pub type Encoder<T> = EncryptedEncoder<T>;

#[cfg(not(feature = "noise_sv2"))] // redundant, but written to make it explicitly clear
pub type Encoder<T> = PlainEncoder<T>;

pub struct EncryptedEncoder<T> {
    noise_buffer: Buffer,
    sv2_buffer: Buffer,
    frame: PhantomData<T>,
}

pub struct PlainEncoder<T> {
    buffer: Vec<u8>,
    frame: PhantomData<T>,
}
  • decoder.rs:
#[cfg(feature = "noise_sv2")]
pub type Decoder<B, T> = EncryptedDecoder<B, T>;

#[cfg(not(feature = "noise_sv2"))] // redundant, but written to make it explicitly clear
pub type Decoder<B, T> = PlainDecoder<B, T>;

pub struct EncryptedDecoder<B, T> {
    frame: PhantomData<T>,
    missing_noise_b: usize,
    noise_buffer: B,
    sv2_buffer: B,
}

struct PlainDecoder<B, T> {
    frame: PhantomData<T>,
    missing_b: usize,
    buffer: B,
}

It's not necessarily the final code organization, just some pseudo-rust to convey the idea around the keywords:

  • Encrypted vs Plain (which depends on noise_sv2 feature flag)
  • Encoder vs Decoder

I am not sold on Plain. We have been using the word "standard" to express unencrypted logic (correct me if I am wrong). So for the unencrypted, I prefer StandardEncoder/StandardDecoder, or just simply Encoder/Decoder.

For the encrypted, we I do like EncryptedEncoder/EncryptedDecoder.

rrybarczyk commented 1 month ago
plebhash commented 1 month ago

We have been using the word "standard" to express unencrypted logic (correct me if I am wrong).

That's not the case. The Standard* aliases are related to Buffer. There's even a StandardNoiseDecoder, which as the name implies, is encrypted. I asked some questions related to these aliases here https://github.com/stratum-mining/stratum/discussions/873#discussioncomment-10663045

tbh I'm not sold on these aliases... maybe we can do something better, but that's a different discussion (hopefully after buffer_sv2 has been properly documented).

Anyways, if Plain* is not good, we can do either Encoder / Decoder or UnencryptedEncoder / UnencryptedDecoder.

rrybarczyk commented 1 month ago

We have been using the word "standard" to express unencrypted logic (correct me if I am wrong).

That's not the case. The Standard* aliases are related to Buffer. There's even a StandardNoiseDecoder, which as the name implies, is encrypted. I asked some questions related to these aliases here #873 (comment)

tbh I'm not sold on these aliases... maybe we can do something better, but that's a different discussion (hopefully after buffer_sv2 has been properly documented).

Anyways, if Plain* is not good, we can do either Encoder / Decoder or UnencryptedEncoder / UnencryptedDecoder.

Ok, lets do that: Encoder/Decoder and EncryptedEncoder/EncryptedDecoder.

jbesraa commented 1 month ago

Wouldn't NoiseEncoder and NoiseDecoder be more consistent with the current naming in the projects?

rrybarczyk commented 1 month ago

Wouldn't NoiseEncoder and NoiseDecoder be more consistent with the current naming in the projects?

I do like that. Thoughts @plebhash?

Shourya742 commented 1 month ago

https://github.com/stratum-mining/stratum/pull/1040#discussion_r1798748428

rrybarczyk commented 1 month ago

When running the unencrypted and encrypted examples, the following warnings occur:

cargo run --example unencrypted
warning: unexpected `cfg` condition value: `with_serde`
  --> v2/binary-sv2/no-serde-sv2/codec/src/datatypes/non_copy_data_types/mod.rs:31:11
   |
31 | #[cfg(not(feature = "with_serde"))]
   |           ^^^^^^^^^^^^^^^^^^^^^^
   |
   = note: expected values for `feature` are: `buffer_sv2`, `default`, `no_std`, `prop_test`, `quickcheck`, and `with_buffer_pool`
   = help: consider adding `with_serde` as a feature in `Cargo.toml`
   = note: see <https://doc.rust-lang.org/nightly/rustc/check-cfg/cargo-specifics.html> for more information about checking conditional configuration
   = note: `#[warn(unexpected_cfgs)]` on by default

warning: unexpected `cfg` condition value: `with_serde`
  --> v2/binary-sv2/no-serde-sv2/codec/src/datatypes/non_copy_data_types/mod.rs:43:11
   |
43 | #[cfg(not(feature = "with_serde"))]
   |           ^^^^^^^^^^^^^^^^^^^^^^
   |
   = note: expected values for `feature` are: `buffer_sv2`, `default`, `no_std`, `prop_test`, `quickcheck`, and `with_buffer_pool`
   = help: consider adding `with_serde` as a feature in `Cargo.toml`
   = note: see <https://doc.rust-lang.org/nightly/rustc/check-cfg/cargo-specifics.html> for more information about checking conditional configuration

warning: trait `Variable` is never used
  --> v2/binary-sv2/no-serde-sv2/codec/src/codec/mod.rs:37:11
   |
37 | pub trait Variable {
   |           ^^^^^^^^
   |
   = note: `#[warn(dead_code)]` on by default

warning: trait `IntoOwned` is never used
 --> v2/binary-sv2/no-serde-sv2/codec/src/datatypes/non_copy_data_types/mod.rs:7:7
  |
7 | trait IntoOwned {
  |       ^^^^^^^^^

warning: `binary_codec_sv2` (lib) generated 4 warnings
warning: `binary_codec_sv2` (lib) generated 4 warnings (4 duplicates)
   Compiling codec_sv2 v1.2.1 (/Users/rachelrybarczyk/Development/StratumV2/Dev/stratum/protocols/v2/codec-sv2)
    Finished `dev` profile [optimized + debuginfo] target(s) in 0.61s
cargo run --example encrypted --features noise_sv2
warning: unexpected `cfg` condition value: `with_serde`
  --> v2/binary-sv2/no-serde-sv2/codec/src/datatypes/non_copy_data_types/mod.rs:31:11
   |
31 | #[cfg(not(feature = "with_serde"))]
   |           ^^^^^^^^^^^^^^^^^^^^^^
   |
   = note: expected values for `feature` are: `buffer_sv2`, `default`, `no_std`, `prop_test`, `quickcheck`, and `with_buffer_pool`
   = help: consider adding `with_serde` as a feature in `Cargo.toml`
   = note: see <https://doc.rust-lang.org/nightly/rustc/check-cfg/cargo-specifics.html> for more information about checking conditional configuration
   = note: `#[warn(unexpected_cfgs)]` on by default

warning: unexpected `cfg` condition value: `with_serde`
  --> v2/binary-sv2/no-serde-sv2/codec/src/datatypes/non_copy_data_types/mod.rs:43:11
   |
43 | #[cfg(not(feature = "with_serde"))]
   |           ^^^^^^^^^^^^^^^^^^^^^^
   |
   = note: expected values for `feature` are: `buffer_sv2`, `default`, `no_std`, `prop_test`, `quickcheck`, and `with_buffer_pool`
   = help: consider adding `with_serde` as a feature in `Cargo.toml`
   = note: see <https://doc.rust-lang.org/nightly/rustc/check-cfg/cargo-specifics.html> for more information about checking conditional configuration

warning: trait `Variable` is never used
  --> v2/binary-sv2/no-serde-sv2/codec/src/codec/mod.rs:37:11
   |
37 | pub trait Variable {
   |           ^^^^^^^^
   |
   = note: `#[warn(dead_code)]` on by default

warning: trait `IntoOwned` is never used
 --> v2/binary-sv2/no-serde-sv2/codec/src/datatypes/non_copy_data_types/mod.rs:7:7
  |
7 | trait IntoOwned {
  |       ^^^^^^^^^

warning: `binary_codec_sv2` (lib) generated 4 warnings
warning: `binary_codec_sv2` (lib) generated 4 warnings (4 duplicates)
warning: methods `into_aesg` and `into_chacha` are never used
  --> v2/noise-sv2/src/cipher_state.rs:26:8
   |
7  | pub trait CipherState<Cipher_: AeadCipher>
   |           ----------- methods in this trait
...
26 |     fn into_aesg(mut self) -> Option<Cipher<Aes256Gcm>> {
   |        ^^^^^^^^^
...
33 |     fn into_chacha(mut self) -> Option<Cipher<ChaCha20Poly1305>> {
   |        ^^^^^^^^^^^
   |
   = note: `#[warn(dead_code)]` on by default

warning: associated items `name`, `hkdf_3`, and `ecdh` are never used
   --> v2/noise-sv2/src/handshake.rs:10:8
    |
9   | pub trait HandshakeOp<Cipher: AeadCipher>: CipherState<Cipher> {
    |           ----------- associated items in this trait
10  |     fn name(&self) -> String;
    |        ^^^^
...
69  |     fn hkdf_3(
    |        ^^^^^^
...
109 |     fn ecdh(private: &[u8], public: &[u8]) -> [u8; 32] {
    |        ^^^^

warning: `noise_sv2` (lib) generated 2 warnings
    Finished `dev` profile [optimized + debuginfo] target(s) in 0.03s