ethereum / py-trie

Python library which implements the Ethereum Trie structure.
MIT License
104 stars 52 forks source link

Add type hints and expose via PEP561 #69

Open pipermerriam opened 6 years ago

pipermerriam commented 6 years ago

Largely copy/pasted from https://github.com/ethereum/py-evm/issues/1398

Background

Type hints allow us to perform static type checking, among other things. They raise the security bar by catching bugs at development time that, without type support, may turn into runtime bugs.

This stackoverflow answer does a great job at describing their main benefits.

What is wrong?

This library currently does not have any type hints.

This needs to be fixed by:

  1. Add the eth-typing>=2.0.0,<3 library as a dependency.
  2. Adding all missing type hints.
  3. Enforcing (stricter) type checking in CI

How

There does exist tooling (monkeytype) to the generation of type hints for existing code bases. From my personal experience monkeytype can be helpful but does still require manual fine tuning. Also, manually adding these type hints does serve as a great boost to the general understanding of the code base as it forces one to think about the code.

  1. Run mypy --follow-imports=silent --warn-unused-ignores --ignore-missing-imports --no-strict-optional --check-untyped-defs --disallow-incomplete-defs --disallow-untyped-defs --disallow-any-generics -p trie

  2. Eliminate every reported error by adding the right type hint

Because this library supports older versions of python, the type hints will not be able to use the modern python3.6 syntax.

Definition of done

This issue is done when the following criteria are met:

  1. mypy is run in CI

Add a new command to the flake8 environment in the tox.ini file that runs:

mypy --follow-imports=silent --warn-unused-ignores --ignore-missing-imports --no-strict-optional --check-untyped-defs --disallow-incomplete-defs --disallow-untyped-defs --disallow-any-generics -p py_ecc`
  1. Usage of type: ignore (silencing the type checker) is minimized and there's a reasonable explanation for its usage

Stretch goals

When this issue is done, stretch goals can be applied (and individually get funded) to tighten type support to qualify:

  1. mypy --strict --follow-imports=silent --ignore-missing-imports --no-strict-optional -p trie
6ug commented 6 years ago

I like to start working on this issue, please consider my application. @pipermerriam

pipermerriam commented 6 years ago

@6ug go for it.

Bhargavasomu commented 5 years ago

@pipermerriam I would like to work on this, as there has been no progress till now. One doubt that I had was, is the type of node essentially bytes? If yes, then there is an issue in the following snippet

def get_node_type(node: bytes) -> int:
    if node == BLANK_NODE:
        return NODE_TYPE_BLANK
    elif len(node) == 2:
        key, _ = node
        nibbles = decode_nibbles(key)
        if is_nibbles_terminated(nibbles):
            return NODE_TYPE_LEAF
        else:
            return NODE_TYPE_EXTENSION
    elif len(node) == 17:
        return NODE_TYPE_BRANCH
    else:
        raise InvalidNode("Unable to determine node type")

Here we have key, _ = node. Then the type of key would become int which is not right for decode_nibbles, as this requires the argument to be explicitly bytes type.

pipermerriam commented 5 years ago

No, I believe that node can be any of the following (you should confirm as it's been a while since I was in this codebase)

Probably good to define these once and re-use them across the codebase.

Also, you've got the :+1: from me to work on this.

Bhargavasomu commented 5 years ago

@pipermerriam @cburgdorf I started type hinting the codebase, but it turned out ugly and I feel that we could refactor the code with the usage of more classes like LeafNode, ExtensionNode, BranchNode. Also I was thinking that since the trie is used to represent the global state of the blockchain (as per my understanding, please correct me if I am wrong, still a noob :P), why don't we add benchmarks?

I would like to refactor the code first as part of another issue, and till then maybe this issue can be blocked. I would need help from you during the refactoring phase. Am I good to go for refactoring?

Bhargavasomu commented 5 years ago

And one more doubt I had is the following

class Trie(HexaryTrie):
    def __init__(self, *args: Any, **kwargs: Any) -> None:
        warnings.warn(DeprecationWarning(
            "The `trie.Trie` class has been renamed to `trie.HexaryTrie`. "
            "Please update your code as the `trie.Trie` class will be removed in "
            "a subsequent release"
        ))
        super().__init__(*args, **kwargs)

Any updates or instructions based on the above code? Like should there be only HexaryTrie in the code?

pipermerriam commented 5 years ago

@Bhargavasomu see this guide for a "better" way to link to existing code:

https://help.github.com/articles/creating-a-permanent-link-to-a-code-snippet/

pipermerriam commented 5 years ago

The warning code you linked above is us maintaining backwards compatibility back a few months ago when it looked like we'd have a separate binary trie implementation in the codebase.

We can probably remove that support, though it will require us to do a major version bump (something you don't have to worry about). However, I don't see us needing to remove it other than for cleanup reasons so I'd suggest not getting sidetracked with that unless you have a specific reason for doing it.

pipermerriam commented 5 years ago

@Bhargavasomu I think it could be an improvement to have the different node types have concrete classes. The devil is going to be in the details/implementation. I'd suggest looking into use of named tuples for this. In general, I support this concept and in general, cleanup of this codebase which hasn't been touched much in the last 1.5 years.

I'm hesitant to commit a bounty to this at this stage since it isn't clear yet whether the effort will produce the desired improvement. But, If this does work out well, I will work to get you compensated in some way (as I know you're a bit of a bounty hunter) and I like seeing people paid for good work.

pipermerriam commented 5 years ago

@Bhargavasomu if you choose to undertake this task, can you open a new issue that provides a high level outline for your plan and we can track progress there.

Bhargavasomu commented 5 years ago

@pipermerriam will come up with a good detailed implementation detail and open an issue in a couple of days.

Bhargavasomu commented 5 years ago

@pipermerriam is there any documentation available apart from README, which would help me progress better with the codebase?

pipermerriam commented 5 years ago

Not really. :cry:

This codebase was written by 1) referencing the yellow paper which is very cryptic and not easy to read and 2) using the implementation that was already done in the pyethereum codebase which is also not going to be any easier to read.

Bhargavasomu commented 5 years ago

@pipermerriam will go through the yellow paper.