near / nearcore

Reference client for NEAR Protocol
https://near.org
GNU General Public License v3.0
2.34k stars 632 forks source link

[Epic] Restructure nearcore project #1881

Open frol opened 4 years ago

frol commented 4 years ago

My developer experience with nearcore is not great, so I contemplated the improvements that can be made.

The very first thing that is missing for me is a clear project structure. I believe that the best code is no code and the same goes for documentation (this is why I don't want to introduce documentation of the project structure, but rather prefer having an obvious project structure that does not require a lot of time to understand and makes it easy to navigate without a need to memorize all the components).

I don't have enough experience with the existing infrastructure, but here is my first take on it.

The root folder:

 .dockerignore
 .gitattributes
 .github/
 .gitignore
 .gitlab-ci.yml
 .pre-commit-config.yaml

neard/
 libraries/
 tools/

ATTRIBUTIONS.md
 Cargo.lock
 Cargo.toml
 CODEOWNERS
 Dockerfile
 LICENSE
 Makefile
 METRICS.md
 README.md
 rustfmt.toml
 rust-toolchain

It is still crowded with various config files, but at least the folders have a clear structure. Just for the reference, here is the current project structure:

 .dockerignore
 .gitattributes
 .github/
 .gitignore
 .gitlab-ci.yml
 .pre-commit-config.yaml

async-utils/
 chain/
 conf/ # Would you guess that this is Grafana config? Do we actually need this config in this repository?
 core/
 docs/ # Nice! We have some documentation! Well, there is a single logo.svg file (let's move it to .github/).
 genesis-tools/
 near/
 nightly/ # Nightly what?
 pytest/ # This is a very popular testing framework in Python, right? Nope
 runtime/
 scripts/
 tests/ # Can we move them into `near` <span class="error">[`neard`]</span>?
 test-utils/

ATTRIBUTIONS.md
 Cargo.lock
 Cargo.toml
 CODEOWNERS
 Dockerfile
 LICENSE
 Makefile
 METRICS.md
 package-lock.json # ???!
 README.md
 rustfmt.toml
 rust-toolchain

So, I suggest collapsing those 13 folders into 3 top-level ones:

 neard/
 src/
 tests/
 tests_py/ # rename from /pytest (I have no idea where to put the nightly folder with those two .txt files, yet)
 libraries/
 chain/
 core/
 runtime/
 test-utils/
 utils/
 tools/
 near-loadtester
 near-state-viewer
 near-genesis
 near-deploy # rename from /scripts

The exact naming is yet to be discussed.

The crates structure and naming also needs some love.

MaksymZavershynskyi commented 4 years ago

This is really great that someone is looking into this and raising these concerns, I really liked the comments.

pytest are not just stress tests they are very similar to nearcore/near/tests and nearcore/tests, just written in Python.

So I suggest we do the following renaming:

WDYT?

frol commented 4 years ago

So I suggest we do the following renaming

The suggested renaming sounds reasonable to me. I will apply the suggestion (update my post) tomorrow.

ailisp commented 4 years ago

I agree with @frol and @nearmax comment above.

nearcore/nightly -> nearcore/neard/tests_nightly_py;

This is a webapp that host and run tests_py and above neard_rs tests, i suggest to move it another repo or put it to tools/nightly_test_runner

SkidanovAlex commented 4 years ago

also note that nightly heavily relies on the relative paths, so if we change the dirs, nightly needs to be changed accordingly.

MaksymZavershynskyi commented 4 years ago

This is a webapp that host and run tests_py and above neard_rs tests, i suggest to move it another repo or put it to tools/nightly_test_runner

Let's jut put in into tools directory.

MaksymZavershynskyi commented 4 years ago

Assigning it to @frol so that we have only one assignee as per guidelines.

frol commented 4 years ago

The first steps in this direction are #2511 and #2520.

@SkidanovAlex @nearmax @ilblackdragon Do you have any objections against project refactoring? The code is not going to change, it should be just more clear folder naming towards better navigation around the project. I want to make the following steps (separate PR iterations):

  1. Rename the crate folders to match the crate names (chain/chain -> chain/near-chain, runtime/runtime -> runtime/near-runtime, core/store -> core/near-store), so you can easily resolve the use statement into the folder name.
  2. Move some of the folders to clean up the root of the project and also make it clear what our internal libraries/modules/components are (see the issue description).

My motivation mentioned in the issue description only grows with the time I spend with nearcore. I don't want to have lengthy conversations about this, but it hurts my productivity. It is somewhat related to #2433.

MaksymZavershynskyi commented 4 years ago

I agree, this is the kinda of stuff that slows down everyone and makes our code less appealing to external contributors. We discussed with @fckt that having a folder named "libraries" is kinda weird, given most of our crates are libraries, but on the second looks it looks ok.

lexfrl commented 4 years ago

Rename the crate folders to match the crate names (chain/chain -> chain/near-chain, runtime/runtime -> runtime/near-runtime, core/store -> core/near-store), so you can easily resolve the use statement into the folder name.

I'd suggest the following adjustment. We can have a near folder to hold all the near-* components.

neard
near
  /chain
  /primitives
  /store
  /runtime
  /json-rpc
utils
test-utils
tools
docs
scripts
...
frol commented 4 years ago

@fckt Given the project is nearcore, it implies that everything in the repository is related to near. Thus, "near" as a folder name does not introduce any clarity (the same way I don't really like chain/chain and runtime/runtime). It has its benefit of pulling some of the root folders inside, but utils and test-utils left behind.

lexfrl commented 4 years ago

The idea is to associate crates which are the core components (direct dependencies) of neard process and their dependencies.

Given the project is nearcore, it implies that everything in the repository is related to near.

Right, but not all of crates are the direct dependencies of the neard (or dependencies of these crates), there is a lot of supplementary (or optional) crates. There is a category of crates which are essential for the node operation.

The following crates make up the system and these can be naturally form a list of the core components (approximated):

near-config
near-crypto
near-primitives
near-store
near-runtime
near-chain
near-chunks
near-client
near-pool
near-network
near-jsonrpc
near-telemetry
near-epoch-manager

However, don't really think that this list as an accurate and complete. For example, its questionable if near-jsonrpc or near-telemetry are the core components (since these could be optional for neard - I can imagine that node operators would want to disable it). But the most probably the near-chain, near-store, near-network, near-runtime and near-primitives are the essential ones (not optional in any mode). Note: I don't count artificial execution modes like testing environments.

frol commented 4 years ago

Let me list all the crates we have:

genesis-csv-to-json
genesis-populate
keypair-generator
loadtester
neard
near-actix-utils
near-chain
near-chain-configs
near-chunks
near-client
near-crypto
near-epoch-manager
near-jsonrpc
near-jsonrpc-client
near-logger-utils
near-metrics
near-network
near-pool
near-primitives
near-rpc-error-core
near-rpc-error-macro
near-runtime-configs
near-runtime-fees
near-store
near-telemetry
near-vm-errors
near-vm-logic
near-vm-runner
near-vm-runner-standalone
node-runtime
restaked
runtime-params-estimator
state-viewer
testlib

Let's extract tools:

genesis-csv-to-json
genesis-populate
keypair-generator
loadtester
restaked
runtime-params-estimator
state-viewer

Let's move neard out of scope and list the rest:

near-actix-utils
near-chain
near-chain-configs
near-chunks
near-client
near-crypto
near-epoch-manager
near-jsonrpc
near-jsonrpc-client
near-logger-utils
near-metrics
near-network
near-pool
near-primitives
near-rpc-error-core
near-rpc-error-macro
near-runtime-configs
near-runtime-fees
near-store
near-telemetry
near-vm-errors
near-vm-logic
near-vm-runner
near-vm-runner-standalone
node-runtime
testlib

Everyone is welcome to suggest a clear separation of the packages. Currently, we have:

I find this separation quite reasonable, but it is hard to find a place for things like near-jsonrpc, near-jsonrpc-client, and [hopefully extracted one day] neard-lib.

bowenwang1996 commented 4 years ago

Where does the neard go under this separation @frol ? Also I think we can merge utils and test-utils (those that are not test utils in utils should probably go to tools).

ailisp commented 4 years ago

what is difference between util and tool?

On Wed, Apr 29, 2020 at 4:56 PM Bowen Wang notifications@github.com wrote:

Where does the neard go under this separation @frol https://github.com/frol ? Also I think we can merge utils and test-utils (those that are not test utils in utils should probably go to tools).

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/nearprotocol/nearcore/issues/1881#issuecomment-621530534, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADFFFCD2TWHOBH4WXHS2PQLRPC5CRANCNFSM4J5FE2PQ .

frol commented 4 years ago

I propose keeping neard in the root folder since it is the main entry point to the project.

"utils" was just recently introduced (#2511) and it has the only crate (near-actix-utils), and I can see we either extract it to a standalone helper (separate package and repo) or merge into neard-lib (extract neard/src/lib.rs and its dependencies into a separate crate).

I think we can get rid of both "utils" and "test-utils" (just move the relevant crates to "common" and "tools").

what is difference between util and tool?

I view them as follows: a tool is something that you use as a standalone piece (i.e. executable), while util is anything that is useful (e.g. a library, a file, a script, and even an executable).

lexfrl commented 4 years ago

As a general methodology, I think if we want to define a category it should be valuable and not tautological. In other words the definition should add a value, be unique and be not ambiguous.

"utils" was just recently introduced (#2511) and it has the only crate (near-actix-utils), and I can see we either extract it to a standalone helper (separate package and repo) or merge into neard-lib (extract neard/src/lib.rs and its dependencies into a separate crate).

neard-lib - makes sense to me! Why it makes sense: besides of the neard itself, we have a lot of other entry points (tests) which would want to reuse some process initialization logic.

lexfrl commented 4 years ago

The "utils" to me is a common code which exists because you don't want to repeat it (and this is a dominating characteristic of it). But in the same time, it's a small enough and don't deserve to evolve into a separate library (yet). In the same time, that code should be not exclusively specific to the project (e.g. it must not use custom types, defined in the project). For the exclusive common code and types of the components, primitives serves good.

Essential logic which covers dedicated pieces of the project spec (and at the same time are not useful separately (excluding artificial purposes like testing)), better to be colocated together. \

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity in the last 2 months. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

bowenwang1996 commented 3 years ago

@frol @matklad do we still want to proceed with this?

matklad commented 3 years ago

Yes, I believe there's still room for improvement for project structure, and that this is non-urgent, but important. @miraclx recently make great progress here as well! Let's keep this open.

frol commented 3 years ago

Yeah, we slowly move in this direction:

matklad commented 3 years ago

One suggestion I have is to flatten out the directory layout and keep all rust crates in the one-level deep crates/ directory. In other words, the first list from https://github.com/near/nearcore/issues/1881#issuecomment-621414303 looks pretty reasonable to me, and it nicely solves the "where do I put a thing which doesn't belong anywhere else" problem.

See https://matklad.github.io/2021/08/22/large-rust-workspaces.html for an extended motivation.

(0.8 confidence the end state will be much better, 0.95 confidence that its a big and a somewhat painful change, 0.7 confidence that we should actually do that)