erigontech / erigon

Ethereum implementation on the efficiency frontier https://erigon.gitbook.io
GNU Lesser General Public License v3.0

Feature: `--datadir.static` #11283

Open yorickdowne opened 3 months ago

yorickdowne commented 3 months ago

Rationale

E3 (Erigon 3) can place parts of the snapshots and temp directories on slow disk.

Doing so with symlinks is anywhere from impossible to extremely awkward when running in Docker, depending on the exact volume implementation.

A --datadir.static (bikeshed exact parameter name to heart’s content) that points to where the slow storage is would be an elegant solution. Similar to Geth ancient and Reth static parameters.

Implementation

Would need to change the strict “all under datadir” parts of the code, introduce a second variable for slow storage, and default it to datadir.

From there the structure remains as is.

AskAlexSharov commented 3 months ago

@yorickdowne it's doable. The problem is that not all E3 snapshots are slow-disk friendly: I mean the snapshots/domain and snapshots/accessors folders. If we add --datadir.static, we will need to move snapshots/domain and snapshots/accessors somewhere out of the snapshots folder.

AskAlexSharov commented 3 months ago

geth has --datadir.ancient

awskii commented 3 months ago

I would propose adding new directories instead:

 - caplin: beacon blocks
 - blocks: v1 blocks/transactions/headers

Optionally, each can get an idx subdirectory so that the indices can be mounted on fast disks as well.

All snapshot files are static and can only be replaced by files covering bigger ranges, but as Alex said, accessors and domain are hot, while history and idx are mostly cold (depending on RPC load).

Splitting datadir seems like unnecessary sophistication. Users can decide for themselves which directories to mount on fast disk, slow disk, or RAM disk.

yorickdowne commented 3 months ago

Regarding mounting directories, I am approaching this from a Docker Compose angle, specifically Eth Docker which has to work for users who want separate slow storage and users that don’t.

The way this is done currently for Geth and Reth:

Without a parameter to say “the static / slow files are over here”, this logic breaks.

Symlinks in Docker volumes are a difficult proposition. Host-created ones don’t work at all; ones created inside the container should work, but I’ve never tested that.

I’d vastly prefer to do without symlinks and keep my existing logic.

AskAlexSharov commented 3 months ago

@yorickdowne OK, symlinks may break (not sure why). But you can still declare the sub-dirs as separate volumes, like:

    volumes:
      - ${XDG_DATA_HOME:-~/.local/share}/erigon:/home/erigon/.local/share/erigon
      - ${E_ANCIENT_DIR:-~/.local/share/erigon/snapshots}:/home/erigon/.local/share/erigon/snapshots
      - ${E_FAST_STATIC_DIR:-~/.local/share/erigon/snapshots}/domain:/home/erigon/.local/share/erigon/snapshots/domain
      - ${E_FAST_STATIC_DIR:-~/.local/share/erigon/snapshots}/accessors:/home/erigon/.local/share/erigon/snapshots/accessors

yorickdowne commented 3 months ago

I end up with three volumes instead of one for normal operation, plus the static/ancient one.

From a user perspective it's cumbersome. It's just not as easy to see where all the data is at. I'd like to keep the user experience as clean as I can for the standard deployment.

I think where I'll leave it is: Having a separate slow/ancient directory is nice to have, not essential. Absent some kind of optional parameter to tell Erigon where the slow storage should be, I think I'll give this a miss.