dappnode / DNP_DAPPMANAGER

Dappnode package responsible for providing the Dappnode Package Manager
GNU General Public License v3.0

Automate pruning Eth1 nodes task #915

Open pablomendezroyo opened 2 years ago

pablomendezroyo commented 2 years ago

Feature request

Description

The task of pruning Geth (and any other Eth1 node) is commonly done by DAppNode users to decrease the disk space used.

Proposal

Allow users to prune Eth1 nodes directly from the UI. This can be easily achieved with a host script. Ideally this task should be performed periodically so users do not have to worry about it.

Resources

dapplion commented 2 years ago

That would be really nice! To prune, does the main geth process have to stop?

alexpeterson91 commented 2 years ago

That would be really nice! To prune, does the main geth process have to stop?

Yes. It needs to be stopped so the runtime flag that enables pruning can be added; when pruning finishes, that flag has to be removed and replaced with the normal default runtime flags. So dependent packages such as Prysm, Lighthouse and Teku need a fallback ETH1 service defined to keep running while pruning is in progress. From what I've gathered, the process takes roughly 3-4 hours on the kind of machine most of our users run.

The complication I've found for automating pruning is that you need to parse the logs and wait for the output saying pruning is complete, at which point you need to restart the service with the pruning flag removed and the defaults restored, or rather whatever other settings the user may have added to the EXTRA_OPTS ENV field. It's definitely possible, but we'd likely need a script combing through the logs for the hours they run, looking for errors and exceptions or for the completion notification, so that the node resumes normal operation after pruning is run manually by the user or periodically by a scheduled service.
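To make the log-watching step concrete, here is a minimal sketch of what such a script would have to decide. The marker strings are assumptions (the exact wording depends on the Geth version, which is part of why log parsing is fragile), and `scan_prune_logs` and `normal_command` are hypothetical helper names, not existing dappmanager code.

```python
import shlex
from typing import Iterable, List, Optional, Sequence

# Assumption: these markers are illustrative. The real wording depends on the
# Geth version, so they are parameters rather than hard-coded facts.
DONE_MARKER = "State pruning successful"
ERROR_MARKERS = ("Fatal", "panic")


def scan_prune_logs(
    lines: Iterable[str],
    done_marker: str = DONE_MARKER,
    error_markers: Sequence[str] = ERROR_MARKERS,
) -> Optional[str]:
    """Return "done", "error", or None while the logs are still inconclusive."""
    for line in lines:
        if done_marker in line:
            return "done"
        if any(marker in line for marker in error_markers):
            return "error"
    return None


def normal_command(extra_opts: str) -> List[str]:
    """Rebuild the normal start command, keeping the user's EXTRA_OPTS."""
    return ["geth", *shlex.split(extra_opts)]
```

Only once `scan_prune_logs` reports "done" would the service be restarted with `normal_command(...)`, so the user's own flags survive the prune cycle.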

dapplion commented 2 years ago

@alexpeterson91 After a complete run of pruning, if you restart the process with the prune flag again: does pruning take 3-4 hours again? Or is it much faster?

alexpeterson91 commented 2 years ago

AFAIK it cannot prune when it’s already pruned; the pruning flag needs to be removed and Geth restarted to make it sync again. It must be stopped for pruning and restarted at the end of the process.

dapplion commented 2 years ago

Parsing logs is a very sketchy option, it usually breaks after enough time passes, is there any other way to detect it's done? Maybe we can ask Geth to add a feature where when done with pruning it creates a flag file

alexpeterson91 commented 2 years ago

Again, AFAIK the only way to know is via the logs, because everything else (the APIs etc.) is shut down. I'm not sure how else you’d know when it’s completed, and it needs to complete or it will be corrupted. More recently I also found that at least 50 GB of free storage is needed for pruning to begin. I know some group has a script for ethdocker, I think, but I didn’t look into it, so I don’t know whether it’s just a cron job with timers or something more complex. I’ll try to find it again. I couldn’t find any other mention of the ability to auto-prune Geth anywhere.
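The free-space requirement is easy to check before touching the node. A minimal sketch, assuming the 50 GB figure mentioned above and a hypothetical helper name `has_room_to_prune`:

```python
import shutil

# Assumption: "at least 50 GB" is the figure quoted in this thread, not a
# value confirmed against the Geth documentation.
MIN_FREE_BYTES = 50 * 1000**3


def has_room_to_prune(path: str, min_free_bytes: int = MIN_FREE_BYTES) -> bool:
    """Return True if the filesystem holding `path` has enough free space."""
    return shutil.disk_usage(path).free >= min_free_bytes
```

Running this against the chain data directory before stopping Geth would let the UI refuse to start a prune that cannot begin.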

alexpeterson91 commented 2 years ago

https://gist.github.com/yorickdowne/3323759b4cbf2022e191ab058a4276b2

This was the only script I could find, but it seems to be just a script that needs human intervention to check the logs. So not helpful.

dapplion commented 2 years ago

Check https://github.com/ethereum/go-ethereum/issues/24344#issuecomment-1031096208 what do you think would be best short term?

yorickdowne commented 2 years ago

Interesting. I think RocketPool does it by stopping the service, touching a lock file, then starting the service. Which means that when it finishes, docker will automatically restart it.

https://github.com/rocket-pool/smartnode/blob/482d908de3d79192b9ef10a260c5d85cea35f350/rocketpool-cli/service/service.go#L402

The entrypoint script just looks for that lock file, runs with prune and removes lock file after prune if found, runs geth normally if not.
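A rough sketch of that entrypoint decision, written in Python for illustration (the real script is a shell script). The lock file name `prune.lock` and both function names are assumptions, and the Geth arguments are only the minimum needed to show the branch:

```python
from pathlib import Path
from typing import List


def entrypoint_command(data_dir: Path, lock_name: str = "prune.lock") -> List[str]:
    """Pick the command to run, based on whether the lock file exists."""
    if (data_dir / lock_name).exists():
        # Lock file present: run the offline prune instead of the node.
        return ["geth", "snapshot", "prune-state", "--datadir", str(data_dir)]
    # No lock file: start Geth normally.
    return ["geth", "--datadir", str(data_dir)]


def finish_prune(data_dir: Path, lock_name: str = "prune.lock") -> None:
    """Remove the lock file so the next restart runs Geth normally."""
    (data_dir / lock_name).unlink(missing_ok=True)
```

Because the container exits when the prune command ends, the Docker restart policy brings it back up, and by then the lock file is gone.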

https://github.com/rocket-pool/smartnode-install/blob/master/amd64/rp-smartnode-install/network/mainnet/chains/eth1/start-node.sh

alexpeterson91 commented 2 years ago

I completely forgot that Rocket Pool was the only other auto-pruning solution I could find in the wild. Thanks for bringing it back up; I totally forgot to dig into their auto-prune code.

pablomendezroyo commented 2 years ago

@yorickdowne thanks for the reference!

So it seems like they run a "prune container" with the ethclient volume attached to it:

The prune container creates this lock file in the ethclient volume. Then the ethclient container is restarted and its entrypoint checks for the lock file, so if it exists, it starts pruning and waits for the process to finish:

I like the approach; however, I don't like the idea of having to do tasks on both sides: dappmanager and ethclients. Ideally we want to do everything from the dappmanager.

yorickdowne commented 2 years ago

Yeah that’s what they do. I’ve done the same thing now. I don’t really see a way around having some kind of entrypoint for Geth to make this possible. If your dappmanager is always active and able to check things periodically, you could also run prune in a container named Geth-prune and then restart the proper Geth when the prune container is no longer active, checked via docker ps or something like it.
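The polling variant could look roughly like this. It is a sketch under assumptions: the container name `geth-prune`, the function names and the poll interval are all hypothetical, and it shells out to `docker ps` rather than using whatever Docker client the dappmanager actually has.

```python
import subprocess
import time
from typing import Callable, List


def running_containers() -> List[str]:
    """List the names of running containers via `docker ps`."""
    result = subprocess.run(
        ["docker", "ps", "--format", "{{.Names}}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.split()


def wait_until_gone(
    name: str,
    list_running: Callable[[], List[str]] = running_containers,
    poll_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Block until no running container is called `name`."""
    while name in list_running():
        sleep(poll_seconds)
```

After `wait_until_gone("geth-prune")` returns, the manager would start the regular Geth container again. Note that this only detects that the prune container stopped, not whether it succeeded, so its exit code would still need checking.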

Whatever you do will be temporary, as Geth will Soon(tm) no longer need to be offline pruned.

pablomendezroyo commented 2 years ago

@dapplion @3alpha I have three approaches:

  1. entrypoint.sh in eth clients with a --prune flag, so when the user wants to prune, the dappmanager would have to do docker-compose up package entrypoint + flag
  2. Same system as Rocket Pool, using lock files
  3. Trap signals in the eth client containers and send a custom signal from the dappmanager to the ethclient. When the ethclient catches this signal, it starts pruning
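For option 3, the trap itself is small. A minimal sketch in Python (a real entrypoint would be a shell script using `trap`), assuming SIGUSR1 as the custom signal; the signal choice and the function names are hypothetical:

```python
import os
import signal
import threading

# Set by the signal handler; the entrypoint's main loop would watch it.
prune_requested = threading.Event()


def install_prune_trap(signum: int = signal.SIGUSR1) -> None:
    """Register a handler that records the prune request (main thread only)."""
    signal.signal(signum, lambda _signum, _frame: prune_requested.set())


def request_prune(pid: int, signum: int = signal.SIGUSR1) -> None:
    """What the sender side would do: deliver the custom signal to `pid`."""
    os.kill(pid, signum)
```

From the dappmanager side, the equivalent of `request_prune` would be something like `docker kill --signal=SIGUSR1 <container>`, which only reaches the entrypoint if it runs as the container's main process.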

Furthermore, we could easily set up a cronjob system that lets users schedule automatic pruning every 1 day / 1 week / 1 month.

3alpha commented 2 years ago

First one seems most elegant. What do you think?

pablomendezroyo commented 1 year ago

@3alpha we need to pick this up again! Implement automated pruning for Execution Clients that allow doing it online.

3alpha commented 1 year ago

I don't think we should do it automatically, but manually with improved UX, eventually adding only reminders in the dappmanager. Maybe add to each client an enum env variable that allows a few modes of operation, such as "normal" and "pruning".
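As an illustration of the enum idea, a client entrypoint could read the mode like this. The variable name `CLIENT_MODE` and the fallback to "normal" on unknown values are assumptions, not an agreed design:

```python
import os
from enum import Enum
from typing import Mapping


class ClientMode(Enum):
    NORMAL = "normal"
    PRUNING = "pruning"


def read_mode(environ: Mapping[str, str] = os.environ,
              var: str = "CLIENT_MODE") -> ClientMode:
    """Parse the mode from the environment, defaulting to NORMAL."""
    raw = environ.get(var, ClientMode.NORMAL.value).strip().lower()
    try:
        return ClientMode(raw)
    except ValueError:
        # Unknown value: fall back to the safe mode rather than refuse to start.
        return ClientMode.NORMAL
```

Falling back to normal mode means a typo in the setting never triggers a multi-hour prune by accident.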

alexpeterson91 commented 1 year ago

I think some things need auto pruning enabled by default, but I definitely agree that it should be user configurable, as @3alpha suggested.

pablomendezroyo commented 5 months ago

@3alpha should we close this, given the upcoming Geth feature for ancient data?